[Issue]: <title> WHEN USED LOCAL OLLAMA MODEL ZeroDivisionError: Weights sum to zero, can't be normalized AND Error embedding chunk #646

zw-change · 2024-07-22T06:35:52Z

Describe the issue

When I use a local Ollama model, local queries fail with the errors in the title, but global queries do not have this problem.

Steps to reproduce

https://github.com/TheAiSingularity/graphrag-local-ollama

GraphRAG Config Used

encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: mistral
  model_supports_json: true # recommended if this is available for your model.
  # max_tokens: 4000
  # request_timeout: 180.0
  api_base: http://192.168.0.17:11434/v1
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  # tokens_per_minute: 150_000 # set a leaky bucket throttle
  # requests_per_minute: 10_000 # set a leaky bucket throttle
  # max_retries: 10
  # max_retry_wait: 10.0
  # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  # concurrent_requests: 25 # the number of parallel inflight requests that may be made

parallelization:
  stagger: 0.3
  # num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  # parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding # or azure_openai_embedding
    model: nomic_embed_text
    api_base: http://192.168.0.17:11434/api
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    # concurrent_requests: 25 # the number of parallel inflight requests that may be made
    # batch_size: 16 # the number of documents to send in a single request
    # batch_max_tokens: 8191 # the maximum number of tokens to send in a single request
    # target: required # or optional

chunks:
  size: 300
  overlap: 100
  group_by_columns: [id] # by default, we don't allow chunks to cross documents

input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\\.txt$"

cache:
  type: file # or blob
  base_dir: "cache"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

storage:
  type: file # or blob
  base_dir: "output/${timestamp}/artifacts"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

reporting:
  type: file # or console, blob
  base_dir: "output/${timestamp}/reports"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

entity_extraction:
  # llm: override the global llm settings for this task
  # parallelization: override the global parallelization settings for this task
  # async_mode: override the global async_mode settings for this task
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 0

summarize_descriptions:
  # llm: override the global llm settings for this task
  # parallelization: override the global parallelization settings for this task
  # async_mode: override the global async_mode settings for this task
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  # llm: override the global llm settings for this task
  # parallelization: override the global parallelization settings for this task
  # async_mode: override the global async_mode settings for this task
  # enabled: true
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 0

community_report:
  # llm: override the global llm settings for this task
  # parallelization: override the global parallelization settings for this task
  # async_mode: override the global async_mode settings for this task
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes
  # num_walks: 10
  # walk_length: 40
  # window_size: 2
  # iterations: 3
  # random_seed: 597832

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes

snapshots:
  graphml: true
  raw_entities: yes
  top_level_nodes: yes

local_search:
  # text_unit_prop: 0.5
  # community_prop: 0.1
  # conversation_history_max_turns: 5
  # top_k_mapped_entities: 10
  # top_k_relationships: 10
  # max_tokens: 12000

global_search:
  # max_tokens: 12000
  # data_max_tokens: 12000
  # map_max_tokens: 1000
  # reduce_max_tokens: 2000
  # concurrency: 32
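
The embeddings section of this config requests the model `nomic_embed_text`, and the terminal output in the logs reports a 404 from Ollama saying that no model with that name was found. One way to check what the server actually has is to list its models through Ollama's `/api/tags` endpoint and compare names. The following is a minimal sketch under that assumption; `find_model`, `fetch_tags`, and the sample payload are hypothetical names and data made up for illustration. The Ollama model library publishes this embedding model under the hyphenated name `nomic-embed-text`.

```python
import difflib
import json
import urllib.request


def find_model(tags_payload: dict, wanted: str) -> tuple[bool, list[str]]:
    """Return (is_available, close_matches) for `wanted` in an /api/tags payload."""
    # Ollama reports names with a tag suffix such as "mistral:latest".
    names = [m["name"] for m in tags_payload.get("models", [])]
    base_names = sorted({n.split(":", 1)[0] for n in names})
    available = wanted in names or wanted in base_names
    # Only look for near misses (e.g. "_" vs "-") when the exact name is absent.
    suggestions = [] if available else difflib.get_close_matches(wanted, base_names, n=3)
    return available, suggestions


def fetch_tags(api_root: str) -> dict:
    """Fetch the model list from a running Ollama server (needs network access)."""
    with urllib.request.urlopen(f"{api_root}/api/tags", timeout=10) as resp:
        return json.load(resp)


# Illustrative payload only; a real one would come from fetch_tags("http://<host>:11434").
sample = {"models": [{"name": "mistral:latest"}, {"name": "nomic-embed-text:latest"}]}
print(find_model(sample, "nomic_embed_text"))
print(find_model(sample, "nomic-embed-text"))
```

If the configured name only turns up as a close match, correcting the `model:` value under `embeddings.llm` (or pulling the model under the configured name) is the first thing to try.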

Logs and screenshots

{"type": "error", "data": "Community Report Extraction Error", "stack": "Traceback (most recent call last):\n File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\index\graph\extractors\community_reports\community_reports_extractor.py", line 58, in call\n await self._llm(\n File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\llm\openai\json_parsing_llm.py", line 34, in call\n result = await self._delegate(input, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\llm\openai\openai_token_replacing_llm.py", line 37, in call\n return await self._delegate(input, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\llm\openai\openai_history_tracking_llm.py", line 33, in call\n output = await self._delegate(input, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\llm\base\caching_llm.py", line 104, in call\n result = await self._delegate(input, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\llm\base\rate_limiting_llm.py", line 177, in call\n result, start = await execute_with_retry()\n ^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\llm\base\rate_limiting_llm.py", line 159, in execute_with_retry\n async for attempt in retryer:\n File "C:\Python311\Lib\site-packages\tenacity\asyncio\init.py", line 166, in anext\n do = await self.iter(retry_state=self._retry_state)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "C:\Python311\Lib\site-packages\tenacity\asyncio\init.py", line 153, in iter\n result = await action(retry_state)\n ^^^^^^^^^^^^^^^^^^^^^^^^^\n File "C:\Python311\Lib\site-packages\tenacity\_utils.py", line 99, in inner\n return call(*args, 
**kwargs)\n ^^^^^^^^^^^^^^^^^^^^^\n File "C:\Python311\Lib\site-packages\tenacity\init.py", line 398, in \n self._add_action_func(lambda rs: rs.outcome.result())\n ^^^^^^^^^^^^^^^^^^^\n File "C:\Python311\Lib\concurrent\futures\_base.py", line 449, in result\n return self.__get_result()\n ^^^^^^^^^^^^^^^^^^^\n File "C:\Python311\Lib\concurrent\futures\_base.py", line 401, in __get_result\n raise self._exception\n File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\llm\base\rate_limiting_llm.py", line 165, in execute_with_retry\n return await do_attempt(), start\n ^^^^^^^^^^^^^^^^^^\n File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\llm\base\rate_limiting_llm.py", line 147, in do_attempt\n return await self._delegate(input, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\llm\base\base_llm.py", line 48, in call\n return await self._invoke_json(input, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\llm\openai\openai_chat_llm.py", line 90, in _invoke_json\n raise RuntimeError(FAILED_TO_CREATE_JSON_ERROR)\nRuntimeError: Failed to generate valid JSON output\n", "source": "Failed to generate valid JSON output", "details": null}
Terminal output:

INFO: Reading settings from settings.yaml
creating llm client with {'api_key': 'REDACTED,len=9', 'type': "openai_chat", 'model': 'mistral', 'max_tokens': 4000, 'temperature': 0.0, 'top_p': 1.0, 'request_timeout': 180.0, 'api_base': 'http://192.168.0.17:11434/v1', 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': True, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}
creating embedding llm client with {'api_key': 'REDACTED,len=9', 'type': "openai_embedding", 'model': 'nomic_embed_text', 'max_tokens': 4000, 'temperature': 0, 'top_p': 1, 'request_timeout': 180.0, 'api_base': 'http://192.168.0.17:11434/api', 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': None, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}
Error embedding chunk {'OpenAIEmbedding': 'Error raised by inference API HTTP code: 404, {"error":"model \"nomic_embed_text\" not found, try pulling it first"}'}
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\query\__main__.py", line 76, in <module>
    run_local_search(
  File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\query\cli.py", line 154, in run_local_search
    result = search_engine.search(query=query)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\query\structured_search\local_search\search.py", line 118, in search
    context_text, context_records = self.context_builder.build_context(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    selected_entities = map_query_to_entities(
                        ^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\query\context_builder\entity_extraction.py", line 54, in map_query_to_entities
    search_results = text_embedding_vectorstore.similarity_search_by_text(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\vector_stores\lancedb.py", line 118, in similarity_search_by_text
    query_embedding = text_embedder(text)
                      ^^^^^^^^^^^^^^^^^^^
  File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\query\context_builder\entity_extraction.py", line 56, in <lambda>
    text_embedder=lambda t: text_embedder.embed(t, encoding_format="float"), # added to make embedding api work, openai uses base64 by default
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Anaconda3\envs\graphrag-ollama-local\graphrag-local-ollama\graphrag\query\llm\oai\embedding.py", line 99, in embed
    chunk_embeddings = np.average(chunk_embeddings, axis=0, weights=chunk_lens)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\numpy\lib\function_base.py", line 550, in average
    raise ZeroDivisionError(
ZeroDivisionError: Weights sum to zero, can't be normalized
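
The traceback ends in `np.average(chunk_embeddings, axis=0, weights=chunk_lens)`, shortly after the `Error embedding chunk` message. A plausible reading, not confirmed in this thread, is that the embedding request failed for every chunk, so both lists were empty and the weights summed to zero. The sketch below mimics that situation in plain Python (no NumPy). `weighted_average`, `embed_all`, and `always_404` are hypothetical names, and the guard is one possible way to surface the underlying embedding failure instead of the arithmetic error.

```python
def weighted_average(vectors: list[list[float]], weights: list[float]) -> list[float]:
    """Length-weighted mean of equally sized vectors, with an explicit guard."""
    total = sum(weights)
    if not vectors or total == 0:
        # np.average raises ZeroDivisionError when the weights sum to zero;
        # here we report the likely root cause instead.
        raise ValueError("no chunk was embedded; check the embedding endpoint and model name")
    dim = len(vectors[0])
    return [sum(v[i] * w for v, w in zip(vectors, weights)) / total for i in range(dim)]


def embed_all(chunks, embed_one):
    """Embed each chunk, skipping failures, then average by chunk length."""
    embeddings, lengths = [], []
    for chunk in chunks:
        try:
            embeddings.append(embed_one(chunk))
            lengths.append(len(chunk))
        except Exception as err:  # mirrors the "Error embedding chunk" log line
            print(f"Error embedding chunk: {err}")
    return weighted_average(embeddings, lengths)


def always_404(chunk):
    """Stand-in for an embedding call that fails on every chunk."""
    raise RuntimeError('HTTP code: 404, model "nomic_embed_text" not found')
```

Calling `embed_all(["some text"], always_404)` prints the per-chunk error and then raises the `ValueError`, which points at the embedding call rather than at the averaging step.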

Additional Information

GraphRAG Version:
Operating System:
Python Version:
Related Issues:

maslenkovas · 2024-07-22T07:42:49Z

Facing the same error with locally deployed gte-small embeddings.

natoverse · 2024-07-22T20:24:43Z

Consolidating alternate model issues here: #657

yurochang · 2024-07-23T09:38:02Z

Same problem here, and I did not find a solution in the other issues.

wangaocheng · 2024-07-23T14:30:44Z

--method local does not work.

rushizirpe · 2024-07-25T11:28:14Z

The embedding model running locally in Ollama returns its embeddings in an incompatible format. OpenAI internally uses base64-encoded floats, whereas most other models return floats as plain numbers. If you want to use open-source models, I've put together a repository for deploying models from HuggingFace to local endpoints that expose an API format compatible with the OpenAI API. Here's the link to the repo: https://github.com/rushizirpe/open-llm-server
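
To make the format difference concrete: an OpenAI-style embeddings response can carry each vector either as a JSON list of numbers or, with `encoding_format="base64"`, as a base64 string of packed float32 values. The sketch below normalizes both into a list of floats using only the standard library. `to_float_list` is a hypothetical helper name, not part of GraphRAG or Ollama, and little-endian byte order is an assumption of this example.

```python
import base64
import struct


def to_float_list(embedding) -> list[float]:
    """Accept a base64 string or a sequence of numbers; return a list of floats."""
    if isinstance(embedding, str):
        raw = base64.b64decode(embedding)
        if len(raw) % 4 != 0:
            raise ValueError("base64 payload is not a whole number of float32 values")
        count = len(raw) // 4
        # "<" = little-endian (assumed), "f" = 4-byte float
        return list(struct.unpack(f"<{count}f", raw))
    return [float(x) for x in embedding]


# Round trip with made-up values that are exactly representable in float32.
values = [0.5, -1.25, 2.0]
encoded = base64.b64encode(struct.pack("<3f", *values)).decode("ascii")
print(to_float_list(encoded))
print(to_float_list(values))
```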

I have also created a Colab notebook for GraphRAG (working for global as well as local search): https://colab.research.google.com/drive/1uhFDnih1WKrSRQHisU-L6xw6coapgR51?usp=sharing

Ikaros-521 · 2024-07-26T03:13:23Z

Same here, +1.

chongchongaikubao · 2024-07-26T09:02:43Z

I modified this file, which may help you: site-packages/graphrag/query/llm/oai/embedding.py

shaojiankui · 2024-08-05T01:04:50Z

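Modify the definition of the chunk_text() function in site-packages\graphrag\query\llm\text_utils.py:

```python
import tiktoken  # needed for the tiktoken.Encoding annotation below


def chunk_text(
    text: str, max_tokens: int, token_encoder: tiktoken.Encoding | None = None
):
    """Chunk text by token length."""
    if token_encoder is None:
        token_encoder = tiktoken.get_encoding("cl100k_base")
    tokens = token_encoder.encode(text)  # type: ignore
    # Decode the tokens back into a string, so each chunk below is made of
    # characters instead of token IDs.
    tokens = token_encoder.decode(tokens)

    # `batched` is not imported here; it is assumed to be the helper already
    # available in text_utils.py.
    chunk_iterator = batched(iter(tokens), max_tokens)
    yield from chunk_iterator
```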
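
The workaround above appears to yield tuples of characters, so `max_tokens` effectively counts characters rather than tokens. An alternative sketch, not taken from this thread, keeps token-based chunk boundaries and decodes each batch back to a string before it goes to an endpoint that expects text. The `encode` and `decode` callables are passed in so the example runs without tiktoken; with tiktoken they would be `token_encoder.encode` and `token_encoder.decode`. All names below are hypothetical.

```python
from itertools import islice
from typing import Callable, Iterator


def batched(iterable, n: int) -> Iterator[tuple]:
    """Yield successive tuples of at most n items."""
    if n < 1:
        raise ValueError("n must be at least one")
    it = iter(iterable)
    while batch := tuple(islice(it, n)):
        yield batch


def chunk_text_as_strings(
    text: str, max_tokens: int, encode: Callable, decode: Callable
) -> Iterator[str]:
    """Split on token boundaries, but yield each chunk as decoded text."""
    for token_batch in batched(encode(text), max_tokens):
        yield decode(list(token_batch))


# Toy codec for illustration: one "token" per whitespace-separated word.
def toy_encode(s: str) -> list[str]:
    return s.split()


def toy_decode(tokens: list[str]) -> str:
    return " ".join(tokens)


print(list(chunk_text_as_strings("a b c d e", 2, toy_encode, toy_decode)))
```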

zw-change added the triage label (default label assignment; indicates a new issue needs review by a maintainer) Jul 22, 2024

jgbradley1 added the community_support label (issue handled by community members) Jul 22, 2024

AlonsoGuevara removed the triage label Jul 22, 2024

natoverse closed this as not planned (won't fix, can't repro, duplicate, stale) Jul 22, 2024

Ikaros-521 mentioned this issue Jul 26, 2024

Local search error: ZeroDivisionError: Weights sum to zero, can't be normalized Ikaros-521/GraphRAG-Ollama-UI#15 (Closed)
