[Ollama][Other] GraphRAG OSS LLM community support #339
Comments
Calm down, no need to yell. Looking at the logs, it looks like they are removing the port from the api_base. settings.yaml -> api_base: "http://127.0.0.1:5000/v1" |
Sorry for asking this, but I have the same error and I'm not sure what you're trying to say? At least I did try. |
This is annoying... I just tried switching to Ollama because my first attempt at running the solution against ChatGPT cost me $45 and did not work in the end... so I don't want to waste money testing things like that. I would rather take it slow and steady locally until I get the hang of it, and switch to a paid model if needed... How can we force the port to stay? I installed using pip install graphrag... I wish I knew which file to hack to keep the port intact. |
OLLAMA_HOST=127.0.0.1:11435 ollama serve ... now we just need to know which port graphrag is looking for |
Good news. I got it started. The key was to use the right config to set concurrent requests to 1:
|
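The exact config from that comment was not captured in this transcript. A minimal sketch of the relevant settings.yaml llm block, assuming an Ollama server on its default port (model name and URL are illustrative, not the poster's verified values):

llm:
  api_key: ollama                        # placeholder; Ollama ignores the key
  type: openai_chat
  model: llama3
  api_base: http://localhost:11434/v1    # keep the port in the URL
  model_supports_json: true
  concurrent_requests: 1                 # the setting that let indexing start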
I also managed to get the entity extraction working with Ollama. However, the embeddings seem to be trickier, since Ollama has no OpenAI-compatible API for embeddings. Has anyone found a workaround for this already? |
Is it the cause of this error after processing the entities:
|
I configured mine as:
|
The crash log states:
and the Ollama log shows:
|
Use /api instead of /v1 👍 At least I get an OK for the embedding, but the format seems wrong |
Yes, there is no embeddings endpoint under the v1 of the OpenAI compatible server within Ollama. They are actively working on this: ollama/ollama#5285 |
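For reference, a quick way to see the difference is to hit Ollama's native endpoint directly. This is only a sketch, assuming Ollama on its default port with the nomic-embed-text model pulled; the native route lives under /api, not /v1, and returns a bare list of floats rather than an OpenAI-style response object:

# sketch: query Ollama's native embeddings endpoint
import requests

resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "GraphRAG test sentence"},
    timeout=60,
)
resp.raise_for_status()
embedding = resp.json()["embedding"]  # plain list of floats, no OpenAI-style "data" wrapper
print(len(embedding))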
Indeed, so I also tried the normal API endpoint like @dx111ge, and I'm having the same problem with the embedding output |
I also figured the |
I did try all 3 different Ollama embedding models, same error |
The weird thing... I reverted the embeddings to be OpenAI... but it tries to connect to Ollama instead... like it is getting the api_base from the llm config for the entities... I wonder what the right api_base might be for OpenAI embeddings... maybe we need to set it if we use a custom one for the llm? |
OK, I have been able to specify the openai embeddings API (
Running out of memory on my 3090... I tried reducing the max_input_length to no avail:
|
Can you please explain what you did to fix the embeddings stuff? |
Here is my final config. Somehow, after VSCode crashed and I restarted it, the summary reports started working. This is the full config that works so far:
Essentially I use llama3 locally via Ollama for the entities and use OpenAI embeddings (much cheaper) until we have a solution to use Ollama; a rough sketch of that kind of setup follows this comment. |
I am sure the config could be optimised... but this is working at the moment... now I need to test the query part ;-) |
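The poster's actual settings.yaml is not reproduced above, so the following is only a sketch of that kind of split setup (llama3 through Ollama for entity extraction, OpenAI for embeddings); model names and endpoints are illustrative:

llm:
  api_key: ollama                          # placeholder; Ollama ignores it
  type: openai_chat
  model: llama3
  api_base: http://localhost:11434/v1      # local Ollama, OpenAI-compatible chat endpoint
  model_supports_json: true
  concurrent_requests: 1

embeddings:
  llm:
    api_key: ${GRAPHRAG_API_KEY}           # real OpenAI key, used for embeddings only
    type: openai_embedding
    model: text-embedding-3-small
    api_base: https://api.openai.com/v1    # set explicitly so it does not inherit the Ollama base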
Well... looks like I can't query the results. I keep getting VRAM errors on my 3090... so all this, only to not be able to query ;-( |
OpenAI's embeddings are quite expensive too... |
I figured out the issue with the query... somehow the YouTube video I was following was using the "wrong" syntax? Did not work: Worked: |
@bmaltais thanks a lot. That works. I think vllm has embeddings now, I will try that tonight for a fully local setup 👍 |
Quick update... Some of the issues I was having were related to the fact that my first attempt at running graphrag was leveraging chatgpt-4o. It ended up creating a lot of files in the cache folder that then got mixed with llama3-generated files. Overall this caused significant issues. After deleting the cache folder and re-indexing everything I was able to properly query the graph with:
I still have not found an easy solution to generating embeddings locally. |
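The exact command from that comment was not captured here, but the general CLI form (the same form that appears in the commands at the end of this thread) is:

python -m graphrag.query --root . --method global "your question"
python -m graphrag.query --root . --method local "your question"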
I used your settings and the default text, and did not change anything else, but still get ❌ create_final_entities
None
⠹ GraphRAG Indexer
├── Loading Input (InputFileType.text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_base_text_units
├── create_base_extracted_entities
├── create_summarized_entities
├── create_base_entity_graph
└── create_final_entities
❌ Errors occurred during the pipeline run, see logs for more details. |
Any success using a vLLM inference endpoint for local LLMs? |
you will need to serve ollama first |
Does anyone else face the situation where global query works but local query doesn't? |
I followed your steps but am still getting the same error; it's showing Error Invoking LLM |
"Error Invoking LLM"-- fixed by using LM studio in embedding part. successfully build the graph, BUT can global search , CAN NOT local search: ZeroDivisionError: Weights sum to zero, can't be normalized |
[4 rows x 10 columns] |
same issue |
Consolidating Ollama-related issues: #657 |
remove cache |
How do I solve this text_embed problem? I have the same problem. The whole error log is as follows: datashaper.workflow.workflow ERROR Error executing verb "text_embed" in create_final_entities: Error code: 400 - {'object': 'error', 'message': 'NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.\n\n(The expanded size of the tensor (513) must match the existing size (512) at non-singleton dimension 1. Target sizes: [4, 513]. Tensor sizes: [1, 512])', 'code': 50001} |
did you solve it? |
@bmaltais hi, I don't understand what value should be set for the api_key in your example. Can you tell me more about it? Thanks. |
If you want to use open-source models, I've put together a repository for deploying models from HuggingFace to local endpoints that expose an OpenAI-API-compatible format. Here's the link to the repo: https://github.com/rushizirpe/open-llm-server Also, I have created a Colab notebook (working for global as well as local search) for GraphRAG: https://colab.research.google.com/drive/1uhFDnih1WKrSRQHisU-L6xw6coapgR51?usp=sharing |
I use Ollama as my local LLM API provider, and set the chat model and embedding model api_base to the same value. Running a local query (the query string means "Who is Ye Wenjie"):
(graphRAG) D:\Learn\GraphRAG>python -m graphrag.query --root ./newTest09 --method local "谁是叶文洁"
INFO: Reading settings from newTest09\settings.yaml
creating llm client with {'api_key': 'REDACTED,len=6', 'type': "openai_chat", 'model': 'qwen2', 'max_tokens': 4000, 'request_timeout': 180.0, 'api_base': 'http://localhost:11434/v1/', 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': True, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}
creating embedding llm client with {'api_key': 'REDACTED,len=6', 'type': "openai_embedding", 'model': 'nomic-embed-text', 'max_tokens': 4000, 'request_timeout': 180.0, 'api_base': 'http://localhost:11434/v1/', 'api_version': None, 'organization': None, 'proxy': None, 'cognitive_services_endpoint': None, 'deployment_name': None, 'model_supports_json': None, 'tokens_per_minute': 0, 'requests_per_minute': 0, 'max_retries': 10, 'max_retry_wait': 10.0, 'sleep_on_rate_limit_recommendation': True, 'concurrent_requests': 25}
Error embedding chunk {'OpenAIEmbedding': "Error code: 400 - {'error': {'message': 'invalid input type', 'type': 'api_error', 'param': None, 'code': None}}"}
Traceback (most recent call last):
File "D:\ProgramData\miniconda3\envs\graphRAG\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "D:\ProgramData\miniconda3\envs\graphRAG\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "D:\ProgramData\miniconda3\envs\graphRAG\lib\site-packages\graphrag\query\__main__.py", line 75, in <module>
run_local_search(
File "D:\ProgramData\miniconda3\envs\graphRAG\lib\site-packages\graphrag\query\cli.py", line 154, in run_local_search
result = search_engine.search(query=query)
File "D:\ProgramData\miniconda3\envs\graphRAG\lib\site-packages\graphrag\query\structured_search\local_search\search.py", line 118, in search
context_text, context_records = self.context_builder.build_context(
File "D:\ProgramData\miniconda3\envs\graphRAG\lib\site-packages\graphrag\query\structured_search\local_search\mixed_context.py", line 139, in build_context
selected_entities = map_query_to_entities(
File "D:\ProgramData\miniconda3\envs\graphRAG\lib\site-packages\graphrag\query\context_builder\entity_extraction.py", line 55, in map_query_to_entities
search_results = text_embedding_vectorstore.similarity_search_by_text(
File "D:\ProgramData\miniconda3\envs\graphRAG\lib\site-packages\graphrag\vector_stores\lancedb.py", line 118, in similarity_search_by_text
query_embedding = text_embedder(text)
File "D:\ProgramData\miniconda3\envs\graphRAG\lib\site-packages\graphrag\query\context_builder\entity_extraction.py", line 57, in <lambda>
text_embedder=lambda t: text_embedder.embed(t),
File "D:\ProgramData\miniconda3\envs\graphRAG\lib\site-packages\graphrag\query\llm\oai\embedding.py", line 96, in embed
chunk_embeddings = np.average(chunk_embeddings, axis=0, weights=chunk_lens)
File "D:\ProgramData\miniconda3\envs\graphRAG\lib\site-packages\numpy\lib\function_base.py", line 550, in average
raise ZeroDivisionError(
ZeroDivisionError: Weights sum to zero, can't be normalized
I suspect the embedding process isn't working correctly, because it mentions a code 400 error from the OpenAIEmbedding API. But the indexing process seems to work fine:
⠋ GraphRAG Indexer
├── Loading Input (text) - 1 files loaded (0 filtered) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:00:00 0:00:00
├── create_base_text_units
├── create_base_extracted_entities
├── create_summarized_entities
├── create_base_entity_graph
├── create_final_entities
├── create_final_nodes
├── create_final_communities
├── join_text_units_to_entity_ids
├── create_final_relationships
├── join_text_units_to_relationship_ids
├── create_final_community_reports
├── create_final_text_units
├── create_base_documents
└── create_final_documents
🚀 All workflows completed successfully.
As I understand it, the indexing process also needs to do embedding, so how come it doesn't work for local search? |
I had the same problem and I was wondering if anyone had completely solved it?
and my settings.yaml:
embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
Then I ran the command: ... ❌ create_final_entities
I checked the error log: |
|
The locally running embedding model in Ollama returns the embedding values in an incorrect format. OpenAI internally uses base64-encoded floats, whereas most other models return floats as plain numbers. This is working: https://github.com/rushizirpe/open-llm-server |
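One workaround along those lines (a sketch, not a verified solution from this thread) is a tiny proxy that exposes an OpenAI-style /v1/embeddings route, forwards each input to Ollama's native /api/embeddings endpoint, and reshapes the reply into the structure the OpenAI client expects. It also decodes pre-tokenized input (lists of cl100k_base token ids), which the OpenAI client may send and which is what produces the "invalid input type" 400 seen above. It assumes Ollama on its default port and the nomic-embed-text model; the file name and port are hypothetical:

# proxy_embeddings.py -- illustrative sketch only
import requests
import tiktoken
from flask import Flask, jsonify, request

app = Flask(__name__)
OLLAMA_URL = "http://localhost:11434/api/embeddings"  # Ollama's native endpoint
ENC = tiktoken.get_encoding("cl100k_base")            # to decode pre-tokenized input

@app.post("/v1/embeddings")
def embeddings():
    body = request.get_json()
    raw = body.get("input", [])
    if isinstance(raw, str):
        raw = [raw]
    if raw and isinstance(raw[0], int):   # a single pre-tokenized request
        raw = [raw]
    texts = [ENC.decode(item) if isinstance(item, list) else item for item in raw]

    data = []
    for i, text in enumerate(texts):
        r = requests.post(
            OLLAMA_URL,
            json={"model": body.get("model", "nomic-embed-text"), "prompt": text},
            timeout=120,
        )
        r.raise_for_status()
        data.append({"object": "embedding", "index": i, "embedding": r.json()["embedding"]})

    return jsonify({
        "object": "list",
        "data": data,
        "model": body.get("model", "nomic-embed-text"),
        "usage": {"prompt_tokens": 0, "total_tokens": 0},
    })

if __name__ == "__main__":
    app.run(port=8020)

With something like this running, the embeddings llm api_base in settings.yaml would point at http://localhost:8020/v1 instead of at Ollama directly.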
I have read the comment above, where the embedding uses OpenAI, but I hope it can be fully local. :-) |
If you take a look at the notebook you'll find it uses nomic-ai/nomic-embed-text-v1.5, as mentioned in the .yaml config (you can use any valid model that can be loaded from HuggingFace). You only need a GROQ API key for chat completion if you don't have a higher-end GPU; if you do have one, you just need to replace the API endpoint with http://localhost:1234/v1 and the model name (from HF) you want to use. |
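A sketch of what the embeddings section of settings.yaml might look like against such a local OpenAI-compatible server; the endpoint and model name below are the ones mentioned in the comment above, but treat them as placeholders for your own setup:

embeddings:
  llm:
    api_key: not-needed                      # local server; any placeholder works
    type: openai_embedding
    model: nomic-ai/nomic-embed-text-v1.5    # HuggingFace model name from the notebook
    api_base: http://localhost:1234/v1       # local OpenAI-compatible endpoint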
oh...are you actually replacing ollama with open llm server in this notebook? |
Yes, hope it helps! |
I am experiencing the same issue here. From my understanding, Ollama now has an OpenAI-compatible API endpoint. Note:
In the beginning I used ... The way to solve this is:
I specified 4096 as ... As for ... With these settings, indexing and global query work smoothly. Local query still has issues. |
The local query issue is solved by #451 (comment) and https://github.com/microsoft/graphrag/pull/568/files, but the query results don't seem very good. Update:
|
regarding generating embeddings with ollama.
At this point |
did you find a solution? |
What I tried:
I ran this on my local GPU and tried replacing the api_base in the settings.yaml file with a model served on Ollama.
model: llama3:latest
api_base: http://localhost:11434/v1 #https://.openai.azure.com
Error:
graphrag.index.reporting.file_workflow_callbacks INFO Error Invoking LLM details={'input': '\n-Goal-\nGiven a text document that is pot....}
Commands:
#initialize
python -m graphrag.index --init --root .
#index
python -m graphrag.index --root .
#query (global)
python -m graphrag.query --root . --method global "query"
#query (local)
python -m graphrag.query --root . --method local "query"
Does GraphRAG support other LLM hosting server frameworks?