
[Bug]: AttributeError: 'list' object has no attribute 'on_error' #1505

Open
3 tasks done
NathanAP opened this issue Dec 12, 2024 · 24 comments
Labels
  • awaiting_response: Maintainers or community have suggested solutions or requested info, awaiting filer response
  • bug: Something isn't working
  • stale: Used by auto-resolve bot to flag inactive issues
  • triage: Default label assignment, indicates new issue needs review by a maintainer

Comments

@NathanAP

Do you need to file an issue?

  • I have searched the existing issues and this bug is not already filed.
  • My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
  • I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.

Describe the bug

Sometimes when I try to build_index, I'm getting the following:

index_result = await api.build_index(config=my_config) # this is me calling it
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/graphrag/api/index.py", line 73, in build_index
async for output in run_pipeline_with_config(
File "/usr/local/lib/python3.12/site-packages/graphrag/index/run/run.py", line 172, in run_pipeline_with_config
async for table in run_pipeline(
File "/usr/local/lib/python3.12/site-packages/graphrag/index/run/run.py", line 277, in run_pipeline
cast("WorkflowCallbacks", callbacks).on_error(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'on_error'

Not sure if there's anything I can do to fix it on my side...

Steps to reproduce

This is happening very rarely, tbh. I didn't find a consistent pattern that makes it appear.

Expected Behavior

Indexing should complete normally.

GraphRAG Config Used

# Paste your config here
async_mode: threaded
cache:
  base_dir: cache
  type: file
chunks:
  group_by_columns:
  - id
  overlap: 0
  size: 600
claim_extraction:
  description: Any claims or facts that could be relevant to information discovery.
  enabled: false
  max_gleanings: 1
  prompt: prompts/claim_extraction.txt
cluster_graph:
  max_cluster_size: 10
community_reports:
  max_input_length: 8000
  max_length: 2000
  prompt: prompts/community_report.txt
drift_search:
  prompt: prompts/drift_search_system_prompt.txt
embed_graph:
  enabled: false
embeddings:
  async_mode: threaded
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    model: text-embedding-3-small
    type: openai_embedding
  vector_store:
    container_name: default
    db_uri: output/lancedb
    overwrite: true
    type: lancedb
encoding_model: cl100k_base
entity_extraction:
  entity_types:
  - organization
  - person
  - geo
  - event
  max_gleanings: 1
  prompt: prompts/entity_extraction.txt
global_search:
  knowledge_prompt: prompts/global_search_knowledge_system_prompt.txt
  map_prompt: prompts/global_search_map_system_prompt.txt
  reduce_prompt: prompts/global_search_reduce_system_prompt.txt
input:
  base_dir: input
  file_encoding: utf-8
  file_pattern: .*\.txt$
  file_type: text
  type: file
llm:
  api_key: ${GRAPHRAG_API_KEY}
  model: gpt-4o-mini
  model_supports_json: true
  type: openai_chat
local_search:
  prompt: prompts/local_search_system_prompt.txt
parallelization:
  stagger: 0.3
reporting:
  base_dir: logs
  type: file
skip_workflows: []
snapshots:
  embeddings: false
  graphml: false
  transient: false
storage:
  base_dir: output
  type: file
summarize_descriptions:
  max_length: 500
  prompt: prompts/summarize_descriptions.txt
umap:
  enabled: false
update_index_storage: null

Logs and screenshots

No response

Additional Information

  • GraphRAG Version: 0.9.0
  • Operating System: Linux
  • Python Version: 3.12.7
  • Related Issues: didn't find anything like it
NathanAP added the bug and triage labels on Dec 12, 2024
@adhazel commented Dec 12, 2024

@NathanAP I created a graphrag project this morning and am running into the same problem. My steps are below, mostly following the Getting Started guide...

  1. Create a local folder
  2. pip install poetry
  3. poetry init
  4. poetry shell
  5. pip install graphrag
  6. mkdir -p ./ragtest/input
  7. curl https://www.gutenberg.org/cache/epub/24022/pg24022.txt -o ./ragtest/input/book.txt
  8. Modify .env file with my AOAI key
  9. Modify the settings.yaml file under the "llm:" block. (I modified the embedding block first and missed the first one.)
  10. graphrag index --root ./ragtest
  11. I got the error ***
  12. I modified the settings.yaml file under the FIRST llm block
  13. I got the error ***

Other than running the index first and failing once due to not having the correct settings, these are pretty clean repro steps.

Additional information:

  • GraphRAG Version v1.0.0
  • Operating System: Windows
  • Python Version: 3.11.9

@adhazel commented Dec 12, 2024

@NathanAP, my problem seemed to resolve itself once I uncommented and set the api_version values in the settings.yaml file. That said, I still think the run.py file will always error if callbacks is a list. Perhaps a type check or some handling would be good here, or another look; a sketch of one option follows the snippet.

    cast("WorkflowCallbacks", callbacks).on_error(
        "Error running pipeline!", e, traceback.format_exc()
    )
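A minimal sketch of such a guard, assuming a WorkflowCallbacks interface with an on_error method; the ensure_callbacks helper and _FanOut class below are illustrative, not GraphRAG's actual API:

    from typing import Protocol

    class WorkflowCallbacks(Protocol):
        def on_error(self, message: str, cause: BaseException | None, stack: str | None) -> None: ...

    def ensure_callbacks(callbacks: object) -> WorkflowCallbacks:
        # typing.cast() is a no-op at runtime, so a raw list slips through
        # unchanged and fails on attribute access; normalize it instead.
        if isinstance(callbacks, list):
            cbs = list(callbacks)

            class _FanOut:
                def on_error(self, message, cause=None, stack=None):
                    for cb in cbs:
                        cb.on_error(message, cause, stack)

            return _FanOut()
        return callbacks  # already a single callbacks object

With something like that in place, the call site could be ensure_callbacks(callbacks).on_error(...) instead of the bare cast.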

@NathanAP (Author)

In my case I'm always using the same file(s) to index, because we're using Playwright for some end-to-end tests. This morning I got this same error 3 times in a row, and then it never happened again...

TBH I have no idea if this is in our control to manage.

@alexfrocha

me too

@YepJin commented Dec 12, 2024

Same here, I just updated and found this bug.


This bug lasted for around 2 hours and now it disappeared, as if it never happened. :)

@natoverse (Collaborator)

Can someone with this error upload an indexing-engine.log so we can see the stack trace? I had the error myself yesterday with a largish dataset, and it was due to too many API requests (mine was on Azure OpenAI). It does not seem consistent, so it may be a temporary instability issue?

natoverse added the awaiting_response label on Dec 12, 2024
@xldistance

@natoverse settings.yaml

llm:
  # exllamav2
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat # or azure_openai_chat
  model: Rombos-LLM-V2.5-Qwen-32b-4.5bpw-exl2
  model_supports_json: true # recommended if this is available for your model.
  max_tokens: 16000
  api_base: http://127.0.0.1:5001/v1
  requests_per_minute: 5_000 # set a leaky bucket throttle
  max_retries: 10
  max_retry_wait: 0

graphrag 1.0 does not use the api_base inside llm and falls back to OpenAI's default api_base, resulting in model authentication errors. The error log is as follows:
11:34:31,347 httpx INFO HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 401 Unauthorized"
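
One way to rule GraphRAG out is to hit the local endpoint directly with the standard openai Python client; a quick sanity check assuming the server exposes an OpenAI-compatible API (model name and URL taken from the config above):

    from openai import OpenAI

    # Point the client at the local exllamav2 server instead of api.openai.com.
    client = OpenAI(base_url="http://127.0.0.1:5001/v1", api_key="dummy")
    resp = client.chat.completions.create(
        model="Rombos-LLM-V2.5-Qwen-32b-4.5bpw-exl2",
        messages=[{"role": "user", "content": "ping"}],
    )
    print(resp.choices[0].message.content)

If this succeeds while GraphRAG still posts to https://api.openai.com, that confirms the api_base setting is being ignored rather than the endpoint being down.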

@marvinmednick

I've just set up GraphRAG and was seeing this while trying to go through the Getting Started example. I reviewed the indexing-engine.log and found a couple of things:

My initial problem was that my deployment string was incorrect. This was resulting in a 404 error on the chat completion request while extracting the graph. The full URL with the issue was in the log, and I compared it to a working URL I had from test code and found my issue: I had the deployment name incorrect. I was using the Azure resource name when it needed to be the OpenAI deployment name (in my case gpt-4_1; I'm using Azure OpenAI).

I got through that and it made it further, but then I started getting the error again later in the process; now I'm seeing a 429 error (rate limit error).

For reference, both cases ended up with the list error in the log, but that looks like a secondary error that occurred while handling another error. In my case the first error was a KeyError on the key 'name'. I suspect the call to the LLM fails and so there is no data, which causes the KeyError (and then the list error occurs during the handling of the KeyError).
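
A toy illustration (not GraphRAG's actual code) of that masking effect: the AttributeError raised inside the except block is what surfaces, while the original KeyError is only visible as "During handling of the above exception, another exception occurred":

    callbacks = []  # imagine the callbacks object accidentally being a plain list

    try:
        {}["name"]  # the primary failure: a KeyError on 'name'
    except KeyError as e:
        # the error handler itself fails, producing the 'list' object error
        callbacks.on_error("Error running pipeline!", e, "")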

@xldistance

Refer to this fix
#1508

@bode135 commented Dec 13, 2024

same error, how to fix it?

@alexfrocha

yeah guys me too fr fr

@wodecki commented Dec 13, 2024

the same error for me...

@YepJin commented Dec 13, 2024

Can someone with this error upload an indexing-engine.log so we can see the stack trace? I had the error myself yesterday with a largish dataset, and it was due to too many API requests (mine was on Azure OpenAI). It does not seem consistent, so it may be a temporary instability issue?

Below is my indexing log; it is clean, as I only ran it once. I am using gpt-4o from OpenAI and the document is about some video descriptions. @natoverse

indexing-engine.log

@marvinmednick

@YepJin
I notice that the first error in your log is:
13:31:33,876 graphrag.index.graph.extractors.graph.graph_extractor ERROR error extracting graph
Traceback (most recent call last):
File "/home/klwonder/vscode/graphrag_v3/.venv/lib/python3.12/site-packages/graphrag/index/graph/extractors/graph/graph_extractor.py", line 127, in call
result = await self._process_document(text, prompt_variables)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/klwonder/vscode/graphrag_v3/.venv/lib/python3.12/site-packages/graphrag/index/graph/extractors/graph/graph_extractor.py", line 156, in _process_document
self._extraction_prompt.format(**{
KeyError: '"LOW AUTHENTIC VIDEO", \n"MODERATE LOW AUTHENTIC VIDEO","MEDIUM VIDEO","MODERATE HIGH AUTHENTIC VIDEO","HIGH AUTHENTIC VIDEO"'

I also notice that the KeyError seems to be for the entire string (everything between the single quotes), so maybe this is an issue with some entry in settings.yaml? Are these items (e.g. "LOW AUTHENTIC VIDEO", etc.) configured there in some way, or in a format it can't handle properly or doesn't like (are these your entity types?)... it seems to be taking what looks like it should be several different entries and treating them as one item.
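
For what it's worth, that is exactly how str.format() fails on a prompt containing literal braces: everything inside an unescaped {...} is read as a single placeholder name. A minimal sketch (the prompt text here is invented to mirror the log):

    # Unescaped braces: format() treats the entire brace contents as one key.
    template = 'Rate the video as one of {"LOW AUTHENTIC VIDEO","HIGH AUTHENTIC VIDEO"}. Text: {input_text}'
    try:
        template.format(input_text="some video description")
    except KeyError as e:
        print("KeyError:", e)

    # Doubling the braces escapes them, so only {input_text} is substituted.
    fixed = 'Rate the video as one of {{"LOW AUTHENTIC VIDEO","HIGH AUTHENTIC VIDEO"}}. Text: {input_text}'
    print(fixed.format(input_text="some video description"))

So if a custom prompt file contains literal JSON-style braces, escaping them as {{ and }} should avoid the KeyError.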

@YepJin commented Dec 13, 2024

I used my own prompts and did not limit the entity types. It worked perfectly before I updated (previously I used the version without Drift search).


natoverse removed the awaiting_response label on Dec 16, 2024
@Shenjingbang commented Dec 17, 2024

[Bug]: AttributeError: 'list' object has no attribute 'on_error'
This problem exists in graphrag 1.0.

Just change two things in settings.yaml.

Add the following under llm:

    api_base: https://api...... (your endpoint)
    api_proxy: https://api......

Add the following under the llm block in embeddings:

    api_base: https://api......
    proxy: https://api......

Thanks to Engineer Yang Qi for his efforts.

@bdytx5 commented Dec 17, 2024

This can happen if your API key is not valid for accessing the embedding API. Try generating a new key in OpenAI and retrying; this will be evident in the log file.
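
A quick way to check this outside GraphRAG with the standard openai Python client (model name taken from the config above; a 401 here points at the key rather than at GraphRAG):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.embeddings.create(model="text-embedding-3-small", input="ping")
    print(len(resp.data[0].embedding))  # 1536 dimensions for this model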

@NathanAP (Author)

Hmm, but then this error would be consistent, wouldn't it? I'm getting this error on about 30% of my index attempts; it's not happening every time. Seems like some instability.

@bdytx5 commented Dec 17, 2024

@NathanAP Sounds like a rate limiting issue?

@NathanAP (Author)

Seems like a bug in the GraphRAG code. I'm always using the same files.

@nightzjp

Has anyone solved this problem yet?

@natoverse (Collaborator)

We believe this was a bug introduced during our adoption of fnllm as the underlying LLM library. We just pushed out a 1.0.1 patch today; please let us know if your problem still exists with that version.

natoverse added the awaiting_response label on Dec 18, 2024
@win4r commented Dec 20, 2024

🔥 This can be solved by modifying the code: https://youtu.be/GRZ2th6s7uY


This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.

github-actions bot added the stale label on Dec 28, 2024