Question 2:
Which entity types are supported? How do I change them or declare new ones?
Example:
("entity"<|>"五千元罚款"<|>"legal clause No."<|>"指根据法律规定的对违法行为进行经济惩罚的方式之一。")##
("entity"<|>"构成犯罪"<|>"legal clause No."<|>"指行为触犯了刑法,应承担刑事责任。")##
Steps to reproduce
No response
GraphRAG Config Used
# Paste your config here
encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ollama
  type: openai_chat # or azure_openai_chat
  model: qwen2
  model_supports_json: true # recommended if this is available for your model.
  # max_tokens: 4000
  # request_timeout: 180.0
  api_base: http://10.108.246.106:11434/v1
  # api_version: 2024-02-15-preview
  # organization: <organization_id>
  # deployment_name: <azure_model_deployment_name>
  # tokens_per_minute: 150_000 # set a leaky bucket throttle
  # requests_per_minute: 10_000 # set a leaky bucket throttle
  # max_retries: 10
  # max_retry_wait: 10.0
  # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
  # concurrent_requests: 25 # the number of parallel inflight requests that may be made

parallelization:
  stagger: 0.3
  # num_threads: 50 # the number of threads to use for parallel processing

async_mode: threaded # or asyncio

embeddings:
  ## parallelization: override the global parallelization settings for embeddings
  async_mode: threaded # or asyncio
  llm:
    api_key: ollama
    type: openai_embedding # or azure_openai_embedding
    model: qwen2
    api_base: http://10.108.246.106:8080
    # api_version: 2024-02-15-preview
    # organization: <organization_id>
    # deployment_name: <azure_model_deployment_name>
    # tokens_per_minute: 150_000 # set a leaky bucket throttle
    # requests_per_minute: 10_000 # set a leaky bucket throttle
    # max_retries: 10
    # max_retry_wait: 10.0
    # sleep_on_rate_limit_recommendation: true # whether to sleep when azure suggests wait-times
    # concurrent_requests: 25 # the number of parallel inflight requests that may be made
    batch_size: 1 # the number of documents to send in a single request
    # batch_max_tokens: 8192 # the maximum number of tokens to send in a single request
    # target: required # or optional

chunks:
  size: 300
  overlap: 100
  group_by_columns: [id] # by default, we don't allow chunks to cross documents

input:
  type: file # or blob
  file_type: text # or csv
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\\.txt$"

cache:
  type: file # or blob
  base_dir: "cache"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

storage:
  type: file # or blob
  base_dir: "output/${timestamp}/artifacts"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

reporting:
  type: file # or console, blob
  base_dir: "output/${timestamp}/reports"
  # connection_string: <azure_blob_storage_connection_string>
  # container_name: <azure_blob_storage_container_name>

entity_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization,person,geo,vehicles,objects,acts,events,documents,legal clause No.,media,standard,legal term,attributes]
  max_gleanings: 0

summarize_descriptions:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  # enabled: true
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 0

community_report:
  ## llm: override the global llm settings for this task
  ## parallelization: override the global parallelization settings for this task
  ## async_mode: override the global async_mode settings for this task
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes
  # num_walks: 10
  # walk_length: 40
  # window_size: 2
  # iterations: 3
  # random_seed: 597832

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes

snapshots:
  graphml: false
  raw_entities: false
  top_level_nodes: false

local_search:
  # text_unit_prop: 0.5
  # community_prop: 0.1
  # conversation_history_max_turns: 5
  # top_k_mapped_entities: 10
  # top_k_relationships: 10
  # max_tokens: 12000

global_search:
  # max_tokens: 12000
  # data_max_tokens: 12000
  # map_max_tokens: 1000
  # reduce_max_tokens: 2000
  # concurrency: 32
Logs and screenshots
No response
Additional Information
GraphRAG Version: 0.1.1
Operating System: Ubuntu 22.04
Python Version: 3.10
Related Issues: qwen2:7b
yangxue-1 added the triage label on Jul 29, 2024
Is there an existing issue for this?
Describe the issue
Question 1:
After I translated the four prompt templates into Chinese, the extracted relationships come back in the wrong format.
Example 1:
("entity"<|>"新闻单位"<|>"media"<|>"指从事新闻采集、编辑和发布的媒体机构,如报纸、杂志、广播电台、电视台等。")##
("entity"<|>"出版单位"<|>"media"<|>"负责书籍、期刊、电子书等出版发行的机构。")##
("entity"<|>"广播电台"<|>"media"<|>"通过无线电波传播声音信息的媒体平台,通常用于新闻播报、音乐播放和节目主持。")##
("entity"<|>"电视台"<|>"media"<|>"利用电视信号传输图像与声音信息的媒体平台,提供新闻、娱乐、教育等各类节目内容。")##
("relationship"<|>"新闻单位"<|>"进行道路交通安全教育"<|>"新闻单位有义务通过报道和宣传来促进公众对道路交通安全的认识和理解。"<|>10)##
("relationship"<|>"出版单位"<|>"进行道路交通安全教育"<|>"出版物可以通过文章、书籍等形式传播道路交通安全知识,提高公众意识。"<|>10)##
("relationship"<|>""<|>"进行道路交通安全教育"<|>"通过广播节目向听众普及交通安全法规和常识。"<|>10)##
("relationship"<|>"电视台"<|>"进行道路交通安全教育"<|>"电视节目可以制作专题、访谈等形式,深入讲解交通安全知识与案例分析。"<|>10)##
Example 2:
("entity"<|>"车辆专用的或者与其相类似的标志图案"<|>"objects"<|>"指用于特定用途(如警车、消防车等)的车辆标识图案。")##
("entity"<|>"警报器"<|>"objects"<|>"一种发出高音警报声音以引起注意或警示的设备。")##
("entity"<|>"标志灯具"<|>"objects"<|>"用于提供视觉警示,通常在夜间或低能见度条件下使用,具有特定颜色和闪烁模式的灯具。")##
("relationship"<|>"上述车辆"<|>"vehicles"<|>"指包括但不限于警车、消防车等执行特殊任务的机动车辆。它们可能配备有专用标志图案、警报器或标志灯具以辅助其功能。"<|>10)##
("legal clause No."<|>"第二十五条"<|>"规定了全国实行统一的道路交通信号,包括交通信号灯、交通标志、交通标线和交通警察的指挥,并强调这些设施应符合安全畅通的要求及国家标准。")##
("legal clause No."<|>"第五十四条"<|>"允许道路养护车辆、工程作业车在进行作业时不受常规交通标志、标线限制,但要求过往车辆和人员注意避让以确保安全。")##
("acts"<|>"洒水"<|>"指对路面进行湿润或清洁的活动,可能与特定的道路维护或清洁任务相关联。")
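To make the format problems in the two examples above easier to spot, here is a minimal sketch of a structural check. It assumes the tuple layout used by the default English prompt: an entity record is ("entity", name, type, description) and a relationship record is ("relationship", source, target, description, strength), with `<|>` between fields and `##` between records. The helper names `split_records` and `check_record` are hypothetical and are not part of GraphRAG.

```python
TUPLE_DELIM = "<|>"
RECORD_DELIM = "##"

# Expected number of fields per record, counting the leading tag.
# Assumption: this mirrors the layout of the default English prompt.
EXPECTED_FIELDS = {"entity": 4, "relationship": 5}


def split_records(raw: str) -> list[str]:
    """Split raw extraction output into individual, non-empty records."""
    return [r.strip() for r in raw.split(RECORD_DELIM) if r.strip()]


def check_record(record: str) -> list[str]:
    """Return the structural problems found in one record (empty list if none)."""
    body = record.strip()
    if not (body.startswith("(") and body.endswith(")")):
        return ["record is not wrapped in parentheses"]

    # Drop the outer parentheses, split on the field delimiter, unquote each field.
    fields = [f.strip().strip('"') for f in body[1:-1].split(TUPLE_DELIM)]
    tag = fields[0]
    if tag not in EXPECTED_FIELDS:
        # e.g. ("acts"<|>...) where an entity type was used as the record tag
        return [f"unknown record tag {tag!r}"]

    problems = []
    if len(fields) != EXPECTED_FIELDS[tag]:
        problems.append(
            f"{tag} has {len(fields)} fields, expected {EXPECTED_FIELDS[tag]}"
        )
    # Fields 1 and 2 are name/type for entities and source/target for relationships.
    if any(f == "" for f in fields[1:3]):
        problems.append("empty name field")
    return problems


if __name__ == "__main__":
    sample = (
        '("entity"<|>"电视台"<|>"media"<|>"...")##'
        '("relationship"<|>""<|>"进行道路交通安全教育"<|>"..."<|>10)##'
        '("acts"<|>"洒水"<|>"...")'
    )
    for rec in split_records(sample):
        print(check_record(rec) or "ok", "<-", rec[:40])
```

A check like this flags the relationship with the empty source in Example 1 and the records in Example 2 that use an entity type ("legal clause No.", "acts") as the record tag. It does not catch the relationship in Example 2 whose target is "vehicles", because that record has the right number of fields and only its content is wrong.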