New course - genai-graphrag-python #420
Merged
17 commits
bef39cb
new course structure
martinohanlon 1850cc1
llm-knowledge-graph-construction updates
martinohanlon 401ce09
in progress updates
martinohanlon a9704bb
in progress updates
martinohanlon 2f8dfdc
in progress updates
martinohanlon ffd6920
in progress updates
martinohanlon f66d23d
in progress updats
martinohanlon 5433948
updates after walkthrough
martinohanlon 0c1a954
set image alts
martinohanlon 63e14b9
minor caption change
martinohanlon 97a774c
course summary
martinohanlon 73d15bd
QA review updates
martinohanlon 43f26df
minor update
martinohanlon 9479a76
lesson summary
martinohanlon 2bf98d8
updates post review
martinohanlon c864baf
update branch
martinohanlon 571da33
make course active
martinohanlon
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,8 +1,55 @@ | ||
| = Constructing Knowledge Graphs with Neo4j GraphRAG Python | ||
| :categories: llms:99 | ||
| = Constructing Knowledge Graphs with Neo4j GraphRAG for Python | ||
| :categories: llms:10, advanced:7, processing:5, generative-ai:4 | ||
| :status: active | ||
| :duration: 2 hours | ||
| :caption: Learn how to use Python and LLMs to convert unstructured data into knowledge graphs. | ||
| :usecase: blank-sandbox | ||
| :key-points: Create a knowledge graph using Neo4j GraphRAG for Python, Model a knowledge graph of structured and unstructured data, Query a knowledge graph using retrievers, Customize the knowledge graph build process | ||
| :repository: neo4j-graphacademy/genai-graphrag-python | ||
| :banner-style: light | ||
|
|
||
| In this course, you will learn how to: | ||
| == Course Description | ||
|
|
||
| * Use the `neo4j_graphrag` Python package to build graph retrieval augmented generation (GraphRAG) applications. | ||
| * Build pipelines to construct knowledge graphs from unstructured text. | ||
| * Combine semantic search and relationships to improve the quality of LLM generated responses. | ||
| In this hands-on course, you will learn how to create knowledge graphs using link:https://neo4j.com/docs/neo4j-graphrag-python/current/[Neo4j GraphRAG for Python^]. | ||
|
|
||
| You will: | ||
|
|
||
| * Use the `neo4j_graphrag` Python package to build knowledge graphs from unstructured data. | ||
| * Add structured data to the knowledge graph to improve LLM responses. | ||
| * Create retrievers to search the knowledge graph. | ||
| * Learn how you can customize the build process to suit your data and use case. | ||
|
|
||
| Finally, you will use what you have learned to build a knowledge graph from your data. | ||
|
|
||
| === Prerequisites | ||
|
|
||
| This is an advanced course and you should: | ||
|
|
||
| * Understand graph and Neo4j fundamental concepts - link:/courses/neo4j-fundamentals[Neo4j and Graph Fundamentals^]. | ||
| * Have an understanding of how Generative AI, LLMs, and vector indexes are related to Neo4j - link:/courses/genai-fundamentals[Neo4j & GenerativeAI Fundamentals^]. | ||
| * Be able to read and write simple Cypher queries - link:/courses/cypher-fundamentals[Cypher Fundamentals^]. | ||
| * Understand how you can use an LLM to generate a knowledge graph - link:/courses/llm-knowledge-graph-construction[https://graphacademy.neo4j.com/courses/llm-knowledge-graph-construction/^]. | ||
| * Have experience with programming in Python. | ||
|
|
||
| === Duration | ||
|
|
||
| {duration} | ||
|
|
||
| === What you will learn | ||
|
|
||
| How to: | ||
|
|
||
| * Use the Neo4j GraphRAG for Python package to create a knowledge graph from unstructured data. | ||
| * Enhance a knowledge graph by adding structured data. | ||
| * Create retrievers to search a knowledge graph. | ||
| * Customize the knowledge graph build process to suit your data and use case. | ||
| * Model a knowledge graph of both structured and unstructured data. | ||
|
|
||
|
|
||
|
|
||
| [.includes] | ||
| == This course includes | ||
|
|
||
| * [lessons]#16 lessons# | ||
| * [challenges]#7 hands-on challenges# | ||
| * [quizes]#8 simple quizzes to support your learning# |
159 changes: 159 additions & 0 deletions
asciidoc/courses/genai-graphrag-python/illustration.svg
Binary file added
BIN
+108 KB
...les/1-introduction/lessons/1-knowledge-graph-construction/images/neo4j-wiki.png
169 changes: 169 additions & 0 deletions
...ython/modules/1-introduction/lessons/1-knowledge-graph-construction/lesson.adoc
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,169 @@ | ||
| = Constructing Knowledge Graphs | ||
| :type: lesson | ||
| :order: 1 | ||
|
|
||
| In this lesson, you will review the process of constructing knowledge graphs from unstructured text using an LLM. | ||
|
|
||
| == The construction process | ||
|
|
||
| Typically, you would follow these steps: | ||
|
|
||
| . Gather the data | ||
| . Chunk the data | ||
| . _Vectorize_ the data | ||
| . Pass the data to an LLM to extract nodes and relationships | ||
| . Use the output to generate the graph | ||
|
|
||
| === Gather your data sources | ||
|
|
||
| The first step is to gather your unstructured data. | ||
| The data can be in the form of text documents, PDFs, publicly available data, or any other source of information. | ||
|
|
||
| Depending on the source, you may need to convert the data into a format (typically text) that the LLM can process. | ||
|
|
||
| The data sources should contain the information you want to include in your knowledge graph. | ||
|
|
||
| === Chunk the data | ||
|
|
||
| The next step is to break down the data into _right-sized_ parts. | ||
| This process is known as _chunking_. | ||
|
|
||
| The size of the chunks depends on the LLM you are using, the complexity of the data, and what you want to extract from the data. | ||
|
|
||
| You may not need to chunk the data if the LLM can process the entire document at once and it fits your requirements. | ||
|
|
||
| === Vectorize the data | ||
|
|
||
| Depending on your requirements for querying and searching the data, you may need to create *vector embeddings*. | ||
| You can use any embedding model to create embeddings for each data chunk, but the same model must be used for all embeddings. | ||
|
|
||
| Placing these vectors into a link:https://neo4j.com/docs/cypher-manual/current/indexes/semantic-indexes/vector-indexes/[Vector index^] allows you to perform semantic searches, similarity searches, and clustering on the data. | ||
|
|
||
| [TIP] | ||
| .Chunking, Vectors, and Similarity Search | ||
| You can learn more about how to chunk documents, vectors, similarity search, and embeddings in the GraphAcademy course link:https://graphacademy.neo4j.com/courses/llm-vectors-unstructured/1-introduction/2-semantic-search/[Introduction to Vector Indexes and Unstructured Data^]. | ||
|
|
||
| === Extract nodes and relationships | ||
|
|
||
| The next step is to pass the unstructured text data to the LLM to extract the nodes and relationships. | ||
|
|
||
| You should provide a suitable prompt that will instruct the LLM to: | ||
|
|
||
| - Identify the entities in the text. | ||
| - Extract the relationships between the entities. | ||
| - Format the output so you can use it to generate the graph, for example, as JSON or another structured format. | ||
|
|
||
| Optionally, you may also provide additional context or constraints for the extraction, such as the type of entities or relationships you are interested in extracting. | ||
|
|
||
|
|
||
| === Generate the graph | ||
|
|
||
| Finally, you can use the output from the LLM to generate the graph by creating the nodes and relationships within Neo4j. | ||
|
|
||
| The entity and relationship types would become labels and relationship types in the graph. | ||
| The _names_ would be the node and relationship identifiers. | ||
|
|
||
| == Example | ||
|
|
||
| If you wanted to construct a knowledge graph based on the link:https://en.wikipedia.org/wiki/Neo4j[Neo4j Wikipedia page^], you would: | ||
|
|
||
| . **Gather** the text from the page. + | ||
| + | ||
| image::images/neo4j-wiki.png["A screenshot of the Neo4j wiki page"] | ||
| . Split the text into **chunks**. | ||
| + | ||
| Neo4j is a graph database management system (GDBMS) developed | ||
| by Neo4j Inc. | ||
| + | ||
| {sp} | ||
| + | ||
| The data elements Neo4j stores are nodes, edges connecting them, | ||
| and attributes of nodes and edges... | ||
|
|
||
| . Generate **embeddings** and **vectors** for each chunk. | ||
| + | ||
| [0.21972137987, 0.12345678901, 0.98765432109, ...] | ||
|
|
||
| . **Extract** the entities and relationships using an **LLM**. | ||
| + | ||
| Send the text to the LLM with an appropriate prompt, for example: | ||
| + | ||
| Your task is to identify the entities and relations requested | ||
| with the user prompt from a given text. You must generate the | ||
| output in a JSON format containing a list with JSON objects. | ||
|
|
||
| Text: | ||
| {text} | ||
| + | ||
| Parse the entities and relationships output by the LLM. | ||
| + | ||
| [source, json] | ||
| ---- | ||
| { | ||
| "node_types": [ | ||
| { | ||
| "label": "GraphDatabase", | ||
| "properties": [ | ||
| { | ||
| "name": "Neo4j", "type": "STRING" | ||
| } | ||
| ] | ||
| }, | ||
| { | ||
| "label": "Company", | ||
| "properties": [ | ||
| { | ||
| "name": "Neo4j Inc", "type": "STRING" | ||
| } | ||
| ] | ||
| }, | ||
| { | ||
| "label": "ProgrammingLanguage", | ||
| "properties": [ | ||
| { | ||
| "name": "Java", "type": "STRING" | ||
| } | ||
| ] | ||
| } | ||
| ], | ||
| "relationship_types": [ | ||
| { | ||
| "label": "DEVELOPED_BY" | ||
| }, | ||
| { | ||
| "label": "IMPLEMENTED_IN" | ||
| } | ||
| ], | ||
| "patterns": [ | ||
| ["Neo4j", "DEVELOPED_BY", "Neo4j Inc"], | ||
| ["Neo4j", "IMPLEMENTED_IN", "Java"] | ||
| ] | ||
| } | ||
| ---- | ||
| . **Generate** the graph. | ||
| + | ||
| Use the data to construct the graph in Neo4j by creating nodes and relationships based on the entities and relationships extracted by the LLM. | ||
| + | ||
| [source, cypher, role=noplay nocopy] | ||
| .Generate the graph | ||
| ---- | ||
| MERGE (neo4jInc:Company {id: 'Neo4j Inc'}) | ||
| MERGE (neo4j:GraphDatabase {id: 'Neo4j'}) | ||
| MERGE (java:ProgrammingLanguage {id: 'Java'}) | ||
| MERGE (neo4j)-[:DEVELOPED_BY]->(neo4jInc) | ||
| MERGE (neo4j)-[:IMPLEMENTED_IN]->(java) | ||
| ---- | ||
|
|
||
|
|
||
|
|
||
| [.quiz] | ||
| == Check your understanding | ||
|
|
||
| include::questions/1-steps.adoc[leveloffset=+1] | ||
|
|
||
| [.summary] | ||
| == Lesson Summary | ||
|
|
||
| In this lesson, you learned how to construct a knowledge graph from unstructured text using an LLM. | ||
|
|
||
| In the next lesson, you will set up your development environment to build knowledge graphs using Python and Neo4j. |
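
The final "Generate the graph" step in the lesson above can be sketched in plain Python. This is a minimal illustration, not the course's or the `neo4j_graphrag` package's implementation. It assumes the LLM returns JSON shaped like the lesson example, and that each property's `name` holds the entity name, which is how the lesson's Cypher reads. The helper `build_merge_statements` and the `LLM_OUTPUT` string are hypothetical names introduced here.

```python
import json
import re

# Hypothetical LLM response, shaped like the JSON shown in the lesson example.
LLM_OUTPUT = """
{
  "node_types": [
    {"label": "GraphDatabase", "properties": [{"name": "Neo4j", "type": "STRING"}]},
    {"label": "Company", "properties": [{"name": "Neo4j Inc", "type": "STRING"}]},
    {"label": "ProgrammingLanguage", "properties": [{"name": "Java", "type": "STRING"}]}
  ],
  "relationship_types": [{"label": "DEVELOPED_BY"}, {"label": "IMPLEMENTED_IN"}],
  "patterns": [
    ["Neo4j", "DEVELOPED_BY", "Neo4j Inc"],
    ["Neo4j", "IMPLEMENTED_IN", "Java"]
  ]
}
"""

# Labels and relationship types are interpolated into the query text,
# so this sketch only accepts simple identifiers.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_merge_statements(data):
    """Return (cypher, parameters) pairs for the nodes and patterns in data."""
    # Assumption: each property "name" holds the entity's name.
    label_by_name = {}
    for node_type in data["node_types"]:
        label = node_type["label"]
        if not IDENTIFIER.match(label):
            raise ValueError(f"Unsafe label: {label!r}")
        for prop in node_type["properties"]:
            label_by_name[prop["name"]] = label

    allowed = {rel["label"] for rel in data["relationship_types"]}

    statements = []
    # One MERGE per entity, with the name passed as a query parameter.
    for name, label in label_by_name.items():
        statements.append((f"MERGE (n:{label} {{id: $id}})", {"id": name}))

    for source, rel_type, target in data["patterns"]:
        # Skip patterns that reference unknown entities or relationship types.
        if rel_type not in allowed or not IDENTIFIER.match(rel_type):
            continue
        if source not in label_by_name or target not in label_by_name:
            continue
        cypher = (
            f"MERGE (a:{label_by_name[source]} {{id: $source}}) "
            f"MERGE (b:{label_by_name[target]} {{id: $target}}) "
            f"MERGE (a)-[:{rel_type}]->(b)"
        )
        statements.append((cypher, {"source": source, "target": target}))
    return statements


statements = build_merge_statements(json.loads(LLM_OUTPUT))
for cypher, params in statements:
    print(cypher, params)
```

Each pair could then be sent to Neo4j, for example with the official Python driver's `session.run(cypher, params)`. Entity names travel as parameters, while labels and relationship types are validated before being placed in the query text. Using `MERGE` means re-running the statements does not duplicate nodes.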
27 changes: 27 additions & 0 deletions
...es/1-introduction/lessons/1-knowledge-graph-construction/questions/1-steps.adoc
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| [.question] | ||
| = 1. Knowledge graph construction steps | ||
|
|
||
| Which of the following steps could be considered **optional**? | ||
|
|
||
| * [ ] Gather your data sources | ||
| * [x] Chunk the data | ||
| * [x] _Vectorize_ the data | ||
| * [ ] Pass the data to an LLM to extract nodes and relationships | ||
| * [ ] Use the output to generate the graph | ||
|
|
||
| [TIP,role=hint] | ||
| .Hint | ||
| ==== | ||
| The essential parts of the process are obtaining the data to pass to the LLM and using the output to generate the graph. | ||
| ==== | ||
|
|
||
| [TIP,role=solution] | ||
| .Solution | ||
| ==== | ||
| The optional steps are: | ||
|
|
||
| * Chunk the data | ||
| * _Vectorize_ the data | ||
|
|
||
| It may not be necessary to chunk the data or vectorize it depending on the LLM you are using, the complexity of the data, and your requirements. | ||
| ==== |
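
The solution above treats chunking as optional, depending on whether the LLM can process the whole document. A minimal sketch of that decision follows, assuming a simple character budget stands in for the model's limit. The function `chunk_text` and its `max_chars` and `overlap` parameters are hypothetical, and fixed-size splitting is only one possible chunking strategy.

```python
def chunk_text(text, max_chars=500, overlap=50):
    """Split text into overlapping chunks, or return it whole if it fits."""
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")

    # Chunking is optional: a document within the budget stays in one piece.
    if len(text) <= max_chars:
        return [text]

    chunks = []
    start = 0
    while start < len(text):
        end = start + max_chars
        chunks.append(text[start:end])
        if end >= len(text):
            break
        # Step back by the overlap so context carries across chunk boundaries.
        start = end - overlap
    return chunks


short_doc = "Neo4j is a graph database management system."
long_doc = "a" * 100

print(len(chunk_text(short_doc)))         # stays as a single chunk
print(len(chunk_text(long_doc, 60, 20)))  # split into overlapping chunks
```

A real pipeline would more likely split on sentence or paragraph boundaries and measure size in tokens. The character budget here just keeps the sketch free of dependencies.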
15 changes: 15 additions & 0 deletions
asciidoc/courses/genai-graphrag-python/modules/1-introduction/module.adoc
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| = Introduction | ||
| :order: 1 | ||
|
|
||
| Welcome to Constructing Knowledge Graphs with Neo4j GraphRAG for Python. | ||
|
|
||
| == Module Overview | ||
|
|
||
| In this module, you will: | ||
|
|
||
| * Review the process of creating knowledge graphs from unstructured text. | ||
| * Set up a development environment to build your own knowledge graph. | ||
|
|
||
| If you are ready, let's get going! | ||
|
|
||
| link:./1-knowledge-graph-construction/[Ready? Let's go →, role=btn] |