
Introduction to the repository

This repository provides a tutorial on how to use n8n (a workflow automation tool) together with a vector database (Qdrant) and Ollama to create a custom knowledge retrieval-augmented generation (RAG) chatbot. The chatbot answers questions based on a predefined persona, which is stored in a text file and embedded into vectors using an embedding model. The vectors are stored in the Qdrant vector database, which allows for efficient similarity search and retrieval of relevant information. The chatbot can then use this information to generate responses based on the persona's background, experiences, and needs. The tutorial is roughly based on Mihai Farcas' blog post on how to build a custom knowledge RAG chatbot using n8n.

This tutorial is part of the research seminar "Nah am Nutzen" for architecture students at Design Computation, Department of Architecture, RWTH Aachen University. The seminar explores the use of AI in architecture and design, and this repository serves as a practical example of how architects and designers can integrate AI tools into their workflow to consider the needs and experiences of people in their designs.

The following steps are covered in this repository:

  1. Setting up the prerequisites: Install Docker and Docker Compose.
  2. Setting up the environment: Create a .env file with the necessary environment variables.
  3. Running the docker compose: Start the n8n, Qdrant, and Ollama services using Docker Compose.
  4. Setting up the vector database: Create a collection in Qdrant.
  5. Creating an n8n workflow to populate the vector database: Use n8n to extract text from documents and store it in Qdrant.
  6. Creating an n8n workflow to chat with an AI Agent: Build a workflow that allows you to chat with an AI agent using the data stored in Qdrant.
  7. On your own: Suggestions for further exploration and experimentation with the workflow and the tools used.
  8. Some remarks: Additional information and considerations regarding the tools and workflows used in this tutorial.
  9. Troubleshooting: Common issues and solutions when setting up the environment and running the workflows.

Prerequisites

  • IDE (optional): While not strictly necessary, using an IDE like Visual Studio Code can make it easier to manage your code and environment files.

  • Docker: Make sure you have Docker installed on your machine. You can download it from Docker's official website.

  • Computer Performance: Keep in mind that running multiple services and AI models locally may require a decent amount of RAM and CPU power. Don't just download any model, but choose one that fits your machine's capabilities. Ollama states in its repository:

    "You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models." Source

Setting up the prerequisites

First we need to install Docker. Docker is a platform that allows you to run applications in containers, which are lightweight and portable. It lets us run different applications and services without worrying about dependencies and configurations (at least for the most part). Docker Compose is a tool and file format that lets you write a simple YAML file defining which applications and services you want to run. In our case, we want to use Docker Compose to run n8n, Qdrant, and Ollama together.

You can download Docker from Docker's official website.
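Once installed, you can quickly verify that both Docker and the Compose plugin are available from your terminal:

docker --version          # prints the Docker Engine version
docker compose version    # prints the Compose plugin version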

Setting up the environment

Before we can run the applications, we first need to set up some environment variables. These variables are used by the applications to configure themselves. For example, some applications such as n8n need a username and password for their web interface. We can simply create a .env file in the root of the repository and add the following variables:

QDRANT_API_KEY=your-qdrant-api-key
[email protected]
N8N_PASSWORD=SuperSecurePassword1

Don't forget to replace the strings after the = with your own values.

Please note: Environment variables are sensitive information. Do not share them publicly or commit them to a public repository. Even though this is a simple test repository, it is still good practice to keep your environment variables secure. Therefore, we always accompany our repositories with a .gitignore file that excludes the .env file from being uploaded to the repository.

Running the docker compose

To run the applications, we need to tell Docker to start the services defined in the docker-compose.yml file. This file is already provided in the repository and contains the configuration for n8n, Qdrant, and Ollama. The setup for n8n and Qdrant is quite straightforward, but Ollama requires a bit more configuration if we want to preload specific models. Therefore, we set up a dockerfile that includes the models we want to use: llama3.2:3b for text generation and nomic-embed-text:v1.5 for embedding the text into vectors. To start the services, open a terminal in the root of the repository and run the following command:

docker compose up -d

This command will start the services in detached mode, meaning they will run in the background. You can check the status of the services in Docker Desktop or by running the following command in the terminal:

docker compose ps
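If you are curious how the model preloading mentioned above works: the Ollama service is typically wrapped with a small startup script that pulls the required models on first start. A minimal sketch of such a script (hypothetical, the repository's actual file may differ):

#!/bin/sh
# Start the Ollama server in the background
ollama serve &
# Crude wait for the server to come up; a real script would poll the API
sleep 5
# Pull the models used in this tutorial
ollama pull llama3.2:3b
ollama pull nomic-embed-text:v1.5
# Keep the server process in the foreground
wait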

In order to check what models are available in Ollama, you can run the following command:

docker exec -it ollama ollama list

If you want to add more models, you can run the following command in the terminal:

docker exec -it ollama ollama pull <model-name>
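For example, to fetch the smaller 1B variant of the chat model (assuming it is available in the Ollama library; check first):

docker exec -it ollama ollama pull llama3.2:1b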

A list of available models can be found on the Ollama website. Keep the performance considerations in mind!

Setting up the vector database

In order to use the Qdrant vector database in our workflows, we first need to create a collection. A collection is a logical grouping of vectors that can be queried together. We can create a collection using the Qdrant API or the Qdrant web interface. In this tutorial, we will only use the web interface, but you can also use the API if you prefer. To create a collection, open the Qdrant web interface in your browser at http://localhost:6333/dashboard. You should see a page like this, where you are prompted to enter your API key (set in the .env file):

Enter the API key in Qdrant

In Qdrant, we create a new collection using the Console tab in the left sidebar.

Creating a new collection in Qdrant via the console

Delete all existing code in the console and write the following code to create a new collection called profile-collection (the name is taken from the URL path) with vectors of 768 dimensions and the cosine distance metric:

PUT /collections/profile-collection
{
  "vectors": {
    "size": 768,
    "distance": "Cosine"
  }
}

The size of the vector must match the dimensions of the embedding model we are using. In this case we are using the nomic-embed-text:v1.5 model, which produces 768-dimensional vectors. The cosine distance metric is commonly used for text embeddings, as it measures the similarity between vectors based on their angle rather than their magnitude. In order to see how many dimensions our embedding model has, we can use the following command in the terminal:

docker exec -it ollama ollama show nomic-embed-text:v1.5

The value of embedding length in the output will tell you how many dimensions the embedding model has. In this case, it should be 768. Once you have entered the code, click on the "Run" button to execute the command. If everything is set up correctly, you should see a response indicating that the collection has been created successfully.
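Alternatively, you can request an embedding directly from the Ollama API and count its dimensions. A quick sketch, assuming curl and jq are available on your machine:

curl -s http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text:v1.5", "prompt": "test"}' \
  | jq '.embedding | length'
# should print 768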

Keep in mind: This is a very simple tutorial. We don't provide a workflow to update the collection or to upload only new documents. If you want to play around with the vector database and upload new documents, don't forget to delete the existing collection first and recreate it using the same code as above. Qdrant saves the code in the console, so you can simply click the "Run" button again to recreate the collection.
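If you prefer the terminal for this reset, the same two steps can also be done via the Qdrant HTTP API. A sketch, assuming the default port and the API key from your .env file:

# Delete the old collection
curl -s -X DELETE "http://localhost:6333/collections/profile-collection" \
  -H "api-key: your-qdrant-api-key"

# Recreate it with the same settings as above
curl -s -X PUT "http://localhost:6333/collections/profile-collection" \
  -H "api-key: your-qdrant-api-key" \
  -H "Content-Type: application/json" \
  -d '{"vectors": {"size": 768, "distance": "Cosine"}}'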

If you have more questions regarding Qdrant, you can check out the Qdrant documentation.

Creating an n8n workflow to populate the vector database

Now it is time to create our first workflow in n8n. We want to use the example data provided in the data folder: a very basic description of a persona for our chatbot. The chatbot should react to our messages based on the experiences and needs of this persona, as defined in the text file. Therefore, we first need to embed the text into vectors and store them in the Qdrant vector database.

Open n8n in your browser at http://localhost:5678. It will ask you to set up a username and password.

Creating a user for n8n

Use the values from the .env file that we created earlier. We don't need to store the credentials in the .env file, but this way we will not lose them 😉. Once you are logged in, click on Create from scratch to create a new workflow. You should see a blank canvas such as this:

The n8n canvas after creating a new workflow

Click on the + icon and select the Trigger manually node. This will allow us to execute our workflows by clicking the "Execute Workflow" button in the n8n interface. But as of now there is not much to execute 😉.

Next, we need to specify the data that we want to embed and store. Click on the small + icon and search for Read/Write Files from Disk. Click on it, and in the following screen click on Read File(s) from Disk. A window opens where we can set the file path to the text file. Since our Docker container runs in a separate environment, we cannot simply point to files on our local machine. Instead, we need to point to files inside the Docker container. In our case, we have mounted the data folder from the root of the repository to the /data folder in the Docker container, so we can simply use the path /data/example-profile.txt to access the text file.

Setting up the file path in the Read/Write Files from Disk node

If you want to use different files, you can simply add them to the data folder in the root of the repository and use the corresponding path in the n8n node.

Files in the Docker Container
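You can also verify the mount from your terminal by listing the directory inside the container (a sketch, assuming the n8n service is named n8n in the compose file):

docker exec -it n8n ls /data
# should list example-profile.txt and any files you added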

You can check whether the file was read correctly by clicking on the Execute step button in the top right corner of the n8n window.

Next, we want to embed the text into vectors and store it in Qdrant. Create a new node by clicking on the + icon and enter Qdrant in the search field. Select the Qdrant Vector Store and choose the Add documents to vector store action. In the following screen, we first need to set up the connection to our vector database. Click on Select Credential and then on Create new credential. In the new window, enter the URL where Qdrant is running, which is http://qdrant:6333 in our case, and the API key we set up in the .env file. Note that inside the Docker network, the services reach each other via their service names (qdrant, ollama), not via localhost. Click on Save to check that everything is set up correctly. A common issue is a typo in the API key or URL, or a missing http:// in front of the URL. If everything is set up correctly, you should see a green checkmark next to the Connection field and can close the window. Back in the Qdrant Vector Store node, we need to select the collection we created earlier. n8n automatically detects all collections in our vector database, so you should see profile-collection in the dropdown menu. If you don't see it, make sure that you have created the collection correctly and that the connection is set up properly.

Setting up the connection to the Qdrant Vector Store

Our workflow should now look like this:

The workflow so far: A Manual Trigger, a File Loader and a Qdrant Vector Store

Now we need to set up the embedding model that we want to use. Click on the + icon connected to the Embedding output of the Qdrant Vector Store node and search for Embeddings Ollama. A new window opens where we need to set up the connection to Ollama. Click on Select Credential and then on Create new credential. In the new window, enter the URL where Ollama is running, which is http://ollama:11434 in our case. Click on Save to check that everything is set up correctly and close the window. In the Model field, select the embedding model we want to use, nomic-embed-text:v1.5 in our case. If you don't see it in the dropdown menu, make sure that you have pulled the model in Ollama and that the connection is set up properly.

Select the nomic-embed-text model for the embeddings

For the Document output, we need to select a Default Data Loader. This data loader checks our workflow for previously loaded data and makes it available for the next steps in the workflow. To work with our text file, set the Type of Data to Binary. The data format is normally detected automatically; if there is an issue with that, choose the Text format.

Setting up the Default Data Loader node

Since we don't want to upload the entire file as one single vector, we need to split the text into smaller chunks. We can do this by adding a Text Splitter. We use the Recursive Character Text Splitter to split the text into smaller chunks based on a specific number of characters. In the following screen, set the Chunk Size to 1000 characters and the Chunk Overlap to 200 characters. This means that each chunk will have a maximum size of 1000 characters and will overlap with the previous chunk by 200 characters. For example, the second chunk starts 200 characters before the end of the first, so a sentence cut at a chunk boundary still appears in full in at least one chunk. This ensures that we don't lose context when splitting the text.

Our workflow should now look like this: Final workflow

When we now click on the Execute Workflow button in the top right corner of the n8n window, it should read the text file, embed the text into vectors and store them in the Qdrant vector database. You can check the Qdrant web interface to see if the vectors have been stored correctly. Open the Qdrant web interface in your browser at http://localhost:6333/dashboard and select the profile-collection collection. You should see a list of vectors that have been stored in the collection.

Screenshot of the Qdrant collection filled with data
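If you prefer the terminal, you can also page through the stored points with Qdrant's scroll endpoint. A sketch, assuming the default port and jq for pretty-printing:

curl -s -X POST "http://localhost:6333/collections/profile-collection/points/scroll" \
  -H "api-key: your-qdrant-api-key" \
  -H "Content-Type: application/json" \
  -d '{"limit": 3, "with_payload": true, "with_vector": false}' | jq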

Creating an n8n workflow to chat with an AI Agent

To create a chatbot that can interact with our vector database, we need to create another workflow in n8n. For ease of use, we can simply start from the workflow we just created. Right-click on the canvas and select the Chat Trigger node. This allows us to trigger a workflow by sending a message to the chatbot.

From the trigger node, we click on the + icon and search for the AI Agent node. An AI Agent is a special node that lets us interact with an AI model that can process natural language requests and respond accordingly. It can also use different Tools that we provide to it, such as the Qdrant vector database. If we provide multiple tools, the AI Agent decides which tool to use based on the request and the descriptions of the tools. If a tool needs specific input parameters, the Agent tries to generate these from the information provided in the request. To work, the AI Agent only needs a Chat Model that can process our natural language requests. Additionally, we can provide a Memory to the AI Agent, which remembers the last n messages of the conversation. This helps the LLM understand the context of the conversation. But be careful with increasing the number of messages, as this increases the number of tokens the LLM has to process, which can lead to higher costs and longer response times.

In the AI Agent screen, n8n detects the input prompt from our trigger. Nevertheless, we want to provide our AI Agent with a system prompt that describes the persona we want to use in our chatbot. Therefore, we click on the Add Option button and select the System Message option. In there, we add the following text:

You are Jonas Schmidt, a 38-year-old man living in Aachen who has been experiencing homelessness for the past two years. Your life story and character are defined by the following points:

1. **Background and Circumstances**  
   - You trained and worked as an electrician for many years.  
   - A serious job-related injury eventually led to prolonged medical treatment, loss of work, and financial distress.  
   - After losing your apartment, you have been using emergency shelters and receiving support from charity organizations like Caritas.  
   - You are actively seeking stable housing and hoping to return to electrician work.  

2. **Personality and Traits**  
   - **Resilient**: Despite hardships, you remain determined and refuse to let your homelessness define you.  
   - **Open-Minded**: You are receptive to new ideas and opportunities, especially those that could improve shelter conditions or help you regain employment.  
   - **Slightly Reserved but Approachable**: You may seem shy at first, but once comfortable, you become friendly and value genuine human connection.  
   - **Hopeful**: You firmly believe that with community support and personal determination, you can rebuild your life.  
   - **Supportive**: Having faced your own challenges, you empathize with others in difficult situations and are willing to offer an open ear or encouragement.  
   - **Eager Learner**: You participate in workshops and training programs as stepping stones toward resuming your electrician career.

3. **Goals and Motivation**  
   - You want to achieve stable housing and financial independence.  
   - You are committed to improving your technical skills and finding ways to get back into the electrical trade.  
   - You hope one day to mentor others facing similar hardships, sharing what you have learned.

**Roleplay Guidelines:**

- Speak from the first-person perspective as Jonas Schmidt.  
- Incorporate your background and experiences into your responses: refer to your injury, financial struggles, appreciation for shelter resources, and ongoing efforts to find stable housing and work.  
- Demonstrate your resilience and optimism, while acknowledging the challenges of homelessness.  
- Show empathy and support for others facing similar difficulties; you understand their struggles.  
- When discussing potential improvements to shelter conditions or ideas for getting back into electrical work, express your willingness to learn, collaborate, and find creative solutions.  

**Answer user questions or engage in conversation from Jonas Schmidt’s viewpoint**, maintaining consistency with the biographical details, personality traits, and life situation outlined above.  

Setting up the AI agent with a system message

Close the window and connect the Chat Model output with an Ollama Chat Model node, where you select llama3.2:3b as the model. The Memory output can be connected to a Simple Memory node, where you can set the Max Messages to 5. This means that the AI Agent will remember the last 5 messages in the conversation and use them to generate a response.

The workflow so far: A Chat trigger connected with an AI Agent that has a Chat Model and a Memory
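Before wiring in the vector store, it can help to confirm that the chat model responds at all, directly against the Ollama API. A sketch, assuming the default port and jq:

curl -s http://localhost:11434/api/chat -d '{
  "model": "llama3.2:3b",
  "messages": [{"role": "user", "content": "Say hello in one sentence."}],
  "stream": false
}' | jq -r '.message.content'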

The last step is to connect the Agent to the Qdrant vector database. We can do this by connecting the Tools output with the Qdrant Vector Store node. In there, set a name for the tool and provide a description of what the tool does, so the AI Agent knows how to use it. In our case, we can use the following description:

This tool allows you to search for relevant information in the Qdrant vector database. You can use it to find information about the persona Jonas Schmidt, his background, experiences, and needs. The tool will return the most relevant information based on the query you provide.

Setting up the Qdrant Vector Store tool

For the Embedding output, we need to connect the exact same embedding model we used in our previous workflow. Therefore, we can simply copy the Embeddings Ollama node from the previous workflow and connect it to the Embedding output of the Qdrant Vector Store node. This is needed so the AI Agent can embed the query (the user's message) into a vector with the same dimensions as the vectors stored in Qdrant. This way, Qdrant can run a similarity search and return the most relevant vectors for the query. The AI Agent then generates a response based on the retrieved information and the system prompt we provided earlier.

Final Workflow
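To see what the agent's retrieval step does under the hood, you can reproduce it manually: embed a query with the same Ollama model, then search Qdrant with the resulting vector. A sketch, assuming curl and jq and the default ports:

# 1. Embed a test question with the same model n8n uses
VEC=$(curl -s http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text:v1.5", "prompt": "What happened to Jonas?"}' \
  | jq -c '.embedding')

# 2. Run a similarity search in Qdrant with that vector
curl -s -X POST "http://localhost:6333/collections/profile-collection/points/search" \
  -H "api-key: your-qdrant-api-key" \
  -H "Content-Type: application/json" \
  -d "{\"vector\": $VEC, \"limit\": 3, \"with_payload\": true}" | jq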

We can test our workflow by entering a message in the chat input field. If the chat is not visible, you can click the Open Chat button next to the Chat Trigger. When we send a message, we can follow the workflow execution in the n8n interface. Furthermore, we can inspect the Logs tab to see all the inputs and outputs generated by the different nodes in the workflow. This is useful for debugging and understanding how the workflow works. It will also show where the workflow is currently stuck if it is not working as expected.

Executed Workflow

P.S.: Don't forget to save your workflow and give it a meaningful name!

On your own:

Now it is your turn to play around with the workflow and explore further possibilities. Possible next steps could be:

  • Test with different models (currently, llama3.2:3b is a very small model, try better ones to see the difference)
  • Further improve the vector database utilization (e.g. try using tags, or other metadata)
  • Add more data to the vector database in order to improve the chatbot's knowledge
  • Try to split the data into smaller files and try to add tags and metadata to the files
  • Try to improve the system prompt of the AI Agent to make it more specific to the persona
  • Try to build more advanced workflows in n8n. Use the templates provided by n8n to get inspired and learn how to use different nodes and features.

Some remarks:

  • The provided example data is very simple and does not cover all aspects of a real-world scenario. It is meant to be a starting point for your own experiments and improvements.
  • The n8n workflows provided in this repository are not optimized for performance and may need further adjustments to work efficiently with larger datasets or more complex use cases.
  • The Ollama models used in this tutorial are not the most advanced ones available. They are meant to be a starting point for your own experiments and improvements. You can find more models on the Ollama website.
  • The Qdrant vector database is a powerful tool, but it requires some understanding of how to use it effectively. The provided example is a simple use case, and you may need to explore the Qdrant documentation to learn more about its capabilities and how to use it in more complex scenarios.
  • The n8n workflows provided in this repository are not production-ready and should not be used in a production environment without further testing and adjustments. They are meant to be a starting point for your own experiments and improvements.
  • You can exchange Ollama for any other LLM provider, such as OpenAI or Hugging Face. Just make sure to adjust the credentials and the model names accordingly in the .env file and in the n8n workflows.
  • The same applies to Qdrant: you can use any other vector database supported by n8n, such as Pinecone or Weaviate. Just make sure to adjust the credentials and the API endpoints accordingly in the .env file and in the n8n workflows.
  • There are multiple alternatives for n8n. You can also program your own workflows using Python or JavaScript (and others).

Troubleshooting

Apple Silicon MacBook Users

  • Running Ollama in Docker is not (really) supported on MacBooks with Apple Silicon chips and leads to very long loading times, since Ollama cannot use GPU acceleration there. Meaning: everything will be computed on your (comparatively slow) CPU. (some reading material)
  • The best workaround is to run Ollama natively on your machine, without Docker. This requires some additional setup; best check out the Ollama documentation for that, and see the sketch below.
  • Everything else in this tutorial should still work. Just make sure to adjust the Ollama URL in the .env file and in the n8n credentials accordingly: use http://host.docker.internal:11434 (Docker Desktop's alias for your host machine), since http://localhost:11434 from inside the n8n container would point at the container itself.
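A minimal sketch of the native setup on macOS (one option, assuming Homebrew is installed; see the Ollama documentation for alternatives):

# Install Ollama natively (Homebrew is one option)
brew install ollama

# Start the server and pull the models used in this tutorial
ollama serve &
ollama pull llama3.2:3b
ollama pull nomic-embed-text:v1.5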
