This application answers queries with a local language model and a vector database. It generates multiple versions of a user query, retrieves relevant documents for each version, and answers based on the combined retrieved context.
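At a high level, the multi-query retrieval flow looks roughly like the sketch below. This is a minimal illustration, not the app's actual code: it assumes the `ollama` and `chromadb` Python packages, and the helper names (`rephrase_query`, `retrieve_context`) are hypothetical.

```python
import ollama
import chromadb

# Hypothetical helper: ask the LLM for alternative phrasings of the user's question.
def rephrase_query(query: str, n: int = 3) -> list[str]:
    prompt = f"Rewrite the following question in {n} different ways, one per line:\n{query}"
    reply = ollama.chat(model="mistral", messages=[{"role": "user", "content": prompt}])
    variants = [line.strip() for line in reply["message"]["content"].splitlines() if line.strip()]
    return [query] + variants[:n]

# Hypothetical helper: retrieve documents for every query variant and merge the results.
def retrieve_context(query: str) -> list[str]:
    client = chromadb.PersistentClient(path="chroma")
    collection = client.get_or_create_collection(name="local-rag")
    documents: list[str] = []
    for q in rephrase_query(query):
        embedding = ollama.embeddings(model="nomic-embed-text", prompt=q)["embedding"]
        result = collection.query(query_embeddings=[embedding], n_results=3)
        documents.extend(result["documents"][0])
    return list(dict.fromkeys(documents))  # de-duplicate while keeping order
```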
- Python 3: Ensure you have Python 3.x installed.
- Ollama: This app requires Ollama to be installed and running locally. Follow the Ollama installation guide to set it up.
- Clone the repository:
$ git clone https://github.com/firstpersoncode/local-rag.git
$ cd local-rag
- Create a virtual environment:
$ python -m venv venv
$ source venv/bin/activate
# For Windows users
# venv\Scripts\activate
- Install dependencies:
$ pip install -r requirements.txt
- Run Ollama: Ensure Ollama is installed and running locally. Refer to the Ollama documentation for setup instructions.
- Start Ollama:
$ ollama serve
- Install the LLM model:
$ ollama pull mistral
- Install the text embedding model:
$ ollama pull nomic-embed-text
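If you prefer to verify the pulls from Python, both models can be exercised in a few lines. This is an optional sanity check and assumes the `ollama` Python package, which is not part of the setup steps above.

```python
import ollama

# Check that the embedding model responds and inspect the vector size.
embedding = ollama.embeddings(model="nomic-embed-text", prompt="hello world")
print(len(embedding["embedding"]))

# Check that the LLM responds.
reply = ollama.chat(model="mistral", messages=[{"role": "user", "content": "Say hi in one word."}])
print(reply["message"]["content"])
```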
- Set up environment variables: Edit the .env.sample file and save it as .env
TEMP_FOLDER = './_temp'
CHROMA_PATH = "chroma"
COLLECTION_NAME = 'local-rag'
LLM_MODEL = 'mistral' # replace with the model you want to use.
TEXT_EMBEDDING_MODEL = 'nomic-embed-text'
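A rough sketch of how these variables are typically read at startup, assuming `python-dotenv` is among the dependencies (this is illustrative, not the app's exact code):

```python
import os
from dotenv import load_dotenv

# Load the .env file and fall back to the documented defaults.
load_dotenv()

TEMP_FOLDER = os.getenv("TEMP_FOLDER", "./_temp")
CHROMA_PATH = os.getenv("CHROMA_PATH", "chroma")
COLLECTION_NAME = os.getenv("COLLECTION_NAME", "local-rag")
LLM_MODEL = os.getenv("LLM_MODEL", "mistral")
TEXT_EMBEDDING_MODEL = os.getenv("TEXT_EMBEDDING_MODEL", "nomic-embed-text")
```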
- Run the app:
$ python app.py
Load the documents into the database.
$ curl --request POST \
--url http://localhost:8080/embed \
--header 'Content-Type: multipart/form-data' \
--form file=@/path/to/pdf/document.pdf
# Response
{
"message": "File embedded successfully"
}
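The same upload from Python, using the `requests` library (an assumption for illustration, not part of the app itself):

```python
import requests

# Upload a PDF to the embed endpoint (the path is illustrative).
with open("/path/to/pdf/document.pdf", "rb") as f:
    response = requests.post("http://localhost:8080/embed", files={"file": f})
print(response.json())  # {"message": "File embedded successfully"}
```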
Ask questions about the documents.
$ curl --request POST \
--url http://localhost:8080/query \
--header 'Content-Type: application/json' \
--data '{ "query": "What is the document about?" }'
# Response
{
"message": "The document is about...",
}
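And the equivalent query from Python, again assuming the `requests` library:

```python
import requests

# Ask a question about the embedded documents.
response = requests.post(
    "http://localhost:8080/query",
    json={"query": "What is the document about?"},
)
print(response.json()["message"])
```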
This app leverages a language model and a vector database to provide enhanced query handling capabilities. Ensure Ollama is running locally and follow the setup instructions to get started.