Scan HTML and other local files without sending any data outside your machine.
This is a simple LLM-powered document search system. It uses Nomic embeddings and the Qdrant vector database to store and search documents, then uses Ollama to generate answers. You can search through HTML documents.
QDRANT_URL=http://localhost:6334
EMBEDDINGS_API_URL=http://localhost:11434/api/embeddings
COUCHDB_URL=http://admin:password@localhost:5984
If you run those services in Docker, you can use the same variables. Put them in a .env file.
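The variables above can be read at startup with nothing but the standard library. A minimal sketch — the defaults mirror the values shown above, and the `Config` struct and `load_config` name are assumptions for illustration, not the project's actual code:

```rust
use std::env;

// Service endpoints; defaults match the example .env values above.
struct Config {
    qdrant_url: String,
    embeddings_api_url: String,
    couchdb_url: String,
}

fn load_config() -> Config {
    // Fall back to the local defaults when a variable is not set.
    let get = |key: &str, default: &str| env::var(key).unwrap_or_else(|_| default.to_string());
    Config {
        qdrant_url: get("QDRANT_URL", "http://localhost:6334"),
        embeddings_api_url: get("EMBEDDINGS_API_URL", "http://localhost:11434/api/embeddings"),
        couchdb_url: get("COUCHDB_URL", "http://admin:password@localhost:5984"),
    }
}

fn main() {
    let cfg = load_config();
    println!("Qdrant at {}", cfg.qdrant_url);
}
```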
The system works in several stages:
- Document Processing: documents are processed and stored in CouchDB
- Embedding Generation: the Nomic API generates embeddings for documents
- Vector Storage: embeddings are stored in Qdrant with metadata
- Question Answering:
  - generates an embedding for the question
  - finds similar documents using vector search
  - extracts relevant context
  - uses the LLM to generate precise answers
  - falls back to full document search in blocks if needed
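The retrieval steps above can be sketched in plain Rust. This is a simplified illustration, not the project's code: in the real system Qdrant performs the similarity ranking server-side, and the function names here are made up for the example.

```rust
// Cosine similarity between two embedding vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

// Indices of the `k` stored vectors most similar to `query`, best first —
// the same kind of ranking the vector database performs for us.
fn top_k(query: &[f32], stored: &[Vec<f32>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = stored
        .iter()
        .enumerate()
        .map(|(i, v)| (i, cosine_similarity(query, v)))
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}

// Split a document into fixed-size character blocks for the fallback
// full-document search; the block size is an arbitrary choice here.
fn split_blocks(text: &str, block_size: usize) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    chars.chunks(block_size).map(|c| c.iter().collect()).collect()
}

fn main() {
    let stored = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![0.7, 0.7]];
    let query = vec![1.0, 0.1];
    println!("best matches: {:?}", top_k(&query, &stored, 2));
    println!("blocks: {:?}", split_blocks("a long document...", 8));
}
```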
Run it with:

RUST_LOG="info" cargo run --release
MIT License