A lightweight RAG (Retrieval-Augmented Generation) system built from scratch without LangChain. Enables semantic search and intelligent question-answering over document collections using Claude AI and ChromaDB.
Coming Soon: Deploy link will be added after initial deployment
I built this while exploring production RAG patterns for enterprise applications. Every company is racing to unlock knowledge trapped in documents, and I wanted to understand the full stack, from chunking strategies to deployment, without relying on heavy frameworks like LangChain.
Key learnings:
- Chunking strategies significantly impact retrieval quality
- Source citation is critical for enterprise trust
- Direct API integration gives better control than abstraction layers
- Proper error handling matters more than perfect embeddings
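To make the first learning concrete, here is a minimal sketch of a sliding-window chunker with overlap (a hypothetical helper for illustration, not the repo's actual `chunking.py`; the defaults mirror the `CHUNK_SIZE`/`CHUNK_OVERLAP` settings used later):

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping character windows.

    The overlap ensures that content straddling a chunk boundary
    remains retrievable from at least one chunk.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

Smaller chunks improve precision but lose context; larger chunks do the opposite, which is why this parameter is worth tuning per corpus.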
What this demonstrates:
- Production-ready RAG from scratch (no LangChain)
- Custom chunking and retrieval pipeline
- Direct Claude API integration
- Clean, maintainable code patterns
- End-to-end deployment with Docker
- PDF Document Processing: Upload and index PDF documents
- Semantic Search: Find relevant information using natural language
- AI-Powered Answers: Get accurate responses backed by your documents
- Source Citations: See exactly which documents and pages informed each answer
- Relevance Scoring: Understand confidence levels for retrieved information
- Fast Retrieval: Optimized vector search with ChromaDB
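Under the hood, top-K semantic search reduces to nearest-neighbor ranking over embedding vectors. ChromaDB handles this internally; the dependency-free sketch below only illustrates the idea with cosine similarity, not ChromaDB's actual implementation:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 4) -> list[int]:
    """Return indices of the k document vectors most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]
```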
Custom RAG Pipeline (No LangChain)
```
            ┌────────────┐
            │ User Query │
            └─────┬──────┘
                  │
                  ▼
┌─────────────────────────────────────┐
│          Streamlit UI Layer         │
└─────────────────┬───────────────────┘
                  │
                  ▼
┌─────────────────────────────────────┐
│         Custom RAG Pipeline         │
│                                     │
│  ┌─────────┐      ┌─────────────┐   │
│  │  PyPDF  │─────▶│   Custom    │   │
│  │  Loader │      │   Chunker   │   │
│  └─────────┘      └──────┬──────┘   │
│                          │          │
│                          ▼          │
│                 ┌────────────────┐  │
│                 │    ChromaDB    │  │
│                 │ (embeddings +  │  │
│                 │  vector store) │  │
│                 └────────┬───────┘  │
│                          │          │
│                          ▼          │
│                  ┌───────────────┐  │
│                  │   Semantic    │  │
│                  │ Search (Top-K)│  │
│                  └───────┬───────┘  │
│                          │          │
│                          ▼          │
│                  ┌───────────────┐  │
│                  │ Direct Claude │  │
│                  │      API      │  │
│                  └───────┬───────┘  │
└──────────────────────────┼──────────┘
                           │
                           ▼
                 ┌──────────────────┐
                 │ Answer + Sources │
                 └──────────────────┘
```
| Component | Technology | Purpose |
|---|---|---|
| LLM | Claude 3.5 Sonnet | Response generation |
| Vector Store | ChromaDB | Semantic search & embeddings |
| Document Processing | PyPDF | PDF text extraction |
| UI | Streamlit | Web interface |
| Language | Python 3.11+ | Core implementation |
Note: Built without LangChain. Direct API integration keeps dependencies minimal and gives full control over the pipeline.
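As an illustration of what "direct API integration" means here, the sketch below assembles retrieved chunks into a grounded prompt and calls the official `anthropic` SDK. The prompt format and helper names are illustrative assumptions, not the repo's actual code:

```python
import os

def build_rag_prompt(question: str, chunks: list[dict]) -> str:
    """Assemble retrieved chunks into a citation-friendly prompt (illustrative format)."""
    context = "\n\n".join(
        f"[Source: {c['source']}, p.{c['page']}]\n{c['text']}" for c in chunks
    )
    return (
        "Answer the question using ONLY the context below. "
        "Cite sources as [Source: file, p.N].\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

def ask_claude(question: str, chunks: list[dict]) -> str:
    """Send the grounded prompt to Claude via the Messages API."""
    import anthropic  # pip install anthropic

    client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        messages=[{"role": "user", "content": build_rag_prompt(question, chunks)}],
    )
    return response.content[0].text
```

Keeping prompt assembly separate from the API call makes the grounding logic easy to unit-test without network access.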
- Python 3.11 or higher
- Anthropic API key (from the Anthropic Console)
- (Optional) Voyage AI or OpenAI API key for embeddings
1. Clone the repository

   ```bash
   git clone https://github.com/yourusername/enterprise-doc-qa.git
   cd enterprise-doc-qa
   ```

2. Create a virtual environment

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install dependencies

   ```bash
   pip install -r requirements.txt
   ```

4. Configure environment variables

   ```bash
   cp .env.example .env
   # Edit .env and add your API keys
   ```

5. Run the application

   ```bash
   streamlit run src/ui/app.py
   ```

6. Open your browser and navigate to http://localhost:8501
```bash
# Build the image
docker build -t doc-qa-system .

# Run the container
docker run -p 8501:8501 --env-file .env doc-qa-system
```

Once you've uploaded documents, try questions like:
- "What are the key terms of the contract?"
- "Summarize the main findings from the research report"
- "What security measures are mentioned?"
- "Compare the pricing models discussed"
- "What are the project timelines?"
For best results:
- Upload well-structured PDFs (avoid scanned images without OCR)
- Keep documents focused on a specific domain
- Ask specific questions rather than broad queries
- Review source citations to verify accuracy
```
enterprise-doc-qa/
├── src/
│   ├── components/
│   │   ├── document_loader.py   # PDF processing
│   │   ├── chunking.py          # Text splitting logic
│   │   ├── embeddings.py        # Vector generation
│   │   └── retrieval.py         # RAG chain implementation
│   └── ui/
│       └── app.py               # Streamlit interface
├── tests/
│   ├── test_chunking.py
│   └── test_retrieval.py
├── data/                        # Sample documents (gitignored)
├── docs/                        # Additional documentation
├── .env.example                 # Environment template
├── .gitignore
├── requirements.txt
├── Dockerfile
└── README.md
```
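As a sketch of what `document_loader.py` might contain (hypothetical; the real module may differ), PDF text can be extracted per page with pypdf's `PdfReader`, keeping page numbers for source citations:

```python
def load_pdf(path: str) -> list[dict]:
    """Extract text per page, keeping page numbers for source citations."""
    from pypdf import PdfReader  # pip install pypdf

    reader = PdfReader(path)
    pages = []
    for number, page in enumerate(reader.pages, start=1):
        text = normalize(page.extract_text() or "")
        if text:  # skip blank or image-only pages
            pages.append({"source": path, "page": number, "text": text})
    return pages

def normalize(text: str) -> str:
    """Collapse the ragged whitespace PDF extraction tends to leave behind."""
    return " ".join(text.split())
```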
Key environment variables in .env:
```bash
# Required
ANTHROPIC_API_KEY=your_claude_api_key

# Optional (for embeddings)
VOYAGE_API_KEY=your_voyage_key
OPENAI_API_KEY=your_openai_key

# Tuning parameters
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
TOP_K_RESULTS=4
```

- PDF only (no DOCX, TXT, HTML yet)
- No multi-document comparison
- Chat history not persisted across sessions
- English language only
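The tuning parameters in `.env` above might be read with safe fallbacks like this (a sketch; the repo's actual settings loader may differ):

```python
import os

def load_settings(env=os.environ) -> dict:
    """Read tuning parameters from the environment, falling back to the documented defaults."""
    return {
        "chunk_size": int(env.get("CHUNK_SIZE", 1000)),
        "chunk_overlap": int(env.get("CHUNK_OVERLAP", 200)),
        "top_k": int(env.get("TOP_K_RESULTS", 4)),
    }
```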
- Add support for DOCX, TXT, Markdown files
- Implement conversation memory
- Add authentication and multi-user support
- Hybrid search (keyword + semantic)
- Export Q&A history
- Advanced chunking strategies (semantic splitting)
- Custom embedding fine-tuning
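For the hybrid-search roadmap item, one common approach is to run keyword and semantic retrieval separately and fuse the ranked lists. A hypothetical sketch using reciprocal rank fusion (one technique among several, not a committed design):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of doc IDs into one.

    Each document scores 1 / (k + rank + 1) per list it appears in;
    k=60 is the conventional smoothing constant for RRF.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF is attractive here because it needs no score normalization between the keyword and vector backends.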
```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=src tests/

# Run specific test file
pytest tests/test_chunking.py
```

Benchmarks (on M1 Mac, 100-page PDF):
- Document processing: ~15 seconds
- Query response time: ~2-3 seconds
- Embedding generation: ~5 seconds (cached afterward)
Contributions welcome! Please:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
MIT License - see LICENSE file for details
- Powered by Anthropic Claude
- Vector search by ChromaDB
- PDF processing by PyPDF
- UI framework by Streamlit
Sandeep Uppalapati
- LinkedIn: linkedin.com/in/sandeep-uppalapati
- GitHub: @sandeepuppalapati
- Project: github.com/sandeepuppalapati/enterprise-doc-qa
Note: This is a demonstration project. For production use, add proper authentication, rate limiting, and security measures.



