A simple implementation of a Retrieval-Augmented Generation system using OpenAI's GPT-4o, sentence transformers, and FAISS for efficient similarity search.
This project demonstrates how to build a RAG system that:
- Splits text into sentences for more granular retrieval
- Creates embeddings using SentenceTransformers
- Stores embeddings in a FAISS index for fast similarity search
- Retrieves relevant context for user queries
- Generates responses using OpenAI's GPT-4o with proper citations
- Environment-based configuration - API keys stored securely in a `.env` file
- Sentence-level retrieval - fine-grained context retrieval using NLTK tokenization
- Efficient similarity search - FAISS indexing for fast vector search
- Citation support - generated responses include source citations
- Confidence scoring - AI-generated confidence scores for answers
- Modern OpenAI API - compatible with OpenAI Python library v1.0+
- Python 3.7+
- OpenAI API key
- Jupyter Notebook or JupyterLab
- Clone or download this repository.
- Install the required packages:

  ```shell
  pip install openai sentence-transformers faiss-cpu nltk tiktoken python-dotenv
  ```

- Set up your environment variables by creating a `.env` file in the project directory:

  ```
  OPENAI_API_KEY=sk-your-actual-api-key-here
  ```

  Important: replace `sk-your-actual-api-key-here` with your actual OpenAI API key.
- Open the Jupyter notebook:

  ```shell
  jupyter notebook RAG.ipynb
  ```

- Execute the cells in order:
  - Cell 1: Install dependencies
  - Cell 2: Import libraries and load environment variables
  - Cell 3: Process text (split, embed, index)
  - Cell 4: Define the retrieval function
  - Cell 5: Define the answer generation function
  - Cell 6: Test with an example query
```python
query = "What is RAG and how does it reduce hallucinations?"
print(generate_answer(query))
```

Expected output:

> Retrieval-Augmented Generation (RAG) is a method that improves large language models by allowing them to retrieve information from external documents [1]. It reduces hallucinations by first looking up relevant information and then generating a response using that retrieved context [3]. This approach makes the answers more fact-based [2].
>
> Confidence Score: 1
- Tokenization: Uses NLTK's sentence tokenizer to split documents into sentences
- Embedding: Converts each sentence to a vector using the `all-MiniLM-L6-v2` model
- FAISS Index: Creates an efficient L2 distance-based index for similarity search
- Storage: Embeddings are stored in memory for fast retrieval
- Query Embedding: User queries are converted to the same vector space
- Similarity Search: FAISS finds the most similar sentences
- Ranking: Results are ranked by L2 distance (nearest vectors first)
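Conceptually, the search step is just a nearest-neighbour lookup in the embedding space. A dependency-light sketch with toy 2-D vectors (the real vectors come from the embedder and have 384 dimensions):

```python
import numpy as np

# Toy "sentence embeddings"; index 1 matches the query exactly,
# index 2 is a near-duplicate, index 0 is unrelated.
corpus_vectors = np.array([[0.0, 1.0], [1.0, 0.0], [0.9, 0.1]], dtype="float32")
query_vector = np.array([1.0, 0.0], dtype="float32")

# L2 distance from the query to every indexed vector
distances = np.linalg.norm(corpus_vectors - query_vector, axis=1)

# Rank: smallest distance first (this is what IndexFlatL2.search returns)
top_k = 2
ranked = np.argsort(distances)[:top_k]
print(ranked.tolist())  # -> [1, 2]: exact match first, then the near-duplicate
```

FAISS performs exactly this computation, just with optimized batched kernels instead of a NumPy one-liner.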
- Context Assembly: Retrieved sentences are formatted with citation numbers
- Prompt Engineering: A structured prompt guides the AI to cite sources and provide a confidence score
- Response Generation: OpenAI GPT-4o generates the final answer
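The generation steps above might look like the following sketch. `format_context` mirrors the notebook's function of the same name; `build_prompt` and the exact prompt wording are illustrative, and the chat call uses the OpenAI v1.0+ client:

```python
def format_context(snippets):
    """Number each retrieved sentence so the model can cite it as [n]."""
    return "\n".join(f"[{i}] {s}" for i, s in enumerate(snippets, start=1))


def build_prompt(query, snippets):
    """Assemble a structured prompt that asks for citations and a confidence score."""
    return (
        "Answer the question using ONLY the context below. "
        "Cite sources as [n] and end with a confidence score from 0 to 1.\n\n"
        f"Context:\n{format_context(snippets)}\n\nQuestion: {query}"
    )


def generate_answer_sketch(query, snippets):
    """Send the assembled prompt to GPT-4o and return the answer text."""
    from openai import OpenAI  # v1.0+ client

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": build_prompt(query, snippets)}],
    )
    return response.choices[0].message.content
```

Because the context is numbered before it reaches the model, the `[n]` citations in the answer can be traced straight back to the retrieved sentences.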
```
RAG/
├── RAG.ipynb          # Main Jupyter notebook with implementation
├── .env               # Environment variables (API keys)
├── README.md          # This file
└── requirements.txt   # Python dependencies (optional)
```
- `openai` - OpenAI API client for GPT-4o
- `sentence-transformers` - for creating sentence embeddings
- `faiss-cpu` - efficient similarity search and clustering
- `nltk` - natural language processing toolkit
- `python-dotenv` - environment variable management
- `tiktoken` - OpenAI's tokenization library
- `retrieve_sentences(query, top_k=4)` - retrieves the most relevant sentences
- `format_context(snippets)` - formats retrieved sentences with citations
- `generate_answer(query)` - generates the final answer with citations and a confidence score
To use a different embedding model:

```python
embedder = SentenceTransformer("your-preferred-model")
```

To retrieve more context:

```python
top_snippets = retrieve_sentences(query, top_k=6)  # Retrieve more context
```

Edit the prompt in the `generate_answer` function to change the response style or requirements.
- API keys are stored in the `.env` file (not in code)
- Add `.env` to your `.gitignore` file
- Never commit API keys to version control
- Use environment variables in production
- NLTK Data Missing

  `LookupError: Resource punkt_tab not found`

  Solution: The notebook automatically downloads the required NLTK data.

- OpenAI API Error

  `APIRemovedInV1: You tried to access openai.ChatCompletion`

  Solution: The code uses the new OpenAI v1.0+ API syntax.

- Missing API Key

  `ValueError: OPENAI_API_KEY not found in environment variables`

  Solution: Check your `.env` file and ensure the API key is correct.
- For larger documents, consider chunking strategies
- Use GPU-accelerated FAISS for better performance with large datasets
- Consider caching embeddings for frequently accessed documents
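As a starting point for the chunking tip, sentences can be grouped into overlapping windows before embedding, so that each chunk keeps some surrounding context. This helper is a hypothetical sketch, not part of the notebook:

```python
def chunk_sentences(sentences, size=3, overlap=1):
    """Group sentences into windows of `size`, overlapping by `overlap`."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(sentences), step):
        window = sentences[start:start + size]
        if window:
            chunks.append(" ".join(window))
        if start + size >= len(sentences):
            break
    return chunks


sents = ["S1.", "S2.", "S3.", "S4.", "S5."]
print(chunk_sentences(sents))  # -> ['S1. S2. S3.', 'S3. S4. S5.']
```

Each chunk would then be embedded and indexed in place of individual sentences, trading retrieval granularity for more coherent context.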
This project is open source and available under the MIT License.
Feel free to submit issues, feature requests, or pull requests to improve this RAG implementation.
- OpenAI for GPT-4o
- Sentence Transformers for embedding models
- FAISS for efficient similarity search
- NLTK for text processing