# LLM-RAG App

Welcome to the LLM-RAG App repository! This project uses ZenML to build robust pipelines and integrates OpenAI's GPT-4o for intelligent document analysis. The app lets users upload PDF files, retrieve relevant passages, and ask dynamic questions, including requests for summaries.
## Table of Contents

- Features
- Installation Guide
- Project Structure
- Pipeline Overview
- Components Breakdown
- Workflow Explanation
- Usage Instructions
- Troubleshooting
- Contributing
- License
## Features

- Upload and analyze PDF documents.
- Dynamic Q&A based on document content.
- Summarization of documents using GPT-4o.
- Built on a robust ZenML pipeline.
## Installation Guide

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/llm-rag-app.git
   cd llm-rag-app
   ```

2. Ensure you have Python 3.11+ installed.

3. Install dependencies with Poetry:

   ```bash
   pip install poetry
   poetry install
   poetry shell
   ```

4. Create a `.env` file in the root directory:

   ```env
   OPENAI_API_KEY=your_openai_api_key_here
   ```

5. Start the app:

   ```bash
   poetry run uvicorn app.main:app --reload
   ```

6. Visit http://127.0.0.1:8000/docs to confirm the server is running.
## Project Structure

```
llm-rag-app/
├── app/
│   ├── main.py          # FastAPI app entry point
│   ├── retriever.py     # Handles document retrieval with FAISS
│   ├── generator.py     # Generates answers using OpenAI GPT-4o
│   └── rag_pipeline.py  # ZenML pipeline orchestration
├── .env                 # API keys
├── README.md            # Project documentation
├── poetry.lock          # Poetry dependencies lock file
└── pyproject.toml       # Project dependencies and configurations
```
## Pipeline Overview

The application follows the RAG (Retrieval-Augmented Generation) pattern, orchestrated by ZenML:
- File Upload: User uploads a PDF file via the FastAPI endpoint.
- Retriever Step: Extracts the text, splits it into chunks, and indexes the chunks with FAISS.
- Context Combination: Merges relevant document chunks.
- Answer Generation: Uses OpenAI GPT-4o for generating responses.
- Response Delivery: Returns the response to the user.
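This flow maps onto a ZenML pipeline roughly as follows. This is a minimal sketch assuming ZenML's `@step`/`@pipeline` decorators; the step names match the breakdown below, but the signatures are illustrative, not the repository's actual code.

```python
# Minimal sketch of the ZenML wiring; step bodies are elided here.
from zenml import pipeline, step

@step
def retriever_step(pdf_path: str, question: str) -> list[str]:
    """Extract and split the PDF, then return the most relevant chunks."""
    ...  # see the retriever sketch under Components Breakdown

@step
def context_step(chunks: list[str]) -> str:
    """Merge the retrieved chunks into a single context string."""
    return "\n\n".join(chunks)

@step
def generation_step(context: str, question: str) -> str:
    """Ask GPT-4o to answer the question from the retrieved context."""
    ...  # see the generator sketch under Components Breakdown

@pipeline
def rag_pipeline(pdf_path: str, question: str):
    chunks = retriever_step(pdf_path=pdf_path, question=question)
    context = context_step(chunks=chunks)
    generation_step(context=context, question=question)
```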
## Components Breakdown

### `retriever.py`

- Extracts text from PDFs.
- Splits text into manageable chunks.
- Builds a FAISS index for efficient document retrieval.
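A condensed sketch of how such a retriever can work, assuming `pypdf` for extraction, the OpenAI embeddings API, and `faiss-cpu`; the chunking strategy and model names are illustrative, not necessarily what the repository uses:

```python
# Sketch of the retrieval indexing flow (assumed libraries: pypdf, faiss, openai).
import faiss
import numpy as np
from openai import OpenAI
from pypdf import PdfReader

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def build_index(pdf_path: str, chunk_size: int = 1000):
    # PDF reading: pull raw text out of every page.
    text = "".join(page.extract_text() or "" for page in PdfReader(pdf_path).pages)
    # Text splitting: naive fixed-size chunks for simplicity.
    chunks = [text[i : i + chunk_size] for i in range(0, len(text), chunk_size)]
    # Embedding generation: one vector per chunk.
    resp = client.embeddings.create(model="text-embedding-3-small", input=chunks)
    vectors = np.array([d.embedding for d in resp.data], dtype="float32")
    # FAISS indexing: exact L2 search over the chunk vectors.
    index = faiss.IndexFlatL2(vectors.shape[1])
    index.add(vectors)
    return index, chunks
```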
### `generator.py`

- Uses OpenAI's GPT-4o.
- Dynamically adjusts token limits.
- Handles continuation of responses to avoid truncation.
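One way to implement the continuation behavior, assuming the OpenAI Python SDK v1; the actual `generator.py` may compute token limits differently:

```python
# Sketch of continuation handling: keep asking the model to continue
# whenever a reply is cut off by the token limit.
from openai import OpenAI

client = OpenAI()

def generate_answer(context: str, question: str, max_tokens: int = 1024) -> str:
    messages = [
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]
    answer = ""
    while True:
        resp = client.chat.completions.create(
            model="gpt-4o", messages=messages, max_tokens=max_tokens
        )
        choice = resp.choices[0]
        answer += choice.message.content or ""
        # finish_reason == "length" means the reply hit the token limit,
        # so ask the model to pick up where it stopped.
        if choice.finish_reason != "length":
            return answer
        messages.append({"role": "assistant", "content": choice.message.content})
        messages.append({"role": "user", "content": "Continue."})
```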
### `rag_pipeline.py`

- Retriever Step: Loads document chunks.
- Context Step: Combines text chunks into a unified context.
- Generation Step: Passes the context and question to GPT-4o.
### `main.py`

- Provides API endpoints for file upload and question submission.
- Manages error handling and response formatting.
## Workflow Explanation

- Endpoint Triggered: `POST /ask` in `main.py`.
- Inputs:
  - PDF file: uploaded by the user.
  - Question: submitted through the form.
- Temporary File Creation: the uploaded PDF is saved temporarily on the server.
- Pipeline Trigger (see the sketch below):
  - A unique run ID is generated using `uuid`.
  - The `rag_pipeline` function is called with the PDF file path and the user's question as arguments.
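Put together, the handler might look like the sketch below. It assumes FastAPI's `UploadFile`/`Form` and a recent ZenML version where calling the pipeline returns the run; the temp-file location and parameter names are illustrative.

```python
# Sketch of the /ask handler in main.py (illustrative details).
import os
import uuid

from fastapi import FastAPI, Form, UploadFile

from app.rag_pipeline import rag_pipeline

app = FastAPI()

@app.post("/ask")
async def ask(file: UploadFile, question: str = Form(...)):
    # Temporary file creation: persist the upload so the pipeline can read it.
    run_id = uuid.uuid4().hex  # unique run ID
    tmp_path = f"/tmp/{run_id}.pdf"
    with open(tmp_path, "wb") as f:
        f.write(await file.read())
    try:
        # Pipeline trigger: run the ZenML pipeline with the path and question.
        run = rag_pipeline(pdf_path=tmp_path, question=question)
        answer = run.steps["generation_step"].output.load()
        return {"answer": answer}
    finally:
        # File cleanup: delete the temporary PDF.
        os.remove(tmp_path)
```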
The pipeline orchestrates three main steps:
- Retriever Step (`retriever.py`):
  - PDF Reading: extracts the text content from the uploaded PDF.
  - Text Splitting: breaks the text into smaller chunks for efficient processing.
  - Embedding Generation: converts text chunks into numerical vector representations (embeddings).
  - FAISS Indexing: creates a FAISS index for fast retrieval of the chunks most similar to the question.
- Context Combination Step (see the sketch after this list):
  - Query Processing: takes the user's question and searches the FAISS index for the most relevant text chunks.
  - Context Assembly: combines these chunks into a single context string for the language model.
- Generation Step (`generator.py`):
  - Prompt Creation: constructs a prompt from the retrieved context and the user's question.
  - Answer Generation: sends the prompt to OpenAI's GPT-4o model to generate a comprehensive, context-aware answer.
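The context-combination and prompt-creation steps can be sketched as below, reusing the `index` and `chunks` produced by the retriever sketch earlier; `k` and the embedding model are illustrative choices:

```python
# Sketch of query processing, context assembly, and prompt creation.
import numpy as np
from openai import OpenAI

client = OpenAI()

def build_prompt(index, chunks: list[str], question: str, k: int = 4) -> str:
    # Query processing: embed the question and search the FAISS index.
    q = client.embeddings.create(
        model="text-embedding-3-small", input=[question]
    ).data[0].embedding
    _, ids = index.search(np.array([q], dtype="float32"), k)
    # Context assembly: merge the top-k chunks into one context string.
    context = "\n\n".join(chunks[i] for i in ids[0])
    # Prompt creation: pair the context with the user's question.
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```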
- Pipeline Monitoring: the app waits for the pipeline to complete and fetches the output of the `generation_step`.
- Artifact Loading: the generated answer is retrieved from the output artifact using `BuiltInMaterializer` (see the sketch below).
- Response: the generated answer is returned to the user as a JSON response.
- File Cleanup: the temporary PDF file is deleted from the server to save space.
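Fetching the step output can also be done through the ZenML client, roughly as below; the exact API depends on your ZenML version, so treat this as a sketch:

```python
# Sketch of loading the generation_step output from a finished run.
from zenml.client import Client

def fetch_answer(run_id: str) -> str:
    run = Client().get_pipeline_run(run_id)
    # .load() materializes the artifact back into a Python object via its
    # materializer (BuiltInMaterializer for plain strings).
    return run.steps["generation_step"].output.load()
```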
In short:

- `main.py`: handles the request, saves the PDF, and triggers the pipeline.
- `rag_pipeline.py`: orchestrates the pipeline steps.
- `retriever.py`: extracts, splits, and indexes PDF content with FAISS.
- `generator.py`: generates an answer using the retrieved content and GPT-4o.
- `main.py`: retrieves the final output, sends it back to the user, and cleans up.
## Usage Instructions

- Access API Docs: http://127.0.0.1:8000/docs
- Upload a PDF: use the `/ask` endpoint to upload a PDF.
- Ask a Question: enter a question like:
  - "Summarize this document."
  - "What are the key findings of the study?"
  - "Which species migrates earlier?"
- Receive Response: the app processes the document and returns the answer (see the example below).
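For example, with the server running locally, a request from Python might look like this (assumes the `requests` package and a local file named `paper.pdf`):

```python
# Example request against the /ask endpoint.
import requests

with open("paper.pdf", "rb") as f:
    resp = requests.post(
        "http://127.0.0.1:8000/ask",
        files={"file": ("paper.pdf", f, "application/pdf")},
        data={"question": "Summarize this document."},
    )
print(resp.json())
```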
## Troubleshooting

- Common Errors:
  - `Invalid 'max_tokens'`: adjust the token calculation logic.
  - `Server Error: ArtifactVersionResponse`: ensure ZenML artifact loading is correct.
- Debugging Tips:
  - Use `traceback` for detailed error logs (see the sketch below).
  - Check the `.env` file for API key issues.
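A minimal pattern for surfacing full stack traces while debugging, illustrative rather than code from the repository:

```python
# Wrap a pipeline call so the full stack trace is printed before re-raising.
import traceback

def safe_call(fn, *args, **kwargs):
    try:
        return fn(*args, **kwargs)
    except Exception:
        print(traceback.format_exc())  # full stack trace for debugging
        raise
```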
## Contributing

- Fork the repository.
- Create a new branch:

  ```bash
  git checkout -b feature-branch
  ```

- Make changes and commit:

  ```bash
  git commit -m "Add new feature"
  ```

- Push and create a pull request.
## License

This project is licensed under the MIT License.
Built with ❤️ using FastAPI, ZenML, and OpenAI GPT-4o.