# LLM-RAG App

Welcome to the LLM-RAG App repository! This project uses ZenML to build robust pipelines and integrates OpenAI's GPT-4o for intelligent document analysis. The app lets users upload PDF files, retrieve relevant passages, and ask dynamic questions, including requests for summaries.
## Table of Contents

- Features
- Installation Guide
- Project Structure
- Pipeline Overview
- Components Breakdown
- Workflow Explanation
- Usage Instructions
- Troubleshooting
- Contributing
- License
## Features

- Upload and analyze PDF documents.
- Dynamic Q&A based on document content.
- Summarization of documents using GPT-4o.
- Built on a robust ZenML pipeline.
## Installation Guide

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/llm-rag-app.git
   cd llm-rag-app
   ```

2. Ensure you have Python 3.11+ installed.

3. Install dependencies with Poetry:

   ```bash
   pip install poetry
   poetry install
   poetry shell
   ```

4. Create a `.env` file in the root directory:

   ```env
   OPENAI_API_KEY=your_openai_api_key_here
   ```

5. Start the app:

   ```bash
   poetry run uvicorn app.main:app --reload
   ```

6. Visit http://127.0.0.1:8000/docs to confirm the server is running.
## Project Structure

```
llm-rag-app/
├── app/
│   ├── main.py          # FastAPI app entry point
│   ├── retriever.py     # Handles document retrieval with FAISS
│   ├── generator.py     # Generates answers using OpenAI GPT-4o
│   └── rag_pipeline.py  # ZenML pipeline orchestration
├── .env                 # API keys
├── README.md            # Project documentation
├── poetry.lock          # Poetry dependencies lock file
└── pyproject.toml       # Project dependencies and configurations
```
## Pipeline Overview

The application follows the RAG (Retrieval-Augmented Generation) pattern, orchestrated by ZenML:
- File Upload: User uploads a PDF file via the FastAPI endpoint.
- Retriever Step: Extracts the text, splits it into chunks, and indexes the chunks with FAISS.
- Context Combination: Merges relevant document chunks.
- Answer Generation: Uses OpenAI GPT-4o for generating responses.
- Response Delivery: Returns the response to the user.
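This flow maps onto a ZenML pipeline roughly as follows. This is a minimal sketch assuming ZenML's `@step`/`@pipeline` decorators; the step names match the breakdown below, but the signatures are illustrative, not the repository's actual code.

```python
# Minimal sketch of the ZenML wiring; step bodies are elided here.
from zenml import pipeline, step

@step
def retriever_step(pdf_path: str, question: str) -> list[str]:
    """Extract and split the PDF, then return the most relevant chunks."""
    ...  # see the retriever sketch under Components Breakdown

@step
def context_step(chunks: list[str]) -> str:
    """Merge the retrieved chunks into a single context string."""
    return "\n\n".join(chunks)

@step
def generation_step(context: str, question: str) -> str:
    """Ask GPT-4o to answer the question from the retrieved context."""
    ...  # see the generator sketch under Components Breakdown

@pipeline
def rag_pipeline(pdf_path: str, question: str):
    chunks = retriever_step(pdf_path=pdf_path, question=question)
    context = context_step(chunks=chunks)
    generation_step(context=context, question=question)
```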
## Components Breakdown

### `retriever.py`

- Extracts text from PDFs.
- Splits text into manageable chunks.
- Builds a FAISS index for efficient document retrieval.
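A condensed sketch of how such a retriever can work, assuming `pypdf` for extraction, the OpenAI embeddings API, and `faiss-cpu`; the chunking strategy and model names are illustrative, not necessarily what the repository uses:

```python
# Sketch of the retrieval indexing flow (assumed libraries: pypdf, faiss, openai).
import faiss
import numpy as np
from openai import OpenAI
from pypdf import PdfReader

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def build_index(pdf_path: str, chunk_size: int = 1000):
    # PDF reading: pull raw text out of every page.
    text = "".join(page.extract_text() or "" for page in PdfReader(pdf_path).pages)
    # Text splitting: naive fixed-size chunks for simplicity.
    chunks = [text[i : i + chunk_size] for i in range(0, len(text), chunk_size)]
    # Embedding generation: one vector per chunk.
    resp = client.embeddings.create(model="text-embedding-3-small", input=chunks)
    vectors = np.array([d.embedding for d in resp.data], dtype="float32")
    # FAISS indexing: exact L2 search over the chunk vectors.
    index = faiss.IndexFlatL2(vectors.shape[1])
    index.add(vectors)
    return index, chunks
```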
### `generator.py`

- Uses OpenAI's GPT-4o.
- Dynamically adjusts token limits.
- Handles continuation of responses to avoid truncation.
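One way to implement the continuation behavior, assuming the OpenAI Python SDK v1; the actual `generator.py` may compute token limits differently:

```python
# Sketch of continuation handling: keep asking the model to continue
# whenever a reply is cut off by the token limit.
from openai import OpenAI

client = OpenAI()

def generate_answer(context: str, question: str, max_tokens: int = 1024) -> str:
    messages = [
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]
    answer = ""
    while True:
        resp = client.chat.completions.create(
            model="gpt-4o", messages=messages, max_tokens=max_tokens
        )
        choice = resp.choices[0]
        answer += choice.message.content or ""
        # finish_reason == "length" means the reply hit the token limit,
        # so ask the model to pick up where it stopped.
        if choice.finish_reason != "length":
            return answer
        messages.append({"role": "assistant", "content": choice.message.content})
        messages.append({"role": "user", "content": "Continue."})
```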
### `rag_pipeline.py`

- Retriever Step: Loads document chunks.
- Context Step: Combines text chunks into a unified context.
- Generation Step: Passes the context and question to GPT-4o.
### `main.py`

- Provides API endpoints for file upload and question submission.
- Manages error handling and response formatting.
## Workflow Explanation

- Endpoint Triggered: `POST /ask` in `main.py`.
- Inputs:
  - PDF file: uploaded by the user.
  - Question: submitted through the form.
- Temporary File Creation: the uploaded PDF is saved temporarily on the server.
- Pipeline Trigger (see the sketch below):
  - A unique run ID is generated using `uuid`.
  - The `rag_pipeline` function is called with the PDF file path and the user's question as arguments.
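Put together, the handler might look like the sketch below. It assumes FastAPI's `UploadFile`/`Form` and a recent ZenML version where calling the pipeline returns the run; the temp-file location and parameter names are illustrative.

```python
# Sketch of the /ask handler in main.py (illustrative details).
import os
import uuid

from fastapi import FastAPI, Form, UploadFile

from app.rag_pipeline import rag_pipeline

app = FastAPI()

@app.post("/ask")
async def ask(file: UploadFile, question: str = Form(...)):
    # Temporary file creation: persist the upload so the pipeline can read it.
    run_id = uuid.uuid4().hex  # unique run ID
    tmp_path = f"/tmp/{run_id}.pdf"
    with open(tmp_path, "wb") as f:
        f.write(await file.read())
    try:
        # Pipeline trigger: run the ZenML pipeline with the path and question.
        run = rag_pipeline(pdf_path=tmp_path, question=question)
        answer = run.steps["generation_step"].output.load()
        return {"answer": answer}
    finally:
        # File cleanup: delete the temporary PDF.
        os.remove(tmp_path)
```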
The pipeline orchestrates three main steps:
- Retriever Step (`retriever.py`):
  - PDF Reading: extracts the text content from the uploaded PDF.
  - Text Splitting: breaks the text into smaller chunks for efficient processing.
  - Embedding Generation: converts text chunks into numerical vector representations (embeddings).
  - FAISS Indexing: creates a FAISS index for fast retrieval of the chunks most similar to the question.
- Context Combination Step (see the sketch after this list):
  - Query Processing: takes the user's question and searches the FAISS index for the most relevant text chunks.
  - Context Assembly: combines these chunks into a single context string for the language model.
- Generation Step (`generator.py`):
  - Prompt Creation: constructs a prompt from the retrieved context and the user's question.
  - Answer Generation: sends the prompt to OpenAI's GPT-4o model to generate a comprehensive, context-aware answer.
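The context-combination and prompt-creation steps can be sketched as below, reusing the `index` and `chunks` produced by the retriever sketch earlier; `k` and the embedding model are illustrative choices:

```python
# Sketch of query processing, context assembly, and prompt creation.
import numpy as np
from openai import OpenAI

client = OpenAI()

def build_prompt(index, chunks: list[str], question: str, k: int = 4) -> str:
    # Query processing: embed the question and search the FAISS index.
    q = client.embeddings.create(
        model="text-embedding-3-small", input=[question]
    ).data[0].embedding
    _, ids = index.search(np.array([q], dtype="float32"), k)
    # Context assembly: merge the top-k chunks into one context string.
    context = "\n\n".join(chunks[i] for i in ids[0])
    # Prompt creation: pair the context with the user's question.
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```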
- Pipeline Monitoring: the app waits for the pipeline to complete and fetches the output of the `generation_step`.
- Artifact Loading: the generated answer is retrieved from the output artifact using `BuiltInMaterializer` (see the sketch below).
- Response: the generated answer is returned to the user as a JSON response.
- File Cleanup: the temporary PDF file is deleted from the server to save space.
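Fetching the step output can also be done through the ZenML client, roughly as below; the exact API depends on your ZenML version, so treat this as a sketch:

```python
# Sketch of loading the generation_step output from a finished run.
from zenml.client import Client

def fetch_answer(run_id: str) -> str:
    run = Client().get_pipeline_run(run_id)
    # .load() materializes the artifact back into a Python object via its
    # materializer (BuiltInMaterializer for plain strings).
    return run.steps["generation_step"].output.load()
```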
In short:

- `main.py`: handles the request, saves the PDF, and triggers the pipeline.
- `rag_pipeline.py`: orchestrates the pipeline steps.
- `retriever.py`: extracts, splits, and indexes PDF content with FAISS.
- `generator.py`: generates an answer using the retrieved content and GPT-4o.
- `main.py`: retrieves the final output, sends it back to the user, and cleans up.
## Usage Instructions

- Access API Docs: http://127.0.0.1:8000/docs
- Upload a PDF: use the `/ask` endpoint to upload a PDF.
- Ask a Question: enter a question like:
  - "Summarize this document."
  - "What are the key findings of the study?"
  - "Which species migrates earlier?"
- Receive Response: the app processes the document and returns the answer (see the example below).
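For example, with the server running locally, a request from Python might look like this (assumes the `requests` package and a local file named `paper.pdf`):

```python
# Example request against the /ask endpoint.
import requests

with open("paper.pdf", "rb") as f:
    resp = requests.post(
        "http://127.0.0.1:8000/ask",
        files={"file": ("paper.pdf", f, "application/pdf")},
        data={"question": "Summarize this document."},
    )
print(resp.json())
```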
## Troubleshooting

- Common Errors:
  - `Invalid 'max_tokens'`: adjust the token calculation logic.
  - `Server Error: ArtifactVersionResponse`: ensure ZenML artifact loading is correct.
- Debugging Tips:
  - Use `traceback` for detailed error logs (see the sketch below).
  - Check the `.env` file for API key issues.
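A minimal pattern for surfacing full stack traces while debugging, illustrative rather than code from the repository:

```python
# Wrap a pipeline call so the full stack trace is printed before re-raising.
import traceback

def safe_call(fn, *args, **kwargs):
    try:
        return fn(*args, **kwargs)
    except Exception:
        print(traceback.format_exc())  # full stack trace for debugging
        raise
```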
## Contributing

- Fork the repository.
- Create a new branch:

  ```bash
  git checkout -b feature-branch
  ```

- Make changes and commit:

  ```bash
  git commit -m "Add new feature"
  ```

- Push and create a pull request.
## License

This project is licensed under the MIT License.
Built with ❤️ using FastAPI, ZenML, and OpenAI GPT-4o.