GitHub - Khadka-Bishal/DeepRecall: A RAG platform with an event-driven AWS serverless ingestion pipeline and a hybrid retrieval engine

DeepRecall is a hybrid Retrieval-Augmented Generation (RAG) system that uses AWS Lambda for document processing. It separates the "Write Path" (heavy lifting) from the "Read Path" (latency-sensitive chat).

Technical Overview

Hybrid Search: Combines BM25 (keyword) and Dense Vector search with Reciprocal Rank Fusion (RRF).
Architecture:
- Ingestion (Write): Event-driven AWS Pipeline (S3 → SQS → Lambda → Pinecone).
- Retrieval (Read): FastAPI backend querying Pinecone directly.
Infrastructure as Code: Fully provisioned AWS environment (IAM, S3, Lambda, SQS) using Terraform modules.
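
As a rough illustration of the fusion step named above, here is a minimal Reciprocal Rank Fusion sketch. It is not taken from the DeepRecall codebase: the function name, the sample document IDs, and the constant `k=60` (a commonly used default) are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties by ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


bm25_hits = ["doc_a", "doc_b", "doc_c"]    # hypothetical keyword results
vector_hits = ["doc_b", "doc_d", "doc_a"]  # hypothetical dense results
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Documents that appear near the top of both lists (`doc_b` and `doc_a` here) end up ahead of documents that only one retriever returned, which is the main reason to fuse by rank rather than by raw score.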

Demo

Demo.mov

System Architecture

High-Level Architecture

The system spans two tiers: a client and backend tier (React frontend, FastAPI backend, Pinecone) and an AWS serverless tier (S3, SQS, Lambda) provisioned with Terraform.

graph TD
    subgraph "Client & Backend Tier"
        direction LR
        C["React + Vite Frontend"] <-->|REST / WebSocket| B["FastAPI Backend"]
        B <-->|Query| P["Pinecone Vector DB"]
    end

    subgraph "AWS Serverless (Terraform)"
        direction LR
        S["S3 Buckets"] -->|"Event Notification"| SQS["SQS Queue"] -->|"Async Trigger"| L["ADE Processor Lambda"]
    end

    B -->|"Secure Upload"| S
    L -->|"Semantic Chunking"| P
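
The S3 → SQS → Lambda hop in the diagram above means the Lambda receives S3 notifications wrapped inside SQS message bodies. The sketch below shows one way to unwrap them. The function names, bucket name, and object key are hypothetical, and the actual ADE Processor Lambda may be structured differently.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(sqs_event):
    """Return (bucket, key) pairs from an SQS batch of S3 event notifications."""
    objects = []
    for sqs_record in sqs_event.get("Records", []):
        # Each SQS message body is the S3 notification serialized as JSON.
        s3_event = json.loads(sqs_record["body"])
        # S3 test events carry no "Records" key; they are skipped here.
        for s3_record in s3_event.get("Records", []):
            bucket = s3_record["s3"]["bucket"]["name"]
            # Object keys arrive URL-encoded (spaces become "+").
            key = unquote_plus(s3_record["s3"]["object"]["key"])
            objects.append((bucket, key))
    return objects


def handler(event, context):
    """Hypothetical Lambda entry point; document processing is out of scope."""
    objects = extract_s3_objects(event)
    return {"processed": len(objects)}


sample_event = {
    "Records": [
        {
            "body": json.dumps(
                {
                    "Records": [
                        {
                            "s3": {
                                "bucket": {"name": "example-input-bucket"},
                                "object": {"key": "uploads/annual+report.pdf"},
                            }
                        }
                    ]
                }
            )
        }
    ]
}
result = extract_s3_objects(sample_event)
```

Keeping the unwrapping in a pure function makes it easy to test locally with a hand-built event, without deploying anything.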

Detailed Data Flow

The system implements an Asynchronous Write Path (Upload -> S3 -> SQS -> Lambda -> Pinecone) and a Synchronous Read Path (User -> API -> Pinecone).

View Sequence Diagram

sequenceDiagram
    participant U as User
    participant B as Backend
    participant S3 as AWS S3
    participant L as Lambda (ADE)
    participant P as Pinecone

    Note over U, P: 1. Write Path (Ingestion)
    U->>B: Upload PDF
    B->>S3: Upload File
    S3->>L: Trigger Processing
    L->>P: Index Vectors
    B->>S3: Poll Status
    
    Note over U, P: 2. Read Path (Retrieval)
    U->>B: Query
    B->>P: Hybrid Search
    P-->>B: Relevant Chunks
    B->>U: Stream Answer
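
The "Poll Status" step in the sequence diagram does not say how the backend detects that ingestion has finished. One plausible shape is a bounded polling loop, sketched below. `wait_for_result` and the `is_ready` callable are hypothetical; in this system `is_ready` might check whether the Lambda has written its output object to S3.

```python
import time


def wait_for_result(is_ready, timeout_s=60.0, interval_s=2.0, sleep=time.sleep):
    """Poll `is_ready()` until it returns True or `timeout_s` elapses.

    `is_ready` is a hypothetical callable supplied by the caller.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return True
        # Stop if another sleep would push us past the deadline.
        if time.monotonic() + interval_s > deadline:
            return False
        sleep(interval_s)


# Simulated readiness check that succeeds on the third poll.
attempts = {"count": 0}


def fake_is_ready():
    attempts["count"] += 1
    return attempts["count"] >= 3


done = wait_for_result(fake_is_ready, timeout_s=5.0, interval_s=0.01)
```

Bounding the loop with a deadline keeps a stalled ingestion job from blocking the caller indefinitely.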

Ingestion Pipeline (Write Path)

The write path processes documents through an event-driven serverless pipeline, so uploads are parsed, chunked, and embedded without blocking the chat backend:

graph TD
    subgraph Ingest ["Ingestion: PDF -> S3 -> Lambda -> DB"]
        direction LR
        PDF[File] --> S3[S3 Input] --> L[Lambda]
        L --> P[ADE Parser] --> C[Chunker] --> E[Embedding]
        E --> DB[(Pinecone)]
        E --> Out[S3 Out]
    end
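
To make the Chunker → Embedding → Pinecone stages above more concrete, here is a simplified sketch. It packs paragraphs by size, which is a stand-in and not the semantic chunking the pipeline performs. `fake_embed` is a placeholder and not a real embedding model, and the `id` / `values` / `metadata` record shape is assumed to match what a vector upsert would expect.

```python
def chunk_paragraphs(text, max_chars=500):
    """Pack consecutive paragraphs into chunks of at most `max_chars` characters.

    A paragraph longer than `max_chars` becomes its own oversized chunk.
    """
    chunks, current = [], ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def build_records(doc_id, chunks, embed):
    """Pair each chunk with an ID, a vector, and its text as metadata."""
    return [
        {"id": f"{doc_id}#{i}", "values": embed(chunk), "metadata": {"text": chunk}}
        for i, chunk in enumerate(chunks)
    ]


def fake_embed(text):
    # Placeholder embedding: NOT a real model, just a small numeric vector.
    return [float(len(text)), float(text.count(" "))]


parsed_text = "Alpha section.\n\nBeta section.\n\nGamma section is a bit longer."
chunks = chunk_paragraphs(parsed_text, max_chars=30)
records = build_records("report.pdf", chunks, fake_embed)
```

Storing the chunk text in the metadata lets the read path return the passage alongside its vector match without a second lookup.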

Retrieval Pipeline (Read Path)

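DeepRecall uses a "Fusion Retrieval" strategy aimed at both high recall and high precision: the query is expanded, vector and BM25 searches run in parallel, their results are fused, and a reranker selects the top K chunks passed to the LLM: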

graph TD
    subgraph Retrieval ["Read Path: Query -> Hybrid Search -> Rerank -> LLM"]
        direction LR
        Q[Query] --> ME[Expansion] --> P[Parallel]
        P --> V[Vector Search] & K[BM25 Search] -->|"Top N Chunks"| RRF[Fusion]
        RRF -->|"Top N"| CE[Rerank] -->|"Top K"| LLM[Generate] --> A[Stream]
    end
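
The parallel stage of the diagram above can be sketched with `asyncio.gather`. This is an illustration only: every function here is a hypothetical stand-in, query expansion is omitted, the fusion helper is a simple round-robin merge (not the project's fusion step), and the reranker just sorts IDs.

```python
import asyncio
from itertools import zip_longest


async def retrieve(query, vector_search, keyword_search, fuse, rerank, top_n=20, top_k=5):
    """Run both retrievers concurrently, fuse their rankings, then rerank."""
    vector_hits, keyword_hits = await asyncio.gather(
        vector_search(query, top_n),
        keyword_search(query, top_n),
    )
    fused = fuse([vector_hits, keyword_hits])[:top_n]
    return rerank(query, fused)[:top_k]


# --- Hypothetical stand-ins so the sketch runs without external services ---
async def fake_vector_search(query, top_n):
    return ["c2", "c1", "c4"][:top_n]


async def fake_keyword_search(query, top_n):
    return ["c1", "c3", "c2"][:top_n]


def interleave_fuse(rankings):
    # Stand-in fusion: round-robin merge with de-duplication.
    seen, merged = set(), []
    for tier in zip_longest(*rankings):
        for doc_id in tier:
            if doc_id is not None and doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


def fake_rerank(query, doc_ids):
    # Stand-in reranker: alphabetical order instead of a relevance model.
    return sorted(doc_ids)


results = asyncio.run(
    retrieve(
        "what is hybrid search?",
        fake_vector_search,
        fake_keyword_search,
        interleave_fuse,
        fake_rerank,
        top_k=2,
    )
)
```

Because the two searches are awaited together, the retrieval latency is roughly that of the slower retriever and not the sum of both.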

Quick Start

Backend: pip install -r backend/requirements.txt && python backend/server.py
Frontend: npm install && npm run dev

Observability

Integrated with LangSmith (LLM tracing) and Weights & Biases (performance metrics).
