NITISH-R-G · Prithic · Apr 4, 2026
diff --git a/README.md b/README.md
@@ -1,41 +1,91 @@
-# Intelli-Credit - Intelligent Corporate Underwriting
+# 🏦 Intelli-Credit - Intelligent Corporate Underwriting
 
-> An autonomous AI Credit Officer designed to simulate how Tier-1 bank credit committees operate.
+> An autonomous AI Credit Officer platform designed to automate Tier-1 bank credit committee operations using a multi-agent architecture, specialized financial LLMs, and explainable risk scoring.
 
-Intelli-Credit is an end-to-end B2B credit decisioning platform. It ingests structured and unstructured borrower financial data, conducts autonomous web-scale due diligence, computes an explainable composite risk score using Machine Learning ensembles, simulates stress tests evaluating RAROC capital impact, and automatically generates a structured, downloadable Credit Appraisal Memo (CAM) in PDF format.
+Intelli-Credit is an end-to-end B2B credit decisioning engine. It ingests complex financial documents, conducts autonomous due diligence via specialized agents, computes risk scores using machine learning ensembles, and generates professional Credit Appraisal Memos (CAM).
 
-## 🔥 Key Differentiators
+---
 
-1.  **"AI Credit Officer" Persona & LLM Integration**: The system isn't just a traditional ML pipeline. It uses **Google Gemini** to extract unstructured data, read financial PDFs, and automatically author conversational narrative summaries for risk and compliance.
-2.  **Web-Scale Research Simulation**: Includes NLP sentiment analysis (using FinBERT), regulatory filings intelligence, and ESG scores.
-3.  **Modular Decision Studio**: A dynamic workflow engine that allows credit risk managers to visually construct underwriting logic, configure dynamic scoring rules, and trigger external webhooks interactively.
-4.  **Capital Impact (RAROC) Simulation**: Elevates from basic scoring to bank portfolio management by assessing Risk-Weighted Assets (RWA) and tier capital requirements.
-5.  **SHAP-Based Explainability**: Avoids "black box" models. The top contributing risk drivers are extracted for every decision.
+## 🏗 Project Architecture
 
-## 🏗 System Architecture
+Intelli-Credit is built with a modern, decoupled architecture designed for high performance and scalability.
 
-The project consists of a Python FastAPI backend acting as the Machine Learning, LLM, and pipeline orchestration layer, paired with a modern Next.js frontend featuring real-time state synchronization, drag-and-drop workflow canvases, and Firebase authentication.
+- **Frontend**: A highly interactive **Next.js 16** (App Router) application built with **React 19** and **Tailwind CSS 4**. It features a "Decision Studio" built on **XYFlow** for visual policy orchestration.
+- **Backend**: A high-performance **FastAPI** service powered by **Python 3.11+**, utilizing **Async SQLAlchemy** for non-blocking database operations and **Google Gemini 1.5 Flash** for document intelligence.
+- **AI Ecosystem**: Employs a multi-tier AI strategy using **Camel-AI** for multi-agent workflows, **Mem0** for persistent agent memory, and **XGBoost/SHAP** for transparent credit scoring.
+- **Data Layer**: Integrates with **Databricks SQL Warehouse** for enterprise-grade data ingestion and **FAISS** for vector search capabilities.
 
-### Core Modules
-*   **Ingestion Engine**: Parses financial PDFs, Bureau JSONs, and Bank Statement CSVs (using Gemini Vision & regex).
-*   **Dynamic Scorer & Rules Engine**: Evaluates nested risk rules built via the Decision Studio UI.
-*   **LLM Research Agent**: Leverages a LangChain-powered agent to perform RAG-based Vector Search and external intelligence aggregation. 
-*   **Risk Synthesis**: Combines ML Probability of Default (Gradient Boosting), qualitative LLM summaries, and macro-economic factors.
+---
 
-## 🚀 Quick Start (Local Development)
+## 📂 Project Structure
+
+```text
+intelli-credit/
+├── frontend/                # Next.js 16 + React 19 Frontend
+│   ├── src/
+│   │   ├── app/             # App Router pages and layouts
+│   │   ├── components/      # Reusable UI components (Tailwind 4)
+│   │   ├── store/           # Zustand state management
+│   │   └── hooks/           # Custom React hooks
+├── backend/                 # FastAPI + Python Backend
+│   ├── modules/             # Core logic: Ingestion, Scoring, Agents
+│   ├── routers/             # API endpoints (V2 Async supported)
+│   ├── schemas/             # Pydantic data models
+│   ├── database/            # SQLAlchemy models and migrations
+│   ├── security/            # Firebase Auth integration
+│   └── training/            # ML model training scripts
+├── docker-compose.yml       # Container orchestration
+└── architecture.md          # Technical Deep-Dive
+```
+
+---
+
+## 🔥 Key Features
+
+### 1. 🧠 Intelligent Ingestion Engine
+Uses **Gemini 1.5 Flash** together with **Tesseract** (OCR) and **pdfplumber** (PDF text extraction) to extract structured financial data from messy, scanned Indian corporate PDFs, including:
+- Schedule III Balance Sheets & Profit/Loss statements.
+- GST Filings (GSTR-1, 3B) linked to **Databricks**.
+- Bank Statements with automated transaction categorization.
+
+### 2. 🎨 Decision Studio (Visual Policy Engine)
+A drag-and-drop canvas powered by **XYFlow** that allows risk managers to:
+- Build nested credit policies without writing code.
+- Trigger external webhooks and data integrations.
+- Define dynamic rules using a secure Python AST execution engine.
+
+### 3. 🤖 Multi-Agent Due Diligence
+Deploys a swarm of autonomous agents using **Camel-AI** and **Mem0**:
+- **Searcher Agent**: Conducts web-scale adverse media and regulatory searches.
+- **Analyst Agent**: Synthesizes financial ratios and macro-economic factors.
+- **Summarizer Agent**: Authors high-quality narrative commentary for the CAM.
+
+### 4. 📊 Explainable Risk Scoring (SHAP)
+Avoids "black box" decisions by providing full transparency:
+- **XGBoost Ensembles**: Predicts the probability of default.
+- **SHAP Interpretability**: Visualizes exactly which factors (e.g., DSCR, Current Ratio) drove the final decision.
+- **RAROC Simulation**: Estimates Risk-Adjusted Return on Capital and capital impact.
+
+---
+
+## 🛠 Tech Stack
+
+- **Frontend**: Next.js 16, React 19, Tailwind CSS 4, XYFlow, Zustand, Recharts, Framer Motion.
+- **Backend**: FastAPI, SQLAlchemy 2.0, Pydantic, ReportLab, Celery (Optional).
+- **AI/ML**: Google Gemini 1.5 Flash, XGBoost, SHAP, Camel-AI, Mem0, FinBERT (Sentiment).
+- **Data & Auth**: PostgreSQL/SQLite, Databricks, Firebase Auth (Identity Platform).
+
+---
+
+## 🚀 Quick Start
 
 ### 1. Backend Setup
 ```bash
 cd backend
 python -m venv venv
-venv\Scripts\activate  # On Windows
+source venv/bin/activate  # venv\Scripts\activate on Windows
 pip install -r requirements.txt
-```
-*Note: A `.env` file is required in the backend containing your `GEMINI_API_KEY`, `POSTGRES_USER`, and database strings for Alembic migrations.*
-
-Run the FastAPI server:
-```bash
-uvicorn main:app --reload --port 8000
+uvicorn main:app --reload
 ```
 
 ### 2. Frontend Setup
@@ -44,24 +94,8 @@ cd frontend
 npm install
 npm run dev
 ```
-*Note: Ensure your `.env.local` contains valid Firebase configuration keys (`NEXT_PUBLIC_FIREBASE_API_KEY`, etc.) for user authentication to function.*
-
-Access the platform at `http://localhost:3000`.
-
-## 🧠 Using the Platform
 
-1.  **Authenticate**: Use the Firebase login page to sign in to the dashboard.
-2.  **Upload & Ingest**: Go to the New Proposal flow. Upload a financial PDF or Bureau data. The system uses Gemini Vision for intelligent OCR.
-3.  **Build Workflows**: Use the **Decision Studio** to visually drag and drop Risk Policies and Decision Nodes.
-4.  **Review the Output**:
-    - Observe the final decision (APPROVE / CONDITIONAL / REJECT).
-    - Review the Stress Test simulator and SHAP charts.
-    - Check the Governance Audit Trail.
-    - Click **"Generate CAM"** to receive the final professionally formatted Credit Appraisal Memo PDF.
-
-## 🛠 Tech Stack
+---
 
-- **Machine Learning & AI**: Scikit-Learn (Gradient Boosting), SHAP, HuggingFace (`ProsusAI/finbert`), Google Gemini API, LangChain, FAISS (Vector DB)
-- **Backend API**: Python 3.11, FastAPI, Uvicorn, PostgreSQL (with asyncpg & Alembic), ReportLab
-- **Frontend App**: Next.js (App Router), React 18, TailwindCSS, Recharts, React Flow (Nodes), Zustand (State Management), Firebase Auth
-- **Infra**: Context-driven REST APIs, Webhooks, Docker (Optional)
+## 📜 Documentation
+For a deeper dive into the system design, see [architecture.md](architecture.md).
diff --git a/backend/async_database.py b/backend/async_database.py
@@ -19,15 +19,21 @@
 
 ASYNC_DATABASE_URL = os.getenv(
     "ASYNC_DATABASE_URL",
-    "postgresql+asyncpg://postgres:postgres@localhost:5432/intelli_credit",
+    "sqlite+aiosqlite:///./intelli_credit_async.db",
 )
 
+# Detect if we should use SQLite (default or explicit)
+is_sqlite = ASYNC_DATABASE_URL.startswith("sqlite")
+
 async_engine = create_async_engine(
     ASYNC_DATABASE_URL,
     echo=False,
     future=True,
-    pool_size=10,
-    max_overflow=20,
+    # Pool sizing applies only to non-SQLite backends (e.g. Postgres)
+    **({
+        "pool_size": 10,
+        "max_overflow": 20,
+    } if not is_sqlite else {})
 )
 
 AsyncSessionLocal = async_sessionmaker(

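Review note: the inline `**({...} if not is_sqlite else {})` expression in the hunk above could be factored into a small helper so the branching is easier to read and to test. The following is a minimal sketch, not the PR's code; `build_engine_kwargs` is a hypothetical name, and the returned dictionary is assumed to be passed on as `create_async_engine(url, **kwargs)`.

```python
from typing import Any


def build_engine_kwargs(database_url: str) -> dict[str, Any]:
    """Return engine keyword arguments for the given database URL.

    Hypothetical helper: mirrors the values used in the hunk above.
    """
    kwargs: dict[str, Any] = {"echo": False, "future": True}
    # Same check as `is_sqlite` in the hunk: pool sizing is skipped for SQLite URLs.
    if not database_url.startswith("sqlite"):
        kwargs.update(pool_size=10, max_overflow=20)
    return kwargs


sqlite_kwargs = build_engine_kwargs("sqlite+aiosqlite:///./intelli_credit_async.db")
postgres_kwargs = build_engine_kwargs(
    "postgresql+asyncpg://postgres:postgres@localhost:5432/intelli_credit"
)
```

With a helper like this, the engine construction would reduce to `create_async_engine(ASYNC_DATABASE_URL, **build_engine_kwargs(ASYNC_DATABASE_URL))`, and the SQLite/Postgres split can be unit-tested without opening a connection.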
diff --git a/backend/async_models.py b/backend/async_models.py
@@ -17,6 +17,9 @@
     Integer,
     String,
     Text,
+    Float,
+    Boolean,
+    JSON,
     UniqueConstraint,
 )
 from sqlalchemy.dialects.postgresql import JSONB, UUID as PG_UUID
@@ -210,3 +213,112 @@ class AuditLog(AsyncBase):
 
     def __repr__(self) -> str:
         return f"<AuditLog {self.action} on {self.entity_type}/{self.entity_id}>"
+
+
+class AnalysisSession(AsyncBase):
+    """Stores the state of a document extraction and risk analysis session."""
+    __tablename__ = "analysis_sessions"
+
+    id: Mapped[str] = mapped_column(String(128), primary_key=True)
+    tenant_id: Mapped[str] = mapped_column(String(128), index=True)
+    status: Mapped[str] = mapped_column(String(64), default="INITIATED")
+    raw_extracts: Mapped[dict] = mapped_column(JSON, default=dict)
+    features: Mapped[dict] = mapped_column(JSON, default=dict)
+    results: Mapped[dict] = mapped_column(JSON, default=dict)
+    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=_utcnow)
+    updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=_utcnow, onupdate=_utcnow)
+
+
+class WorkflowDefinition(AsyncBase):
+    __tablename__ = "workflow_definitions"
+
+    id: Mapped[str] = mapped_column(String(128), primary_key=True)
+    name: Mapped[str] = mapped_column(String(255), default="Untitled Workflow")
+    status: Mapped[str] = mapped_column(String(32), default="draft")
+    definition_json: Mapped[dict] = mapped_column(JSON, default=dict)
+    created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=_utcnow)
+    updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=_utcnow, onupdate=_utcnow)
+
+    nodes: Mapped[list["WorkflowNodeDefinition"]] = relationship(back_populates="workflow", cascade="all, delete-orphan")
+    edges: Mapped[list["WorkflowEdgeDefinition"]] = relationship(back_populates="workflow", cascade="all, delete-orphan")
+
+
+class WorkflowNodeDefinition(AsyncBase):
+    __tablename__ = "workflow_node_definitions"
+
+    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
+    workflow_id: Mapped[str] = mapped_column(String(128), ForeignKey("workflow_definitions.id", ondelete="CASCADE"), index=True)
+    node_id: Mapped[str] = mapped_column(String(128))
+    node_type: Mapped[str] = mapped_column(String(64))
+    label: Mapped[str | None] = mapped_column(String(255))
+    position_x: Mapped[float] = mapped_column(Float, default=0)
+    position_y: Mapped[float] = mapped_column(Float, default=0)
+    config_json: Mapped[dict] = mapped_column(JSON, default=dict)
+    execution_config_json: Mapped[dict] = mapped_column(JSON, default=dict)
+
+    workflow: Mapped["WorkflowDefinition"] = relationship(back_populates="nodes")
+
+
+class WorkflowEdgeDefinition(AsyncBase):
+    __tablename__ = "workflow_edge_definitions"
+
+    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
+    workflow_id: Mapped[str] = mapped_column(String(128), ForeignKey("workflow_definitions.id", ondelete="CASCADE"), index=True)
+    edge_id: Mapped[str] = mapped_column(String(128))
+    source_node_id: Mapped[str] = mapped_column(String(128))
+    target_node_id: Mapped[str] = mapped_column(String(128))
+    source_handle: Mapped[str | None] = mapped_column(String(64))
+    target_handle: Mapped[str | None] = mapped_column(String(64))
+    edge_type: Mapped[str | None] = mapped_column(String(64))
+    config_json: Mapped[dict] = mapped_column(JSON, default=dict)
+
+    workflow: Mapped["WorkflowDefinition"] = relationship(back_populates="edges")
+
+
+class ExecutionRun(AsyncBase):
+    __tablename__ = "execution_runs"
+
+    id: Mapped[str] = mapped_column(String(128), primary_key=True)
+    workflow_id: Mapped[str | None] = mapped_column(String(128), ForeignKey("workflow_definitions.id", ondelete="SET NULL"), nullable=True)
+    status: Mapped[str] = mapped_column(String(32), default="queued")
+    initial_payload_json: Mapped[dict] = mapped_column(JSON, default=dict)
+    final_payload_json: Mapped[dict | None] = mapped_column(JSON, nullable=True)
+    error_message: Mapped[str | None] = mapped_column(Text)
+    tokens_consumed: Mapped[int] = mapped_column(Integer, default=0)
+    started_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True))
+    finished_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True))
+    duration_ms: Mapped[int | None] = mapped_column(Integer)
+    updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=_utcnow, onupdate=_utcnow)
+
+    workflow: Mapped["WorkflowDefinition | None"] = relationship()
+
+
+class NodeExecutionLog(AsyncBase):
+    __tablename__ = "node_execution_logs"
+
+    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
+    execution_id: Mapped[str] = mapped_column(String(128), ForeignKey("execution_runs.id", ondelete="CASCADE"), index=True)
+    workflow_id: Mapped[str | None] = mapped_column(String(128))
+    node_id: Mapped[str] = mapped_column(String(128))
+    node_type: Mapped[str] = mapped_column(String(64))
+    event_type: Mapped[str] = mapped_column(String(64))
+    status: Mapped[str] = mapped_column(String(32))
+    attempt: Mapped[int] = mapped_column(Integer, default=1)
+    input_payload_json: Mapped[dict | None] = mapped_column(JSON)
+    output_payload_json: Mapped[dict | None] = mapped_column(JSON)
+    source_edges_json: Mapped[list | None] = mapped_column(JSON)
+    error_message: Mapped[str | None] = mapped_column(Text)
+    started_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True))
+    finished_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True))
+    duration_ms: Mapped[int | None] = mapped_column(Integer)
+
+
+class DeadLetterExecution(AsyncBase):
+    __tablename__ = "dead_letter_executions"
+
+    id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True)
+    execution_id: Mapped[str] = mapped_column(String(128), ForeignKey("execution_runs.id", ondelete="CASCADE"), unique=True)
+    workflow_id: Mapped[str | None] = mapped_column(String(128))
+    failure_stage: Mapped[str] = mapped_column(String(64), default="workflow")
+    reason: Mapped[str] = mapped_column(Text)
+    payload_json: Mapped[dict | None] = mapped_column(JSON)
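Review note: the PR does not show how rows from `workflow_node_definitions` and `workflow_edge_definitions` are turned into an execution plan. As a rough illustration of how the `source_node_id` / `target_node_id` columns could drive ordering, the sketch below uses plain dataclasses in place of the ORM models and the standard library's `graphlib`. `EdgeRow` and `execution_order` are hypothetical names, and the actual executor in the repository may work differently.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass
class EdgeRow:
    """Hypothetical stand-in for a WorkflowEdgeDefinition row."""
    edge_id: str
    source_node_id: str
    target_node_id: str


def execution_order(node_ids: list[str], edges: list[EdgeRow]) -> list[str]:
    """Order nodes so that every edge's source runs before its target."""
    # Start with every node having no predecessors, then add one per edge.
    sorter = TopologicalSorter({node_id: set() for node_id in node_ids})
    for edge in edges:
        sorter.add(edge.target_node_id, edge.source_node_id)
    # static_order() raises graphlib.CycleError if the edges form a cycle.
    return list(sorter.static_order())


order = execution_order(
    ["decide", "ingest", "score"],
    [
        EdgeRow("e1", "ingest", "score"),
        EdgeRow("e2", "score", "decide"),
    ],
)
```

For the three-node chain above, `order` is `["ingest", "score", "decide"]`. A cyclic workflow raises `CycleError`, which is one plausible point at which a run could be recorded in `dead_letter_executions`.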