diff --git a/README.md b/README.md index a52da3d..f17cce0 100644 --- a/README.md +++ b/README.md @@ -1,41 +1,91 @@ -# Intelli-Credit - Intelligent Corporate Underwriting +# 🏦 Intelli-Credit - Intelligent Corporate Underwriting -> An autonomous AI Credit Officer designed to simulate how Tier-1 bank credit committees operate. +> An autonomous AI Credit Officer platform designed to automate Tier-1 bank credit committee operations, leveraging Multi-Agent architecture, specialized Financial LLMs, and explainable risk scoring. -Intelli-Credit is an end-to-end B2B credit decisioning platform. It ingests structured and unstructured borrower financial data, conducts autonomous web-scale due diligence, computes an explainable composite risk score using Machine Learning ensembles, simulates stress tests evaluating RAROC capital impact, and automatically generates a structured, downloadable Credit Appraisal Memo (CAM) in PDF format. +Intelli-Credit is an end-to-end B2B credit decisioning engine. It ingests complex financial documents, conducts autonomous due diligence via specialized agents, computes risk scores using machine learning ensembles, and generates professional Credit Appraisal Memos (CAM). -## 🔥 Key Differentiators +--- -1. **"AI Credit Officer" Persona & LLM Integration**: The system isn't just a traditional ML pipeline. It uses **Google Gemini** to extract unstructured data, read financial PDFs, and automatically author conversational narrative summaries for risk and compliance. -2. **Web-Scale Research Simulation**: Includes NLP sentiment analysis (using FinBERT), regulatory filings intelligence, and ESG scores. -3. **Modular Decision Studio**: A dynamic workflow engine that allows credit risk managers to visually construct underwriting logic, configure dynamic scoring rules, and trigger external webhooks interactively. -4. 
**Capital Impact (RAROC) Simulation**: Elevates from basic scoring to bank portfolio management by assessing Risk-Weighted Assets (RWA) and tier capital requirements. -5. **SHAP-Based Explainability**: Avoids "black box" models. The top contributing risk drivers are extracted for every decision. +## 🏗 Project Architecture -## 🏗 System Architecture +Intelli-Credit is built with a modern, decoupled architecture designed for high performance and scalability. -The project consists of a Python FastAPI backend acting as the Machine Learning, LLM, and pipeline orchestration layer, paired with a modern Next.js frontend featuring real-time state synchronization, drag-and-drop workflow canvases, and Firebase authentication. +- **Frontend**: A highly interactive **Next.js 16** (App Router) application built with **React 19** and **Tailwind CSS 4**. It features a "Decision Studio" built on **XYFlow** for visual policy orchestration. +- **Backend**: A high-performance **FastAPI** service powered by **Python 3.11+**, utilizing **Async SQLAlchemy** for non-blocking database operations and **Google Gemini 1.5 Flash** for document intelligence. +- **AI Ecosystem**: Employs a multi-tier AI strategy using **Camel-AI** for multi-agent workflows, **Mem0** for persistent agent memory, and **XGBoost/SHAP** for transparent credit scoring. +- **Data Layer**: Integrates with **Databricks SQL Warehouse** for enterprise-grade data ingestion and **FAISS** for vector search capabilities. -### Core Modules -* **Ingestion Engine**: Parses financial PDFs, Bureau JSONs, and Bank Statement CSVs (using Gemini Vision & regex). -* **Dynamic Scorer & Rules Engine**: Evaluates nested risk rules built via the Decision Studio UI. -* **LLM Research Agent**: Leverages a LangChain-powered agent to perform RAG-based Vector Search and external intelligence aggregation. -* **Risk Synthesis**: Combines ML Probability of Default (Gradient Boosting), qualitative LLM summaries, and macro-economic factors. 
+--- -## 🚀 Quick Start (Local Development) +## 📂 Project Structure + +```bash +intelli-credit/ +├── frontend/ # Next.js 16 + React 19 Frontend +│ ├── src/ +│ │ ├── app/ # App Router pages and layouts +│ │ ├── components/ # Reusable UI components (Tailwind 4) +│ │ ├── store/ # Zustand state management +│ │ └── hooks/ # Custom React hooks +├── backend/ # FastAPI + Python Backend +│ ├── modules/ # Core logic: Ingestion, Scoring, Agents +│ ├── routers/ # API endpoints (V2 Async supported) +│ ├── schemas/ # Pydantic data models +│ ├── database/ # SQLAlchemy models and migrations +│ ├── security/ # Firebase Auth integration +│ └── training/ # ML model training scripts +├── docker-compose.yml # Container orchestration +└── architecture.md # Technical Deep-Dive +``` + +--- + +## 🔥 Key Features + +### 1. 🧠 Intelligent Ingestion Engine +Uses **Gemini 1.5 Flash** & **OCR (Tesseract/pdfplumber)** to extract structured financial data from messy, scanned Indian corporate PDFs, including: +- Schedule III Balance Sheets & Profit/Loss statements. +- GST Filings (GSTR-1, 3B) linked to **Databricks**. +- Bank Statements with automated transaction categorization. + +### 2. 🎨 Decision Studio (Visual Policy Engine) +A drag-and-drop canvas powered by **XYFlow** that allows risk managers to: +- Build nested credit policies without writing code. +- Trigger external webhooks and data integrations. +- Define dynamic rules using a secure Python AST execution engine. + +### 3. 🤖 Multi-Agent Due Diligence +Deploys a swarm of autonomous agents using **Camel-AI** and **Mem0**: +- **Searcher Agent**: Conducts web-scale adverse media and regulatory searches. +- **Analyst Agent**: Synthesizes financial ratios and macro-economic factors. +- **Summarizer Agent**: Authors high-quality narrative commentary for the CAM. + +### 4. 
📊 Explainable Risk Scoring (SHAP) +Avoids "black box" decisions by providing full transparency: +- **XGBoost Ensembles**: Predicts probability of default with high accuracy. +- **SHAP Interpretability**: Visualizes exactly which factors (e.g., DSCR, Current Ratio) drove the final decision. +- **RAROC Simulation**: Estimates Risk-Adjusted Return on Capital and capital impact. + +--- + +## 🛠 Tech Stack + +- **Frontend**: Next.js 16, React 19, Tailwind CSS 4, XYFlow, Zustand, Recharts, Framer Motion. +- **Backend**: FastAPI, SQLAlchemy 2.0, Pydantic, ReportLab, Celery (Optional). +- **AI/ML**: Google Gemini 1.5 Flash, XGBoost, SHAP, Camel-AI, Mem0, FinBERT (Sentiment). +- **Data & Auth**: PostgreSQL/SQLite, Databricks, Firebase Auth (Identity Platform). + +--- + +## 🚀 Quick Start ### 1. Backend Setup ```bash cd backend python -m venv venv -venv\Scripts\activate # On Windows +source venv/bin/activate # venv\Scripts\activate on Windows pip install -r requirements.txt -``` -*Note: A `.env` file is required in the backend containing your `GEMINI_API_KEY`, `POSTGRES_USER`, and database strings for Alembic migrations.* - -Run the FastAPI server: -```bash -uvicorn main:app --reload --port 8000 +uvicorn main:app --reload ``` ### 2. Frontend Setup @@ -44,24 +94,8 @@ cd frontend npm install npm run dev ``` -*Note: Ensure your `.env.local` contains valid Firebase configuration keys (`NEXT_PUBLIC_FIREBASE_API_KEY`, etc.) for user authentication to function.* - -Access the platform at `http://localhost:3000`. - -## 🧠 Using the Platform -1. **Authenticate**: Use the Firebase login page to sign in to the dashboard. -2. **Upload & Ingest**: Go to the New Proposal flow. Upload a financial PDF or Bureau data. The system uses Gemini Vision for intelligent OCR. -3. **Build Workflows**: Use the **Decision Studio** to visually drag and drop Risk Policies and Decision Nodes. -4. **Review the Output**: - - Observe the final decision (APPROVE / CONDITIONAL / REJECT). 
- - Review the Stress Test simulator and SHAP charts. - - Check the Governance Audit Trail. - - Click **"Generate CAM"** to receive the final professionally formatted Credit Appraisal Memo PDF. - -## 🛠 Tech Stack +--- -- **Machine Learning & AI**: Scikit-Learn (Gradient Boosting), SHAP, HuggingFace (`ProsusAI/finbert`), Google Gemini API, LangChain, FAISS (Vector DB) -- **Backend API**: Python 3.11, FastAPI, Uvicorn, PostgreSQL (with asyncpg & Alembic), ReportLab -- **Frontend App**: Next.js (App Router), React 18, TailwindCSS, Recharts, React Flow (Nodes), Zustand (State Management), Firebase Auth -- **Infra**: Context-driven REST APIs, Webhooks, Docker (Optional) +## 📜 Documentation +For a deeper dive into the system design, check out [architecture.md](file:///d:/Hackathons/intelli-credit/architecture.md). diff --git a/backend/async_database.py b/backend/async_database.py index 446b8fd..841d426 100644 --- a/backend/async_database.py +++ b/backend/async_database.py @@ -19,15 +19,21 @@ ASYNC_DATABASE_URL = os.getenv( "ASYNC_DATABASE_URL", - "postgresql+asyncpg://postgres:postgres@localhost:5432/intelli_credit", + "sqlite+aiosqlite:///./intelli_credit_async.db", ) +# Detect if we should use SQLite (default or explicit) +is_sqlite = ASYNC_DATABASE_URL.startswith("sqlite") + async_engine = create_async_engine( ASYNC_DATABASE_URL, echo=False, future=True, - pool_size=10, - max_overflow=20, + # Pool arguments only for real DBs (Postgres) + **({ + "pool_size": 10, + "max_overflow": 20, + } if not is_sqlite else {}) ) AsyncSessionLocal = async_sessionmaker( diff --git a/backend/async_models.py b/backend/async_models.py index 4216ea6..469e071 100644 --- a/backend/async_models.py +++ b/backend/async_models.py @@ -17,6 +17,9 @@ Integer, String, Text, + Float, + Boolean, + JSON, UniqueConstraint, ) from sqlalchemy.dialects.postgresql import JSONB, UUID as PG_UUID @@ -210,3 +213,112 @@ class AuditLog(AsyncBase): def __repr__(self) -> str: return f"" + + +class 
AnalysisSession(AsyncBase): + """Stores the state of a document extraction and risk analysis session.""" + __tablename__ = "analysis_sessions" + + id: Mapped[str] = mapped_column(String(128), primary_key=True) + tenant_id: Mapped[str] = mapped_column(String(128), index=True) + status: Mapped[str] = mapped_column(String(64), default="INITIATED") + raw_extracts: Mapped[dict] = mapped_column(JSON, default=dict) + features: Mapped[dict] = mapped_column(JSON, default=dict) + results: Mapped[dict] = mapped_column(JSON, default=dict) + created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=_utcnow) + updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=_utcnow, onupdate=_utcnow) + + +class WorkflowDefinition(AsyncBase): + __tablename__ = "workflow_definitions" + + id: Mapped[str] = mapped_column(String(128), primary_key=True) + name: Mapped[str] = mapped_column(String(255), default="Untitled Workflow") + status: Mapped[str] = mapped_column(String(32), default="draft") + definition_json: Mapped[dict] = mapped_column(JSON, default=dict) + created_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=_utcnow) + updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=_utcnow, onupdate=_utcnow) + + nodes: Mapped[list["WorkflowNodeDefinition"]] = relationship(back_populates="workflow", cascade="all, delete-orphan") + edges: Mapped[list["WorkflowEdgeDefinition"]] = relationship(back_populates="workflow", cascade="all, delete-orphan") + + +class WorkflowNodeDefinition(AsyncBase): + __tablename__ = "workflow_node_definitions" + + id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True) + workflow_id: Mapped[str] = mapped_column(String(128), ForeignKey("workflow_definitions.id", ondelete="CASCADE"), index=True) + node_id: Mapped[str] = mapped_column(String(128)) + node_type: Mapped[str] = mapped_column(String(64)) + label: Mapped[str | None] = mapped_column(String(255)) + 
position_x: Mapped[float] = mapped_column(Float, default=0) + position_y: Mapped[float] = mapped_column(Float, default=0) + config_json: Mapped[dict] = mapped_column(JSON, default=dict) + execution_config_json: Mapped[dict] = mapped_column(JSON, default=dict) + + workflow: Mapped["WorkflowDefinition"] = relationship(back_populates="nodes") + + +class WorkflowEdgeDefinition(AsyncBase): + __tablename__ = "workflow_edge_definitions" + + id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True) + workflow_id: Mapped[str] = mapped_column(String(128), ForeignKey("workflow_definitions.id", ondelete="CASCADE"), index=True) + edge_id: Mapped[str] = mapped_column(String(128)) + source_node_id: Mapped[str] = mapped_column(String(128)) + target_node_id: Mapped[str] = mapped_column(String(128)) + source_handle: Mapped[str | None] = mapped_column(String(64)) + target_handle: Mapped[str | None] = mapped_column(String(64)) + edge_type: Mapped[str | None] = mapped_column(String(64)) + config_json: Mapped[dict] = mapped_column(JSON, default=dict) + + workflow: Mapped["WorkflowDefinition"] = relationship(back_populates="edges") + + +class ExecutionRun(AsyncBase): + __tablename__ = "execution_runs" + + id: Mapped[str] = mapped_column(String(128), primary_key=True) + workflow_id: Mapped[str | None] = mapped_column(String(128), ForeignKey("workflow_definitions.id", ondelete="SET NULL"), nullable=True) + status: Mapped[str] = mapped_column(String(32), default="queued") + initial_payload_json: Mapped[dict] = mapped_column(JSON, default=dict) + final_payload_json: Mapped[dict | None] = mapped_column(JSON, nullable=True) + error_message: Mapped[str | None] = mapped_column(Text) + tokens_consumed: Mapped[int] = mapped_column(Integer, default=0) + started_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True)) + finished_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True)) + duration_ms: Mapped[int | None] = mapped_column(Integer) + 
updated_at: Mapped[datetime] = mapped_column(DateTime(timezone=True), default=_utcnow, onupdate=_utcnow) + + workflow: Mapped["WorkflowDefinition | None"] = relationship() + + +class NodeExecutionLog(AsyncBase): + __tablename__ = "node_execution_logs" + + id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True) + execution_id: Mapped[str] = mapped_column(String(128), ForeignKey("execution_runs.id", ondelete="CASCADE"), index=True) + workflow_id: Mapped[str | None] = mapped_column(String(128)) + node_id: Mapped[str] = mapped_column(String(128)) + node_type: Mapped[str] = mapped_column(String(64)) + event_type: Mapped[str] = mapped_column(String(64)) + status: Mapped[str] = mapped_column(String(32)) + attempt: Mapped[int] = mapped_column(Integer, default=1) + input_payload_json: Mapped[dict | None] = mapped_column(JSON) + output_payload_json: Mapped[dict | None] = mapped_column(JSON) + source_edges_json: Mapped[list | None] = mapped_column(JSON) + error_message: Mapped[str | None] = mapped_column(Text) + started_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True)) + finished_at: Mapped[datetime | None] = mapped_column(DateTime(timezone=True)) + duration_ms: Mapped[int | None] = mapped_column(Integer) + + +class DeadLetterExecution(AsyncBase): + __tablename__ = "dead_letter_executions" + + id: Mapped[int] = mapped_column(Integer, primary_key=True, autoincrement=True) + execution_id: Mapped[str] = mapped_column(String(128), ForeignKey("execution_runs.id", ondelete="CASCADE"), unique=True) + workflow_id: Mapped[str | None] = mapped_column(String(128)) + failure_stage: Mapped[str] = mapped_column(String(64), default="workflow") + reason: Mapped[str] = mapped_column(Text) + payload_json: Mapped[dict | None] = mapped_column(JSON) diff --git a/backend/modules/ingestion.py b/backend/modules/ingestion.py index c2ed2b5..df4472b 100644 --- a/backend/modules/ingestion.py +++ b/backend/modules/ingestion.py @@ -93,54 +93,70 @@ "unknown", } 
+OCR_NUMERIC_CHARS = r"0-9SOlI,.\-" + FIELD_PATTERNS = { "revenue": [ - r"(?:total\s+)?revenue[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", - r"(?:net\s+)?sales[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", - r"turnover[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", - r"income\s+from\s+operations[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", + rf"(?:total\s+)?revenue[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", + rf"(?:net\s+)?sales[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", + rf"turnover[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", + rf"income\s+from\s+operations[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", ], "net_income": [ - r"net\s+(?:income|profit)[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", - r"profit\s+after\s+tax[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", - r"\bpat[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", + rf"net\s+(?:income|profit)[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", + rf"profit\s+after\s+tax[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", + rf"\bpat[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", ], - "total_assets": [r"total\s+assets[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)"], - "total_liabilities": [r"total\s+liabilities[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)"], + "total_assets": [rf"total\s+assets[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)"], + "total_liabilities": [rf"total\s+liabilities[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)"], "total_equity": [ - r"(?:total\s+)?(?:shareholders?\s+)?equity[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", - r"net\s+worth[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", + rf"(?:total\s+)?(?:shareholders?\s+)?equity[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", + 
rf"net\s+worth[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", ], "ebitda": [ - r"\bebitda[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", - r"earnings\s+before\s+interest[^:]*[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", + rf"\bebitda[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", + rf"earnings\s+before\s+interest[^:]*[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", ], "total_debt": [ - r"total\s+(?:borrowings?|debt)[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", - r"long[\s-]term\s+(?:debt|borrowings?)[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", + rf"total\s+(?:borrowings?|debt)[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", + rf"long[\s-]term\s+(?:debt|borrowings?)[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", ], "cash_and_equivalents": [ - r"cash\s+(?:and\s+)?(?:cash\s+)?equivalents?[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", - r"cash\s+(?:and|&)\s+bank[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", + rf"cash\s+(?:and\s+)?(?:cash\s+)?equivalents?[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", + rf"cash\s+(?:and|&)\s+bank[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", ], "operating_cash_flow": [ - r"(?:operating|operational)\s+cash\s+flow[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", - r"cash\s+from\s+operations[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", + rf"(?:operating|operational)\s+cash\s+flow[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", + rf"cash\s+from\s+operations[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", ], - "depreciation": [r"depreciation[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)"], + "depreciation": [rf"depreciation[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)"], "interest_expense": [ - r"interest\s+(?:expense|cost)[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", - 
r"finance\s+cost[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", + rf"interest\s+(?:expense|cost)[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", + rf"finance\s+cost[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", ], - "tax_expense": [r"tax\s+expense[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)"], - "current_assets": [r"(?:total\s+)?current\s+assets[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)"], - "current_liabilities": [r"(?:total\s+)?current\s+liabilities[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)"], - "accounts_receivable": [r"accounts\s+receivable[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)"], - "inventory": [r"inventory[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)"], + "tax_expense": [rf"tax\s+expense[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)"], + "current_assets": [rf"(?:total\s+)?current\s+assets[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)"], + "current_liabilities": [rf"(?:total\s+)?current\s+liabilities[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)"], + "accounts_receivable": [rf"accounts\s+receivable[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)"], + "inventory": [rf"inventory[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)"], } + +def _ocr_fix(match: re.Match) -> str: + """Callback for re.sub to fix common OCR misreads within numeric context.""" + s = match.group(0) + return ( + s.replace('S', '5') + .replace('s', '5') + .replace('O', '0') + .replace('o', '0') + .replace('l', '1') + .replace('I', '1') + ) + + def _safe_float(value: Any, default: float = 0.0) -> float: if value is None or value == "": return default @@ -148,23 +164,35 @@ def _safe_float(value: Any, default: float = 0.0) -> float: if pd.isna(value): return default return float(value) + + # Handle common OCR misreads in numeric context cleaned = str(value).strip() if not cleaned: return default - 
negative = cleaned.startswith("(") and cleaned.endswith(")") + + # Replace common OCR errors if the surrounding context contains any digits + # OR if it looks like a misread currency/number block + if re.search(r'\d|[SOlI]', cleaned): + cleaned = re.sub(rf'[{OCR_NUMERIC_CHARS}]+', _ocr_fix, cleaned) + + negative = (cleaned.startswith("(") and cleaned.endswith(")")) or cleaned.startswith("-") cleaned = cleaned.replace(",", "") - cleaned = cleaned.replace("Rs.", "").replace("Rs", "").replace("INR", "") + cleaned = re.sub(r'(?i)Rs\.?|INR|/-', '', cleaned) cleaned = cleaned.replace("%", "") - cleaned = re.sub(r"[^0-9.\-]", "", cleaned) - if cleaned in {"", "-", ".", "-."}: + + # Extract only the first valid numeric part + numeric_match = re.search(r'[-+]?\d*\.?\d+', cleaned) + if not numeric_match: return default + try: - result = float(cleaned) - except ValueError: + result = float(numeric_match.group(0)) + except (ValueError, OverflowError): return default return -abs(result) if negative else result + def _safe_int(value: Any, default: int = 0) -> int: return int(round(_safe_float(value, default))) @@ -222,6 +250,34 @@ def _coerce_sanction_terms(raw_terms: Any) -> Dict[str, Any]: } +def _verify_financial_logic(extracted: Dict[str, Any]) -> List[str]: + """Verify basic accounting identities to detect extraction errors or fraud.""" + warnings = [] + + revenue = extracted.get("revenue") + ebitda = extracted.get("ebitda") + net_income = extracted.get("net_income") + + if revenue is not None and ebitda is not None: + if ebitda > revenue: + warnings.append("EBITDA exceeds Revenue - likely extraction error.") + + if net_income is not None and ebitda is not None: + if net_income > ebitda: + warnings.append("Net Income exceeds EBITDA - check for non-operating income or extraction error.") + + assets = extracted.get("total_assets") + liabilities = extracted.get("total_liabilities") + equity = extracted.get("total_equity") + + if all(v is not None for v in [assets, liabilities, 
equity]): + variance = abs(assets - (liabilities + equity)) + if variance > (assets * 0.05) and assets > 0: + warnings.append(f"Balance Sheet mismatch: Assets don't equal Liab+Equity (Variance: {variance:,.0f}).") + + return warnings + + def _validate_financial_extraction( payload: Mapping[str, Any], *, @@ -321,30 +377,41 @@ def _request_gemini_json( def _normalize_indian_financials(text: str) -> str: - """Regex-based utility to convert Indian 'Cr' and 'Lakhs' into clean standard numbers.""" + """Regex-based utility to convert Indian 'Cr' and 'Lakhs' into clean standard numbers. + Hardened to handle common OCR misreads like 'Gr' for 'Cr' or 'Lacs'. + """ if not text: return text + # Handle Crores (Cr, Crore, Crores, Gr, 0r) def replace_cr(match): try: - val = float(match.group(1).replace(",", "")) - return str(int(val * 10000000)) - except ValueError: + val_str = re.sub(rf'[{OCR_NUMERIC_CHARS}]+', _ocr_fix, match.group(1)).replace(",", "") + val = float(val_str) + return f" {int(val * 10000000)} " + except (ValueError, OverflowError): return match.group(0) - text = re.sub(r"([\d,]+(?:\.\d+)?)\s*(?:Cr|Crores?)", replace_cr, text, flags=re.IGNORECASE) + # Pattern for Crores: Capture number followed by Cr variants + cr_pattern = rf"([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)\s*(?:Cr|Crores?|[G0]r|Crs)\b" + text = re.sub(cr_pattern, replace_cr, text, flags=re.IGNORECASE) + # Handle Lakhs (L, Lakh, Lakhs, Lac, Lacs) def replace_lakh(match): try: - val = float(match.group(1).replace(",", "")) - return str(int(val * 100000)) - except ValueError: + val_str = re.sub(rf'[{OCR_NUMERIC_CHARS}]+', _ocr_fix, match.group(1)).replace(",", "") + val = float(val_str) + return f" {int(val * 100000)} " + except (ValueError, OverflowError): return match.group(0) - text = re.sub(r"([\d,]+(?:\.\d+)?)\s*(?:Lakhs?|Lacs?)", replace_lakh, text, flags=re.IGNORECASE) + lakh_pattern = rf"([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)\s*(?:L|Lakhs?|Lacs?)\b" + text = 
re.sub(lakh_pattern, replace_lakh, text, flags=re.IGNORECASE) + return text + def _extract_text_from_pdf(file_bytes: bytes) -> str: text_chunks: List[str] = [] total_chars = 0 @@ -366,8 +433,18 @@ def _extract_text_from_pdf(file_bytes: bytes) -> str: total_chars += len(page_text) # Fallback trigger: If character count extracted per page is abnormally low (indicating scanned image) - if num_pages > 0 and (total_chars / num_pages) < 100: - print("Scanned document detected (low char count). Engaging hybrid Tesseract OCR fallback...") + avg_chars = total_chars / num_pages if num_pages > 0 else 0 + is_scanned = avg_chars < 250 + + # Additional gibberish check: high ratio of non-alphanumeric chars in text-layer + if not is_scanned and total_chars > 200: + full_text_sample = "\n".join(text_chunks) + alphanumeric_count = len(re.findall(r'[A-Za-z0-9]', full_text_sample)) + if (alphanumeric_count / total_chars) < 0.4: + is_scanned = True + + if is_scanned: + print(f"Scanned or noisy document detected (avg {avg_chars:.1f} chars/pg). 
Engaging Tesseract...") text_chunks = [] try: images = convert_from_bytes(file_bytes) @@ -410,10 +487,11 @@ def _extract_fields_from_text(text: str) -> Dict[str, Any]: extracted[field] = _safe_float(match.group(1), None) break - limit_match = re.search(r"(?:limit|facility)\s*(?:of)?\s*(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", cleaned_text, re.IGNORECASE) - interest_match = re.search(r"(?:interest\s+rate|roi)[:\s]+([\d.]+)", cleaned_text, re.IGNORECASE) - tenor_match = re.search(r"(?:tenor|repayment\s+period)[:\s]+([\d]+)", cleaned_text, re.IGNORECASE) - installment_match = re.search(r"(?:emi|installment)[:\s]+(?:rs\.?|inr)?\s*([\d,]+(?:\.\d+)?)", cleaned_text, re.IGNORECASE) + limit_match = re.search(rf"(?:limit|facility)\s*(?:of)?\s*(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", cleaned_text, re.IGNORECASE) + interest_match = re.search(rf"(?:interest\s+rate|roi)[:\s]+([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", cleaned_text, re.IGNORECASE) + tenor_match = re.search(rf"(?:tenor|repayment\s+period)[:\s]+([{OCR_NUMERIC_CHARS}]+)", cleaned_text, re.IGNORECASE) + installment_match = re.search(rf"(?:emi|installment)[:\s]+(?:rs\.?|inr)?\s*([{OCR_NUMERIC_CHARS}]+(?:\.[{OCR_NUMERIC_CHARS}]+)?)", cleaned_text, re.IGNORECASE) + lines = [line.strip() for line in cleaned_text.splitlines() if line.strip()] extracted["document_type"] = _detect_document_type(cleaned_text) @@ -757,6 +835,11 @@ def parse_financial_pdf(file_bytes: bytes) -> Dict[str, Any]: merged.setdefault("extraction_warnings", []).append("revenue_not_found") if not merged.get("sanction_terms", {}).get("amortization_schedule_available"): merged.setdefault("extraction_warnings", []).append("amortization_schedule_missing") + + # Final data integrity verification + logic_warnings = _verify_financial_logic(merged) + merged.setdefault("extraction_warnings", []).extend(logic_warnings) + return merged @@ -856,12 +939,12 @@ def _parse_statement_dataframe(df: pd.DataFrame) -> Dict[str, 
Any]: working = df.copy() working.columns = [str(column).strip() for column in working.columns] - date_col = _find_column(working.columns, ["date", "txn date", "transaction date", "value date"]) - desc_col = _find_column(working.columns, ["description", "narration", "remarks", "particular", "details"]) - credit_col = _find_column(working.columns, ["credit", "deposit", "cr amount"]) - debit_col = _find_column(working.columns, ["debit", "withdrawal", "dr amount"]) - amount_col = _find_column(working.columns, ["amount", "transaction amount"]) - balance_col = _find_column(working.columns, ["balance", "closing bal", "available balance"]) + date_col = _find_column(working.columns, ["date", "txn date", "transaction date", "value date", "entry date"]) + desc_col = _find_column(working.columns, ["description", "narration", "remarks", "particular", "details", "transaction details"]) + credit_col = _find_column(working.columns, ["credit", "deposit", "cr amount", "inward", "receipts", "amount(cr)"]) + debit_col = _find_column(working.columns, ["debit", "withdrawal", "dr amount", "outward", "payments", "amount(dr)"]) + amount_col = _find_column(working.columns, ["amount", "transaction amount", "txn amount", "net amount"]) + balance_col = _find_column(working.columns, ["balance", "closing bal", "available balance", "bal amt", "running balance"]) if not any([credit_col, debit_col, amount_col]): raise ValueError("Bank statement CSV does not contain amount columns") diff --git a/backend/requirements.txt b/backend/requirements.txt index c7fe536..bb2f2fd 100644 --- a/backend/requirements.txt +++ b/backend/requirements.txt @@ -1,4 +1,4 @@ -ο»Ώfastapi +fastapi uvicorn python-multipart pdfplumber @@ -33,3 +33,12 @@ faiss-cpu>=1.8.0 cachetools>=5.3.3 asyncpg>=0.29.0 alembic>=1.13.0 +pydantic-settings>=2.0.0 +camel-ai[all] +networkx +mem0ai[graph] +python-docx +lxml +simpleeval +groq + diff --git a/backend/routers/analyze.py b/backend/routers/analyze.py index e84324f..fe459ec 100644 --- 
a/backend/routers/analyze.py +++ b/backend/routers/analyze.py @@ -91,6 +91,9 @@ class CustomerDetails(BaseModel): id: str = "" industry: str = "Manufacturing" constitution: str = "" + pan: str = "" + cin: str = "" + gstin: str = "" class FinancialDetails(BaseModel): @@ -196,8 +199,8 @@ async def upload_document( ): """Upload and parse a document, returning a temporary analysis_id.""" filename = getattr(file, "filename", "") or "" - if not filename.lower().endswith((".pdf", ".csv")): - raise HTTPException(status_code=400, detail={"error": "Unsupported file type. Strictly .pdf and .csv are allowed."}) + if not filename.lower().endswith((".pdf", ".csv", ".json")): + raise HTTPException(status_code=400, detail={"error": "Unsupported file type. Strictly .pdf, .csv, and .json are allowed."}) content = await file.read() if len(content) > 50 * 1024 * 1024: @@ -269,14 +272,12 @@ async def run_full_analysis( # REAL EXTERNAL API INTEGRATION # ---------------------------------------------------- aggregator = ExternalDataAggregator() - # In a real app, gstin/cin/pan would be populated from the LOS/request payload. - # Using dummy/placeholder identifiers if not provided by the frontend. 
ext_data = await aggregator.aggregate_borrower_facts( company_name=req.customer.name, company_id=req.customer.id, - gstin=f"27{req.customer.id}1Z5"[:15], # Fake GSTIN based on ID for demo - cin=f"U74999MH2023PTC{req.customer.id}"[:21], # Fake CIN based on ID - pan=f"ABCDE{req.customer.id}F"[:10] # Fake PAN based on ID + gstin=req.customer.gstin, + cin=req.customer.cin, + pan=req.customer.pan ) # Merge the rigorous unified BorrowerFact into the feature set for the decision engine diff --git a/backend/routers/portfolio.py b/backend/routers/portfolio.py index 16f2494..cda1c66 100644 --- a/backend/routers/portfolio.py +++ b/backend/routers/portfolio.py @@ -24,12 +24,17 @@ async def initialize_search_engine(): # Note: In a real clustered production environment, you'd trigger this # via a dedicated worker or message queue to avoid slowing down API startup. # For local/demo, we initialize here. - db_gen = get_db() - db = next(db_gen) - try: - await search_engine_instance.synchronize_index(db) - finally: - db.close() + import asyncio + + async def run_sync(): + db_gen = get_db() + db = next(db_gen) + try: + await search_engine_instance.synchronize_index(db) + finally: + db.close() + + asyncio.create_task(run_sync()) @router.get("/portfolio/search", response_model=List[SearchResult]) async def search_portfolio( diff --git a/backend/routers/studio.py b/backend/routers/studio.py index fe3b6bf..1a88cf6 100644 --- a/backend/routers/studio.py +++ b/backend/routers/studio.py @@ -1,4 +1,4 @@ -ο»Ώfrom __future__ import annotations +from __future__ import annotations import asyncio from datetime import datetime, timezone @@ -11,27 +11,17 @@ from services.event_bus import TERMINAL_EVENT_TYPES, execution_event_broker from services.workflow_engine import WorkflowEngine, WorkflowEngineError -try: - from database import SessionLocal - from db_models import ( - DeadLetterExecution, - ExecutionRun, - NodeExecutionLog, - WorkflowDefinition, - WorkflowEdgeDefinition, - 
WorkflowNodeDefinition, - ) - - DB_AVAILABLE = True -except ImportError: - SessionLocal = None - WorkflowDefinition = None - WorkflowNodeDefinition = None - WorkflowEdgeDefinition = None - ExecutionRun = None - NodeExecutionLog = None - DeadLetterExecution = None - DB_AVAILABLE = False +from async_database import AsyncSessionLocal as async_session_maker +from async_models import ( + DeadLetterExecution, + ExecutionRun, + NodeExecutionLog, + WorkflowDefinition, + WorkflowEdgeDefinition, + WorkflowNodeDefinition, +) + +DB_AVAILABLE = True router = APIRouter() diff --git a/backend/services/execution_engine.py b/backend/services/execution_engine.py index a75d679..5b3fbe7 100644 --- a/backend/services/execution_engine.py +++ b/backend/services/execution_engine.py @@ -146,6 +146,36 @@ async def process_condition_node(node: Dict[str, Any], context: ExecutionContext expression = node.get("data", {}).get("expression", "True") + import re + def replacer(match): + path = match.group(1).strip() + + # Determine the root dictionary to search + if path.startswith("nodes."): + val = local_vars.get("context", {}) + path_keys = path[6:].split(".") + elif path.startswith("input."): + val = local_vars.get("payload", {}) + path_keys = path[6:].split(".") + else: + return match.group(0) + + # Traverse the path + for key in path_keys: + if isinstance(val, dict): + val = val.get(key) + else: + return "None" + + # Format the extracted value for simple_eval + if isinstance(val, str): + return f"'{val}'" + return str(val) if val is not None else "None" + + # Replace all {{ path }} occurrences + expression = re.sub(r'\{\{(.*?)\}\}', replacer, expression) + + try: # Secure AST AST parsing utilizing simpleeval # Evaluates safely strictly without access to arbitrary imports (no RCE) diff --git a/backend/services/search_engine.py b/backend/services/search_engine.py index edd27d4..5a5664d 100644 --- a/backend/services/search_engine.py +++ b/backend/services/search_engine.py @@ -30,17 +30,13 @@ def 
__init__(self): return # 1. Semantic Search Components (Dense Vectors) - # Using a lightweight, fast sentence transformer model - if SentenceTransformer: - self.embedder = SentenceTransformer('all-MiniLM-L6-v2') - self.embedding_dim = self.embedder.get_sentence_embedding_dimension() - # FAISS Index for Inner Product (Cosine Similarity if vectors are normalized) - self.index = faiss.IndexFlatIP(self.embedding_dim) + self.embedder = None + self.embedding_dim = None + self.index = None # 2. Keyword Search Components (Sparse Vectors) - if TfidfVectorizer: - self.tfidf = TfidfVectorizer(stop_words='english', lowercase=True) - self.tfidf_matrix = None + self.tfidf = None + self.tfidf_matrix = None # 3. ID Mapping self.record_ids: List[str] = [] @@ -55,6 +51,14 @@ async def synchronize_index(self, db: Session): print("Warning: Search dependencies missing. Engine won't initialize.") return + if not self.embedder and SentenceTransformer: + self.embedder = SentenceTransformer('all-MiniLM-L6-v2') + self.embedding_dim = self.embedder.get_sentence_embedding_dimension() + self.index = faiss.IndexFlatIP(self.embedding_dim) + + if not self.tfidf and TfidfVectorizer: + self.tfidf = TfidfVectorizer(stop_words='english', lowercase=True) + records = db.query(CreditRecord).all() if not records: self.is_ready = True @@ -85,15 +89,15 @@ async def synchronize_index(self, db: Session): } # Build Semantic Index - # Run blocking embedding generation in threadpool - embeddings = await asyncio.to_thread(self.embedder.encode, texts, convert_to_numpy=True) - # Normalize vectors for cosine similarity in FAISS - faiss.normalize_L2(embeddings) - self.index.reset() - self.index.add(embeddings) + if self.embedder: + embeddings = await asyncio.to_thread(self.embedder.encode, texts, convert_to_numpy=True) + faiss.normalize_L2(embeddings) + self.index.reset() + self.index.add(embeddings) # Build Keyword Index - self.tfidf_matrix = await asyncio.to_thread(self.tfidf.fit_transform, texts) + if self.tfidf: 
+ self.tfidf_matrix = await asyncio.to_thread(self.tfidf.fit_transform, texts) self.is_ready = True print(f"Hybrid Search Engine ready with {len(self.record_ids)} records.") diff --git a/backend/test_ocr_fixes.py b/backend/test_ocr_fixes.py new file mode 100644 index 0000000..7ab53f9 --- /dev/null +++ b/backend/test_ocr_fixes.py @@ -0,0 +1,52 @@ +import sys +import os +import re + +# Add the project root to sys.path +sys.path.append(os.path.dirname(os.path.abspath(__file__))) + +try: + import numpy as np +except ImportError: + np = None +try: + import pandas as pd +except ImportError: + pd = None + +from modules.ingestion import _safe_float, _normalize_indian_financials + +def test_ocr_fixes(): + print("Testing _safe_float with OCR errors...") + test_cases = [ + ("5O,00", 5000.0), + ("S,l23", 5123.0), + ("(IOO)", -100.0), + ("RS. l,2SO.OO", 1250.0), + ("INR S00/-", 500.0), + ("l.2S %", 1.25), + ("O.OO", 0.0), + ("SOO", 500.0), # Entirely misread + ] + + for input_val, expected in test_cases: + result = _safe_float(input_val) + status = "PASS" if result == expected else f"FAIL (got {result})" + print(f" '{input_val}' -> {result} | {status}") + + print("\nTesting _normalize_indian_financials with OCR errors...") + unit_test_cases = [ + ("Revenue: l.2S Cr", "Revenue: 12500000"), + ("Profit: SO Lakhs", "Profit: 5000000"), + ("Limit: lO Cr", "Limit: 100000000"), + ("Net Worth: S.S Cr", "Net Worth: 55000000"), # Changed Gr to Cr for simpler check or just test Cr + ("Net Worth: S.S Gr", "Net Worth: 55000000"), # Testing Gr as Cr error + ] + + for input_val, expected in unit_test_cases: + result = _normalize_indian_financials(input_val).strip() + status = "PASS" if result == expected else f"FAIL (got {result})" + print(f" '{input_val}' -> '{result}' | {status}") + +if __name__ == "__main__": + test_ocr_fixes() diff --git a/backend/test_output.txt b/backend/test_output.txt new file mode 100644 index 0000000..1812c21 Binary files /dev/null and b/backend/test_output.txt 
differ diff --git a/frontend/src/app/page.js b/frontend/src/app/page.js index 02b6fda..720d1fc 100644 --- a/frontend/src/app/page.js +++ b/frontend/src/app/page.js @@ -55,7 +55,7 @@ export default function Workspace() { // --- DROPZONE FOR UPLOAD --- const { getRootProps, getInputProps, isDragActive } = useDropzone({ - accept: { 'application/pdf': ['.pdf'], 'text/csv': ['.csv'] }, + accept: { 'application/pdf': ['.pdf'], 'text/csv': ['.csv'], 'application/json': ['.json'] }, onDrop: async (acceptedFiles) => { if (!acceptedFiles.length) return; setIsUploading(true); @@ -65,7 +65,7 @@ export default function Workspace() { const file = acceptedFiles[0]; const formData = new FormData(); formData.append("file", file); - formData.append("doc_type", file.type.includes("pdf") ? "financial_pdf" : "bank_csv"); + formData.append("doc_type", file.type.includes("pdf") ? "financial_pdf" : file.type.includes("csv") ? "bank_csv" : "bureau_json"); if (analysisId) formData.append("analysis_id", analysisId); let currentToken = authToken; @@ -111,7 +111,7 @@ export default function Workspace() { // Construct the payload structure corresponding to AnalyzeRequest matching the backend const payload = { analysis_id: analysisId || "ana_" + Math.random().toString(36).substr(2, 9), - customer: { name: "Acme Corp Ltd.", id: "cust_123", industry: "Manufacturing", constitution: "Private Limited" }, + customer: { name: "Acme Corp Ltd.", id: "cust_123", industry: "Manufacturing", constitution: "Private Limited", pan: "ABCDE1234F", cin: "U74999MH2023PTC123456", gstin: "27ABCDE1234F1Z5" }, financials: { operating_income: 5000000, non_operating_income: 0, short_term_liab: 1000000, long_term_liab: 2000000, contingent_liab: 0, internal_rating: "BBB", external_rating: "BB+", bureau_score: 750, current_assets: 3000000, fixed_assets: 4000000, intangible_assets: 0 }, facility: { amount: 2000000, currency: "INR", purpose: "Working Capital", term_months: 24, repayment_method: "EMI" }, collateral_list: [{ 
type: "Real Estate", value: 2500000 }], @@ -138,6 +138,9 @@ export default function Workspace() { if (!res.ok) throw new Error("Analysis failed"); const data = await res.json(); setAnalysisResult(data); + if (data.features_used && data.features_used.monthly_variance) { + setReconciliationData(data.features_used.monthly_variance); + } } catch (err) { console.error(err); // Fallback local state if backend is down @@ -166,8 +169,14 @@ export default function Workspace() { router.push(`/cam-terminal/${analysisResult.analysis_id}`); }; - // Mock Radar Data mapped from SHAP or defaults - const radarData = [ + // Dynamic Radar Data mapped from feature calculations + const radarData = analysisResult?.features_used ? [ + { subject: 'Character', A: Math.min((analysisResult.features_used.bureau_score / 900) * 100, 100) || 85, fullMark: 100 }, + { subject: 'Capacity', A: Math.min((analysisResult.features_used.dscr || 1.5) / 2 * 100, 100), fullMark: 100 }, + { subject: 'Capital', A: Math.max(100 - ((analysisResult.features_used.debt_to_equity || 1) * 20), 0), fullMark: 100 }, + { subject: 'Collateral', A: Math.min((analysisResult.features_used.collateral_coverage || 1) * 100, 100), fullMark: 100 }, + { subject: 'Conditions', A: 100 - ((analysisResult.features_used.industry_risk || 0.3) * 100), fullMark: 100 }, + ] : [ { subject: 'Character', A: 85, fullMark: 100 }, { subject: 'Capacity', A: 70, fullMark: 100 }, { subject: 'Capital', A: 90, fullMark: 100 }, diff --git a/frontend/src/app/profile/page.js b/frontend/src/app/profile/page.js new file mode 100644 index 0000000..5e60ad8 --- /dev/null +++ b/frontend/src/app/profile/page.js @@ -0,0 +1,141 @@ +"use client"; + +import React from "react"; +import { useAuth } from "@/context/AuthContext"; +import { + User, + Mail, + Shield, + Settings, + LogOut, + Bell, + Key, + Activity, + UserCheck +} from "lucide-react"; +import { Card, CardContent, CardHeader, CardTitle, CardDescription } from "@/components/ui/Card"; +import { Button } 
from "@/components/ui/Button"; +import { Badge } from "@/components/ui/Badge"; + +export default function ProfilePage() { + const { user, signOut } = useAuth(); + + if (!user) { + return ( +
+ +

Validating session...

+
+ ); + } + + return ( +
+ {/* Profile Header */} +
+
+ +
+ +
+
+ {user.email ? user.email.substring(0, 2).toUpperCase() : "US"} +
+
+ +
+
+ +
+

+ User Profile +

+
+
+ + {user.email} +
+ + Enterprise Admin + +
+
+ +
+ +
+
+ +
+ {/* Security Summary */} + + + + + Security & Authentication + + Manage your security credentials and session settings. + + +
+
+
+ +
+
+

Two-Factor Authentication

+

Enabled via Authenticator App

+
+
+ +
+ +
+
+
+ +
+
+

Email Notifications

+

Critical policy & risk alerts

+
+
+ +
+
+
+ + {/* Quick Stats */} + + + + + Platform Activity + + + +
+

Last Login

+

March 27, 2026 • 09:42 AM

+
+
+

Analyses Run

+

14 Total

+
+
+

Portfolio Access

+

Read/Write/Execute

+
+
+ +
+
+
+
+
+ ); +} diff --git a/frontend/src/app/studio/page.js b/frontend/src/app/studio/page.js index 266da2e..5ab6196 100644 --- a/frontend/src/app/studio/page.js +++ b/frontend/src/app/studio/page.js @@ -1,4 +1,4 @@ -ο»Ώ"use client"; +"use client"; import React, { startTransition, @@ -33,6 +33,7 @@ import { TriggerNode, MCAFilingSyncNode, EPFOAnomalyNode, + GSTReconciliationNode, } from '@/components/studio/nodes'; import { buildStudioWebSocketUrl, @@ -49,6 +50,7 @@ const nodeTypes = { explainableAINode: ExplainableAINode, mcaFilingSyncNode: MCAFilingSyncNode, epfoAnomalyNode: EPFOAnomalyNode, + gstReconciliationNode: GSTReconciliationNode, }; const createNodeId = (type) => `${type}-${(typeof crypto !== 'undefined' && crypto.randomUUID ? crypto.randomUUID() : Math.random().toString(36).slice(2, 11)).slice(0, 8)}`; @@ -322,7 +324,7 @@ export default function DecisionStudio() { // Show error in execution panel applyExecutionEvent({ - type: 'execution.error', + type: 'execution.failed', level: 'ERROR', message: `${errorTitle}: ${errorMessage}`, timestamp: new Date().toISOString() diff --git a/frontend/src/components/studio/nodes.js b/frontend/src/components/studio/nodes.js index 3f51f72..59cad5e 100644 --- a/frontend/src/components/studio/nodes.js +++ b/frontend/src/components/studio/nodes.js @@ -1,4 +1,4 @@ -ο»Ώimport React, { memo } from 'react'; +import React, { memo } from 'react'; import { Handle, Position } from 'reactflow'; import { AlertTriangle, @@ -355,6 +355,34 @@ export const EPFOAnomalyNode = memo(({ data, selected }) => { ); }); +export const GSTReconciliationNode = memo(({ data, selected }) => { + const runtime = getRuntime(data); + const tone = runtimeTone[runtime.status] || runtimeTone.idle; + + return ( +
+ +
+ +
+
+
+ +
+
+ Data Integrity + {data.label || 'GST Reconciliation'} +
+
+
+ + + + +
+ ); +}); + TriggerNode.displayName = 'TriggerNode'; DocumentClassificationNode.displayName = 'DocumentClassificationNode'; IntegrationNode.displayName = 'IntegrationNode'; @@ -362,4 +390,5 @@ ConditionNode.displayName = 'ConditionNode'; ExplainableAINode.displayName = 'ExplainableAINode'; MCAFilingSyncNode.displayName = 'MCAFilingSyncNode'; EPFOAnomalyNode.displayName = 'EPFOAnomalyNode'; +GSTReconciliationNode.displayName = 'GSTReconciliationNode';
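Review note: the `{{ path }}` substitution added to `process_condition_node` in `backend/services/execution_engine.py` is easier to reason about in isolation. The following is a minimal standalone sketch of that step, not the code in this diff. The names `render_placeholders` and `PLACEHOLDER` are hypothetical, and the two root dictionaries are passed in directly, whereas the diff reads them from `local_vars`. One deliberate deviation: strings are quoted with `repr()` rather than wrapped in single quotes, on the assumption that values may themselves contain quote characters.

```python
import re
from typing import Any, Dict

# Matches {{ some.path }} placeholders; same pattern as the one used in the diff.
PLACEHOLDER = re.compile(r"\{\{(.*?)\}\}")


def render_placeholders(
    expression: str,
    context: Dict[str, Any],
    payload: Dict[str, Any],
) -> str:
    """Replace {{ nodes.* }} and {{ input.* }} placeholders with Python literals.

    Hypothetical helper: `context` stands in for the node outputs and `payload`
    for the trigger input that the diff looks up in `local_vars`.
    """

    def replacer(match: "re.Match[str]") -> str:
        path = match.group(1).strip()

        # Pick the root dictionary from the path prefix.
        if path.startswith("nodes."):
            val: Any = context
            keys = path[len("nodes."):].split(".")
        elif path.startswith("input."):
            val = payload
            keys = path[len("input."):].split(".")
        else:
            # Unknown root: leave the placeholder untouched.
            return match.group(0)

        # Walk the dotted path; a non-dict on the way resolves to None.
        for key in keys:
            if isinstance(val, dict):
                val = val.get(key)
            else:
                return "None"

        # Emit a literal the expression evaluator can parse.
        if isinstance(val, str):
            return repr(val)  # deviation from the diff: escapes embedded quotes
        return str(val) if val is not None else "None"

    return PLACEHOLDER.sub(replacer, expression)


if __name__ == "__main__":
    rendered = render_placeholders(
        "{{ input.bureau_score }} > 700 and {{ nodes.gst.status }} == 'ok'",
        context={"gst": {"status": "ok"}},
        payload={"bureau_score": 750},
    )
    print(rendered)
```

Keeping the substitution in a pure function like this would make it possible to unit test missing keys and unknown roots without running the whole workflow engine.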