LampStack

Autonomous Healthcare Provider Data Validation Platform

Problem Statement

Healthcare provider data accuracy remains a critical challenge in the medical industry, with significant financial and operational impacts:

Financial Impact

$2 billion lost annually due to incorrect or outdated provider data
10-15% error rates in provider information across healthcare systems
An average of 2-3 days of manual validation time per provider record

Operational Challenges

Manual cross-referencing required across multiple databases (NPI Registry, State Medical Boards, Google Places API)
Inconsistent data formats between different authoritative sources
No standardized trust scoring or quality metrics
Difficulty in detecting duplicate or conflicting provider records
Poor scalability when validating thousands of providers
Lack of automated conflict resolution mechanisms

Compliance Risks

Regulatory compliance issues from outdated credentials
Provider credentialing delays impacting operations
Incomplete audit trails for data validation processes

Solution Overview

LampStack is an autonomous platform that validates healthcare provider data through a multi-agent AI architecture orchestrated by LangGraph. The system processes both structured (CSV) and unstructured (PDF, images) data formats to ensure comprehensive validation across multiple authoritative sources.

Core Capabilities

The platform implements a four-layer validation pipeline:

Data Ingestion - Automated scraping from NPI Registry, State Medical Board databases, and Google Places API
Cross-Validation - Systematic verification of provider credentials across all collected sources
Data Enrichment - Intelligent gap-filling for missing information and data normalization
Trust Scoring - Weighted calculation generating A-F grades with actionable recommendations

Key Differentiators

Processing speed: 30-45 seconds per provider (vs 2-3 days manual validation)
Multi-source verification: Simultaneous validation across 3+ authoritative databases
Real-time monitoring: WebSocket-based progress tracking and notifications
Human-in-the-loop: Feedback integration for continuous model improvement
Vector similarity search: Semantic matching for duplicate detection using Milvus

System Architecture

Methodology

Multi-Agent Architecture

The system employs specialized AI agents, each responsible for a distinct validation stage:

Ingestion Agent

Scrapes National Provider Identifier (NPI) Registry for federal provider data
Queries State Medical Board APIs for license verification
Retrieves contact information from Google Places API
Processes unstructured documents using Mistral AI OCR (Pixtral-12B model)
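
As a rough illustration of the NPI lookup step, the sketch below builds a request URL for the public NPI Registry API and interprets a response-shaped dictionary. The helper names `build_npi_url` and `is_npi_verified` are hypothetical, and the `result_count` field is an assumption about the response shape; this is not the project's actual ingestion code.

```python
from urllib.parse import urlencode

# Base URL taken from the agent-service .env example in this README.
NPI_REGISTRY_API_URL = "https://npiregistry.cms.hhs.gov/api"


def build_npi_url(npi: str, base_url: str = NPI_REGISTRY_API_URL) -> str:
    """Build a lookup URL for a single 10-digit NPI (hypothetical helper)."""
    if not (npi.isdigit() and len(npi) == 10):
        raise ValueError(f"NPI must be exactly 10 digits, got {npi!r}")
    query = urlencode({"version": "2.1", "number": npi})
    return f"{base_url}/?{query}"


def is_npi_verified(payload: dict) -> bool:
    """Treat an NPI as verified when the registry reports at least one result.

    Assumes the JSON response carries a `result_count` field.
    """
    return int(payload.get("result_count", 0)) > 0
```

The URL can then be fetched with any HTTP client; keeping URL construction separate from the network call makes the step easy to test offline.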

Validation Agent

Cross-references provider names across all data sources
Verifies license numbers and expiration dates
Validates contact information (phone, email, address)
Flags discrepancies and conflicts with severity ratings
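
The cross-referencing described above might look something like the following sketch, which compares each field across sources and flags disagreements. The `Discrepancy` type, the `find_discrepancies` helper, and the per-field severity mapping are all assumptions made for illustration; the README does not specify how severities are assigned.

```python
from dataclasses import dataclass


@dataclass
class Discrepancy:
    field: str
    values: dict   # source name -> value reported by that source
    severity: str


# Assumed severity per field; the real mapping is not documented here.
FIELD_SEVERITY = {"license_number": "high", "name": "medium", "phone": "low"}


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(value.lower().split())


def find_discrepancies(records: dict) -> list:
    """Flag every field whose non-empty values disagree between sources.

    `records` maps a source name to a dict of field -> value.
    """
    fields = sorted({f for rec in records.values() for f in rec})
    found = []
    for field in fields:
        values = {src: rec[field] for src, rec in records.items() if rec.get(field)}
        if len({normalize(v) for v in values.values()}) > 1:
            found.append(Discrepancy(field, values, FIELD_SEVERITY.get(field, "low")))
    return found
```

For example, two sources that report the same name with different capitalization produce no conflict, while two different phone numbers produce a single low-severity entry.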

Enrichment Agent

Fills missing fields using most reliable source data
Calculates data completeness percentage
Normalizes inconsistent data formats (addresses, phone numbers)
Generates vector embeddings for semantic search
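
A minimal sketch of the gap-filling, completeness, and normalization steps follows. The source priority order, the list of required fields, and the US-only phone format are assumptions chosen for the example, not values taken from the project.

```python
import re

# Assumed reliability order, most trusted first.
SOURCE_PRIORITY = ["npi_registry", "state_board", "google_places"]
# Illustrative set of required fields.
REQUIRED_FIELDS = ["name", "license_number", "phone", "address"]


def fill_missing(record: dict, sources: dict) -> dict:
    """Fill empty required fields from the most trusted source that has a value."""
    enriched = dict(record)
    for field in REQUIRED_FIELDS:
        if enriched.get(field):
            continue
        for source in SOURCE_PRIORITY:
            value = sources.get(source, {}).get(field)
            if value:
                enriched[field] = value
                break
    return enriched


def completeness(record: dict) -> float:
    """Percentage of required fields that are filled."""
    filled = sum(1 for f in REQUIRED_FIELDS if record.get(f))
    return filled / len(REQUIRED_FIELDS) * 100


def normalize_us_phone(raw: str) -> str:
    """Format a 10-digit US number as (XXX) XXX-XXXX; leave anything else untouched."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    if len(digits) != 10:
        return raw
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"
```

Returning the raw value for unrecognized phone formats keeps the original data available for manual review instead of silently discarding it.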

Scoring Agent

Applies weighted trust score algorithm
Assigns letter grades (A-F) based on validation results
Generates specific recommendations for data improvement
Updates PostgreSQL database with validation results

Orchestration Strategy

LangGraph manages the sequential execution of agents through a state machine:

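from langgraph.graph import StateGraph, END

# ValidationState and the four agent functions are assumed to be defined elsewhere.
workflow = StateGraph(ValidationState)
workflow.add_node("ingestion", ingestion_agent)
workflow.add_node("validation", validation_agent)
workflow.add_node("enrichment", enrichment_agent)
workflow.add_node("scoring", scoring_agent)

# Run the agents in a fixed order, starting at ingestion and ending after scoring.
workflow.set_entry_point("ingestion")
workflow.add_edge("ingestion", "validation")
workflow.add_edge("validation", "enrichment")
workflow.add_edge("enrichment", "scoring")
workflow.add_edge("scoring", END)

app = workflow.compile()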
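
In LangGraph, each node is a callable that receives the current state and returns a partial update, which the graph merges back into the state. The sketch below shows what that contract might look like for this pipeline using only the standard library. The state fields and stub agents are hypothetical, and `run_sequentially` is a stand-in for the compiled graph, not a LangGraph API.

```python
from typing import TypedDict


class ValidationState(TypedDict, total=False):
    # Hypothetical fields; the project's real state schema is not shown in this README.
    provider_id: str
    raw_sources: dict
    discrepancies: list
    trust_score: float


def ingestion_agent(state: ValidationState) -> dict:
    # A real agent would call external APIs here; this stub returns fixed data.
    return {"raw_sources": {"npi_registry": {"name": "Jane Doe"}}}


def validation_agent(state: ValidationState) -> dict:
    # Each node returns only the keys it wants to update.
    return {"discrepancies": []}


def run_sequentially(state: dict, nodes: list) -> dict:
    """Stand-in for the compiled graph: apply each node in order and merge its update."""
    for node in nodes:
        state = {**state, **node(state)}
    return state
```

Because each agent only returns the keys it owns, later agents can read everything earlier agents produced without any agent needing to know the full pipeline.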

Trust Score Calculation

Trust Score = (NPI_Match_Score × 0.4) + (License_Validity_Score × 0.4) + (Data_Completeness × 0.2)

Where:
- NPI_Match_Score: 100 if verified in NPI Registry, 0 otherwise
- License_Validity_Score: 100 if active, 50 if inactive, 0 if invalid/expired
- Data_Completeness: (Number_of_Filled_Fields / Total_Required_Fields) × 100

Grade Assignment:
A: 90-100 (Excellent - all sources verified)
B: 80-89  (Good - minor discrepancies)
C: 70-79  (Acceptable - some missing data)
D: 60-69  (Poor - significant gaps)
F: <60    (Failed - major conflicts or missing critical data)
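
The formula and grade bands above can be sketched in a few lines of Python. The function names and the `license_status` strings are assumptions for illustration; the weights and thresholds come directly from the definition above.

```python
def trust_score(npi_verified: bool, license_status: str, filled: int, required: int) -> float:
    """Weighted trust score on a 0-100 scale."""
    npi_match = 100.0 if npi_verified else 0.0
    # Any status other than active or inactive (for example invalid or expired) scores 0.
    license_validity = {"active": 100.0, "inactive": 50.0}.get(license_status, 0.0)
    completeness = filled / required * 100 if required else 0.0
    return npi_match * 0.4 + license_validity * 0.4 + completeness * 0.2


def grade(score: float) -> str:
    """Map a score to a letter grade using each band's lower bound."""
    for letter, floor in (("A", 90), ("B", 80), ("C", 70), ("D", 60)):
        if score >= floor:
            return letter
    return "F"
```

As a worked example, a provider verified in the NPI Registry with an inactive license and 8 of 10 required fields filled scores 40 + 20 + 16 = 76, which is a C. Treating each band's lower bound as the threshold also gives fractional scores such as 89.5 a well-defined grade (B).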

Data Flow

User uploads CSV or PDF containing provider data via React frontend
Java backend parses the file and stores records in PostgreSQL
Backend creates validation job and triggers Python agent service via HTTP
Python agents execute in sequence (LangGraph orchestration):
- Ingestion Agent scrapes external APIs
- Validation Agent cross-checks data
- Enrichment Agent fills gaps
- Scoring Agent calculates trust score
Python service stores vector embeddings in Milvus for semantic search
Python service sends progress updates to Java backend via HTTP callbacks
Java backend broadcasts real-time updates to frontend via WebSocket
Frontend displays validation results and trust scores
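
The embeddings stored in Milvus support duplicate detection by vector similarity. Milvus performs this search at scale; the plain-Python sketch below only illustrates the underlying idea with cosine similarity. The `likely_duplicates` helper and the 0.95 threshold are assumptions, not project settings.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def likely_duplicates(embeddings: dict, threshold: float = 0.95) -> list:
    """Return pairs of record ids whose embeddings are at least `threshold` similar."""
    ids = sorted(embeddings)
    pairs = []
    for i, left in enumerate(ids):
        for right in ids[i + 1:]:
            if cosine_similarity(embeddings[left], embeddings[right]) >= threshold:
                pairs.append((left, right))
    return pairs
```

This pairwise comparison is quadratic in the number of records, which is the reason a vector database with an approximate nearest-neighbor index is used once the provider count grows.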

Installation

Prerequisites

Java Development Kit (JDK) 17 or higher
Python 3.11 or higher
Node.js 18 or higher
Docker and Docker Compose
Maven 3.8+ (or use included Maven wrapper)
Git

Step 1: Clone Repository

git clone https://github.com/CroWzblooD/LampStack.git
cd LampStack

Step 2: Start Infrastructure Services

docker-compose up -d

This starts the PostgreSQL, Milvus, etcd, and MinIO containers. Verify that the services are running:

docker-compose ps

Expected services:

postgres on port 5432
milvus-standalone on port 19530
etcd on port 2379
minio on ports 9000/9001
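
If a container shows as running but a later step cannot connect, a quick TCP check against the ports listed above can help narrow things down. This is a small optional sketch, not part of the repository; the helper names are hypothetical, and a successful connection only shows that something is listening on the port, not that the service is healthy.

```python
import socket

# Ports from the expected services list (MinIO's API port only).
EXPECTED_SERVICES = {"postgres": 5432, "milvus-standalone": 19530, "etcd": 2379, "minio": 9000}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost", services: dict = EXPECTED_SERVICES) -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in services.items()}
```

Calling `check_services()` returns a dictionary such as `{"postgres": True, ...}` that can be printed or asserted on before starting the backend.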

Step 3: Backend Setup

# From the repository root
cd server

# Build the project
./mvnw clean install

# Run the application
./mvnw spring-boot:run

The backend will start on http://localhost:8080

Step 4: Agent Service Setup

# From the repository root, in a new terminal
cd agent-service

# Create virtual environment
python -m venv venv

# Activate virtual environment
# Windows:
venv\Scripts\activate
# Linux/Mac:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run the service
uvicorn app.main:app --host 0.0.0.0 --port 8001 --reload

The agent service will start on http://localhost:8001

Step 5: Frontend Setup

# From the repository root, in a new terminal
cd client

# Install dependencies
npm install

# Start development server
npm run dev

The frontend will start on http://localhost:5173

Configuration

Backend Configuration

Create server/.env:

# Database
SPRING_DATASOURCE_URL=jdbc:postgresql://localhost:5432/healthcare_validation
SPRING_DATASOURCE_USERNAME=postgres
SPRING_DATASOURCE_PASSWORD=yourpassword

# Security
JWT_SECRET=your-256-bit-secret-key-here
JWT_EXPIRATION=86400000

# Agent Service
PYTHON_AGENT_SERVICE_URL=http://localhost:8001

# Server
SERVER_PORT=8080

Alternatively, edit server/src/main/resources/application.yml:

spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/healthcare_validation
    username: postgres
    password: yourpassword
  jpa:
    hibernate:
      ddl-auto: update
    show-sql: false

app:
  jwt:
    secret: your-256-bit-secret-key-here
    expiration: 86400000
  python:
    agent-service:
      url: http://localhost:8001

Agent Service Configuration

Create agent-service/.env:

# Java Backend
JAVA_SERVICE_URL=http://localhost:8080

# AI Services
MISTRAL_API_KEY=your-mistral-api-key-here

# Vector Database
MILVUS_HOST=localhost
MILVUS_PORT=19530

# PostgreSQL
DATABASE_URL=postgresql://postgres:yourpassword@localhost:5432/healthcare_validation

# External APIs
NPI_REGISTRY_API_URL=https://npiregistry.cms.hhs.gov/api
GOOGLE_PLACES_API_KEY=your-google-places-api-key-here
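
One way the agent service could read these values is sketched below. It assumes the variables are already present in the process environment (for example exported in the shell, or loaded from the .env file by a tool such as python-dotenv). The `AgentSettings` class, the `load_settings` helper, and the choice of which keys are required are hypothetical, not the project's actual configuration code.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSettings:
    java_service_url: str
    mistral_api_key: str
    milvus_host: str
    milvus_port: int
    database_url: str


def load_settings(env=os.environ) -> AgentSettings:
    """Read settings from a mapping, failing fast when a required key is missing."""
    missing = [key for key in ("MISTRAL_API_KEY", "DATABASE_URL") if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return AgentSettings(
        java_service_url=env.get("JAVA_SERVICE_URL", "http://localhost:8080"),
        mistral_api_key=env["MISTRAL_API_KEY"],
        milvus_host=env.get("MILVUS_HOST", "localhost"),
        milvus_port=int(env.get("MILVUS_PORT", "19530")),
        database_url=env["DATABASE_URL"],
    )
```

Failing at startup when a key is missing tends to be easier to debug than an authentication error surfacing midway through a validation job.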

Frontend Configuration

Create client/.env.local:

VITE_API_URL=http://localhost:8080/api
VITE_WS_URL=ws://localhost:8080/ws
