Complete MLOps Pipeline for Chest X-ray Abnormality Detection System on Cloud

An intelligent radiological assistance system that combines YOLOv11-L object detection with comprehensive MLOps infrastructure to help radiologists identify and localize chest X-ray abnormalities with enhanced accuracy and efficiency.

🎯 Overview

This system provides AI-assisted chest X-ray analysis for radiology departments, offering both pathology classification and precise localization through bounding boxes. Built with the VinDr-CXR dataset and validated by multiple radiologists, the system enhances diagnostic workflow while maintaining radiologists as the ultimate decision-makers.

Key Benefits

  • Reduced Missed Pathologies: Identifies and highlights potential abnormalities that might be overlooked during routine reads
  • Improved Efficiency: Pre-highlights regions of concern, allowing radiologists to focus attention on suspicious areas
  • Second Opinion Support: Provides automated consultation that can confirm findings or prompt reconsideration
  • Scalable Infrastructure: Cloud-native architecture with automated CI/CD pipelines

πŸ—οΈ Architecture

The system follows a microservices architecture with the following components:

┌───────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Data Pipeline   │    │  Model Training │    │  Model Serving  │
│                   │    │                 │    │                 │
│ • ETL Processing  │    │ • YOLOv11-L     │    │ • Triton Server │
│ • Data Validation │    │ • Distributed   │    │ • ONNX Runtime  │
│ • Quality Checks  │    │   Training      │    │ • TensorRT      │
└───────────────────┘    └─────────────────┘    └─────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Monitoring    │    │  Orchestration  │    │   Storage       │
│                 │    │                 │    │                 │
│ • Prometheus    │    │ • Kubernetes    │    │ • MinIO         │
│ • Grafana       │    │ • ArgoCD        │    │ • PostgreSQL    │
│ • Alert Manager │    │ • Argo Workflows│    │ • Swift Storage │
└─────────────────┘    └─────────────────┘    └─────────────────┘

🚀 Features

Model Capabilities

  • Multi-class Detection: Identifies 14+ chest abnormalities including pneumothorax, consolidation, and pleural effusion
  • Precise Localization: Provides bounding boxes for abnormality regions
  • Confidence Scoring: Outputs confidence levels for each detection
  • High Performance: Optimized for both accuracy and inference speed

Infrastructure Features

  • Distributed Training: Multi-GPU training with Ray for faster model development
  • Model Versioning: MLflow integration for experiment tracking and model registry
  • Automated Deployment: Deployment with staging, canary, and production environments
  • Real-time Monitoring: Comprehensive monitoring of model performance and system health
  • Data Pipeline: Automated ETL processes with data quality validation

Optimization Features

  • Multiple Serving Options: CPU, GPU, and edge deployment configurations
  • Model Optimization: ONNX conversion, quantization, and TensorRT acceleration
  • Dynamic Batching: Optimized throughput for varying loads
  • Fault Tolerance: Automatic recovery and failover mechanisms
  • Advanced Performance Tuning: Comprehensive Triton server configurations for maximum efficiency

📊 Performance Metrics

Metric                       Target    Achieved
Inference Latency (Single)   <200ms    ~150ms
Batch Throughput             30 FPS    45+ FPS
Model Accuracy (mAP@0.5)     >0.75     0.82
Concurrent Requests          10+       25+
Uptime                       99.9%     99.95%

🚀 Advanced Performance Optimizations

Performance experiments achieved the following results across multiple deployment scenarios:

GPU Configurations (Full Model)

  • Custom TensorRT V2: 157.8 infer/sec throughput with optimized memory allocation
  • Custom TensorRT V1: Ultra-low latency (19.3ms) with dynamic batching
  • Default TensorRT: 129.0 infer/sec for balanced performance
  • ONNX Runtime: 78-79 infer/sec with CUDA acceleration

CPU Configurations (Nano Model)

  • OpenVINO Nano: 14.6 infer/sec - 15x faster than standard CPU backends
  • Edge Deployment: Optimized for resource-constrained environments
  • Memory Efficiency: Reduced model size with compact output format

📈 Detailed performance analysis, configuration files, and optimization insights are available in the Triton Configuration Repository.

🛠️ Technology Stack

Core ML Stack

  • Model: YOLOv11-L (Ultralytics)
  • Training: PyTorch with distributed training via Ray
  • Serving: NVIDIA Triton Inference Server
  • Optimization: ONNX Runtime, TensorRT, OpenVINO

Infrastructure

  • Orchestration: Kubernetes, ArgoCD, Argo Workflows
  • Storage: MinIO (object), PostgreSQL (metadata), Swift (datasets)
  • Monitoring: Prometheus, Grafana, Alert Manager
  • MLOps: MLflow, Label Studio

DevOps

  • IaC: Terraform, Ansible
  • CI/CD: GitOps with ArgoCD
  • Containerization: Docker, Kubernetes
  • Cloud: Multi-cloud compatible (tested on Chameleon Cloud)

🚀 Quick Start

Prerequisites

  • Kubernetes cluster (1.24+)
  • NVIDIA GPUs (optional, for GPU acceleration)
  • 250GB+ storage for datasets
  • Docker and Terraform installed

Installation

  1. Clone the Repository

git clone https://github.com/your-org/chest-xray-detection.git
cd chest-xray-detection

  2. Set Up Infrastructure

# Install Terraform
mkdir -p ~/.local/bin
wget https://releases.hashicorp.com/terraform/1.10.5/terraform_1.10.5_linux_amd64.zip
unzip terraform_1.10.5_linux_amd64.zip
mv terraform ~/.local/bin

# Create cloud resources
cd iac/tf/kvm
export TF_VAR_suffix=your-project
export TF_VAR_key=your-ssh-key
terraform init
terraform apply -auto-approve

  3. Configure Kubernetes

# Set up Kubernetes cluster
cd ./iac/ansible
ansible-playbook -i inventory.yml pre_k8s/pre_k8s_configure.yml
cd k8s/kubespray
ansible-playbook -i ../inventory/mycluster --become --become-user=root ./cluster.yml
cd ../..
ansible-playbook -i inventory.yml post_k8s/post_k8s_configure.yml

  4. Deploy ML Platform

# Deploy core services
ansible-playbook -i inventory.yml argocd/argocd_add_platform.yml

# Set up environments
ansible-playbook -i inventory.yml argocd/workflow_build_init.yml
ansible-playbook -i inventory.yml argocd/argocd_add_staging.yml
ansible-playbook -i inventory.yml argocd/argocd_add_canary.yml
ansible-playbook -i inventory.yml argocd/argocd_add_prod.yml
ansible-playbook -i inventory.yml argocd/workflow_templates_apply.yml

Running the Data Pipeline

  1. Offline ETL Pipeline

cd deployment/docker
docker-compose -f docker-compose-etl.yaml up

  2. Online Data Simulation

cd ./data-pipeline/data-simulation
docker-compose -f docker-compose-data-simulation.yaml up -d

  3. Production Pipeline

cd ./deployment/docker
docker-compose -f docker-compose-production.yaml up

Model Training

# Start distributed training
python model_train/train_yolo.py --distributed --gpus 4

Model Serving

The system automatically deploys trained models through the GitOps pipeline. Access the API endpoints:

  • Production: http://your-ip/predict
  • Staging: http://your-ip:8081/predict
  • Canary: http://your-ip:8080/predict

📋 API Usage

Prediction Endpoint

# Single image prediction
curl -X POST "http://your-ip/predict" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@chest_xray.jpg"

# Response
{
  "predictions": [
    {
      "class": "pneumothorax",
      "confidence": 0.89,
      "bbox": [0.23, 0.15, 0.45, 0.32]
    }
  ],
  "inference_time": 0.145
}
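
The `bbox` values in the sample response appear to be normalized `[x_min, y_min, x_max, y_max]` coordinates; assuming that convention, a small helper converts them to pixel coordinates for overlaying boxes on the original image:

```python
def bbox_to_pixels(bbox, img_width, img_height):
    """Convert a normalized [x_min, y_min, x_max, y_max] box to integer pixels."""
    x0, y0, x1, y1 = bbox
    return (round(x0 * img_width), round(y0 * img_height),
            round(x1 * img_width), round(y1 * img_height))

# The pneumothorax box from the sample response on a 1024x1024 X-ray:
print(bbox_to_pixels([0.23, 0.15, 0.45, 0.32], 1024, 1024))
# → (236, 154, 461, 328)
```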

Batch Processing

import requests
import json

# Batch prediction
files = [('files', open(f'image_{i}.jpg', 'rb')) for i in range(5)]
response = requests.post('http://your-ip/predict/batch', files=files)
results = response.json()

📊 Monitoring and Observability

Access Dashboards

  • Grafana: http://your-ip:3000 (admin/admin)
  • MLflow: http://your-ip:8000
  • MinIO: http://your-ip:9001
  • Label Studio: http://your-ip:5000
  • Data Dashboard: http://your-ip:8501

Key Metrics Monitored

  • Model performance (accuracy, precision, recall)
  • Inference latency and throughput
  • System resource utilization
  • Data quality and drift detection
  • Error rates and alert conditions

🔍 Advanced Monitoring & Data Drift Detection

Production Monitoring Features

  • Real-time Model Performance Tracking: Continuous accuracy monitoring with automated alerts
  • Data Drift Detection: Advanced statistical analysis to detect distribution shifts in incoming data
  • Concept Drift Monitoring: Tracks changes in the relationship between features and predictions
  • Model Decay Detection: Identifies when model performance degrades over time
  • A/B Testing Framework: Compare model versions in production environments

Multi-Branch Monitoring Architecture

Our monitoring system supports multiple deployment scenarios with specialized branches:

  • Production Branch: Full-scale monitoring with comprehensive alerting
  • Staging Branch: Pre-production validation with synthetic data testing
  • Canary Branch: Gradual rollout monitoring with performance comparison
  • Data Drift Branch: Specialized monitoring for data quality and distribution analysis

🔬 Additional serving and monitoring configuration is available in the Serving Monitoring Repository.

🔄 Continuous Integration/Deployment

The system implements automated pipelines:

  1. Code Changes → Git repository
  2. ArgoCD Detection → Automatic sync
  3. Staging Deployment → Automated testing
  4. Canary Release → Gradual traffic shift
  5. Production Rollout → Full deployment
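
The canary step amounts to a comparison gate between the canary and the current production deployment. A hypothetical sketch of that decision logic (the real rollout is orchestrated by ArgoCD; the thresholds below are illustrative, with the latency budget taken from the <200ms target):

```python
def should_promote(canary_error_rate, prod_error_rate,
                   canary_p99_latency_ms, latency_budget_ms=200,
                   tolerance=0.01):
    """Promote the canary only if its error rate is within `tolerance`
    of production AND its p99 latency stays inside the latency budget."""
    return (canary_error_rate <= prod_error_rate + tolerance
            and canary_p99_latency_ms <= latency_budget_ms)
```

If the gate fails, traffic is shifted back to the production version instead of continuing the rollout.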

Model Lifecycle

  1. Training: Distributed training with hyperparameter optimization
  2. Validation: Automated testing on held-out datasets
  3. Staging: Deployment to staging environment for integration testing
  4. Canary: Limited production traffic for performance validation
  5. Production: Full deployment with monitoring
  6. Retraining: Automatic retraining based on performance degradation

🔧 Configuration

Environment Variables

# Model serving configuration
MODEL_PATH=/models/yolov11l.onnx
BATCH_SIZE=8
MAX_WORKERS=4
DEVICE=cuda  # or cpu

# Monitoring configuration
PROMETHEUS_PORT=9090
GRAFANA_PORT=3000
ALERT_WEBHOOK_URL=your-webhook-url

# Storage configuration
MINIO_ACCESS_KEY=your-access-key
MINIO_SECRET_KEY=your-secret-key
SWIFT_CONTAINER=your-container
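
A serving process might read these variables roughly as follows; the variable names match the block above, while the helper itself and its fallback defaults are illustrative:

```python
import os

def load_serving_config(env=os.environ):
    """Read model-serving settings from the environment with fallbacks."""
    return {
        "model_path": env.get("MODEL_PATH", "/models/yolov11l.onnx"),
        "batch_size": int(env.get("BATCH_SIZE", "8")),
        "max_workers": int(env.get("MAX_WORKERS", "4")),
        "device": env.get("DEVICE", "cuda"),  # fall back to "cpu" on GPU-less hosts
    }
```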

Model Configuration

# model_config.yaml
model:
  name: yolov11l-chest-xray
  version: "1.0.0"
  input_size: [1024, 1024]
  classes: 14
  confidence_threshold: 0.25
  iou_threshold: 0.45

serving:
  max_batch_size: 16
  preferred_batch_size: 8
  max_queue_delay_microseconds: 100
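
The `iou_threshold: 0.45` above governs non-maximum suppression: two detections whose intersection-over-union exceeds that value are treated as duplicates and only the higher-confidence one is kept. For reference, IoU over `[x_min, y_min, x_max, y_max]` boxes is:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two [x_min, y_min, x_max, y_max] boxes."""
    ix0, iy0 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix1, iy1 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    if inter == 0.0:
        return 0.0  # disjoint boxes
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```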

🧪 Testing

Unit Tests

pytest tests/unit/ -v

Integration Tests

pytest tests/integration/ -v

Load Testing

# Using locust
locust -f tests/load/test_api.py --host=http://your-ip

Model Validation

python tests/model/validate_model.py --model-path /path/to/model

📈 Performance Optimization

Model Optimizations

  1. ONNX Conversion: Reduces model size and improves portability
  2. Quantization: INT8 quantization for CPU deployment
  3. TensorRT: GPU acceleration for NVIDIA hardware
  4. Pruning: Removes unnecessary model parameters

System Optimizations

  1. Dynamic Batching: Automatically batches requests for improved throughput
  2. Multi-instance Serving: Parallel model instances for concurrent processing
  3. Caching: Redis-based result caching for repeat queries
  4. Load Balancing: Distributes requests across multiple instances
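
For the result caching above, repeat queries can be recognized by hashing the image bytes together with the model version. A minimal sketch (the cache appears as a plain dict here; the deployed system would use Redis instead):

```python
import hashlib

def cache_key(image_bytes, model_version="1.0.0"):
    """Deterministic key: same image + same model version => same cached result."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    return f"predict:{model_version}:{digest}"

def predict_cached(image_bytes, cache, run_model):
    """Return the cached prediction, running inference only on a cache miss."""
    key = cache_key(image_bytes)
    if key not in cache:
        cache[key] = run_model(image_bytes)
    return cache[key]
```

Keying on the model version ensures that deploying a new model automatically invalidates stale predictions.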

🔒 Security and Compliance

  • Data Privacy: PHI data handling compliance (HIPAA considerations)
  • Access Control: Role-based access control (RBAC) for all services
  • Audit Logging: Comprehensive logging for all system interactions
  • Secure Communication: TLS encryption for all API endpoints
  • Vulnerability Scanning: Regular security scans of container images

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Setup

# Set up development environment
python -m venv venv
source venv/bin/activate
pip install -r requirements-dev.txt
pre-commit install

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • VinBigData for the chest X-ray dataset
  • Ultralytics for the YOLOv11 implementation
  • NVIDIA for Triton Inference Server
  • Chameleon Cloud for infrastructure support

🔗 Related Projects & Resources

Performance Optimization Research

  • YOLO-Triton-Configs Repository: Comprehensive Triton server configurations, performance benchmarks, and optimization insights for maximum inference efficiency across GPU and CPU deployments.

Advanced Monitoring & Production Systems

  • Serving Monitoring Repository: Production-ready monitoring solutions with advanced data drift detection, model performance tracking, and multi-branch deployment monitoring capabilities.

📞 Support

🗺️ Roadmap

  • Edge deployment optimization for mobile devices
  • Integration with PACS systems
  • Multi-language support for international deployment
  • Advanced explainability features (GradCAM, SHAP)
  • Federated learning capabilities
  • Integration with electronic health records (EHR)

⚠️ Medical Disclaimer: This system is designed to assist radiologists and should not be used as the sole basis for medical decisions. Always consult with qualified healthcare professionals for medical diagnosis and treatment.
