1 change: 1 addition & 0 deletions README.md
@@ -62,6 +62,7 @@ We hope that `AWML` promotes the community between Autoware and ML researchers a
## Get started

- [Start training for 3D object detection](/docs/tutorial/tutorial_detection_3d.md)
- [Start training for Calibration Status Classification](/docs/tutorial/tutorial_calibration_status_classification.md)

## Docs
### Design documents
352 changes: 352 additions & 0 deletions autoware_ml/deployment/README.md
@@ -0,0 +1,352 @@
# Autoware ML Deployment Framework

A unified, task-agnostic deployment framework for exporting, verifying, and evaluating machine learning models across different backends (ONNX, TensorRT) with comprehensive support for model validation and performance benchmarking.

## Table of Contents

- [Overview](#overview)
  - [Key Capabilities](#key-capabilities)
- [Current Support](#current-support)
- [Architecture](#architecture)
- [Quick Start](#quick-start)
- [Usage Guide](#usage-guide)
  - [Basic Export](#basic-export)
  - [Export with Verification](#export-with-verification)
  - [Export with Full Evaluation](#export-with-full-evaluation)
  - [Evaluation Only Mode](#evaluation-only-mode)
- [Configuration Reference](#configuration-reference)

## Overview

The Autoware ML Deployment Framework provides a standardized pipeline for deploying trained models to production-ready inference backends. It handles the complete deployment workflow from model export to validation and performance analysis, with a focus on ensuring model quality and correctness across different deployment targets.

### Key Capabilities

- **Multi-Backend Export**: Export models to ONNX and TensorRT formats
- **Precision Policy Support**: Flexible precision policies (FP32, FP16, TF32, INT8)
- **Automated Verification**: Cross-backend output validation to ensure correctness
- **Performance Benchmarking**: Comprehensive latency and throughput analysis
- **Full Evaluation**: Complete model evaluation with metrics and confusion matrices
- **Modular Design**: Easy to extend for new tasks and backends



## Current Support

### Detection 3D
* [ ] BEVFusion
* [ ] CenterPoint
* [ ] TransFusion
* [ ] StreamPETR

### Detection 2D
* [ ] YOLOX
* [ ] YOLOX_opt (Traffic Light Detection)
* [ ] FRNet
* [ ] GLIP (Grounded Language-Image Pre-training)

### Classification
* [X] CalibrationStatusClassification
* [ ] MobileNetv2 (Traffic Light Classification)

### Backbones & Components
* [ ] SwinTransformer
* [ ] ConvNeXt_PC (Point Cloud)
* [ ] SparseConvolution

### Multimodal
* [ ] BLIP-2 (Vision-Language Model)

> **Note**: Currently, only **CalibrationStatusClassification** has full deployment framework support. Other models may have custom deployment scripts in their respective project directories but are not yet integrated with the unified deployment framework.

### Supported Backends

| Backend | Export | Inference | Verification |
|---------|--------|-----------|--------------|
| **ONNX** | ✅ | ✅ | ✅ |
| **TensorRT** | ✅ | ✅ | ✅ |

## Architecture

The deployment framework follows a modular architecture:

```
autoware_ml/deployment/
├── core/                      # Core abstractions
│   ├── base_config.py         # Configuration management
│   ├── base_data_loader.py    # Data loading interface
│   ├── base_evaluator.py      # Evaluation interface
│   └── verification.py        # Cross-backend verification
├── backends/                  # Backend implementations
│   ├── pytorch_backend.py     # PyTorch inference
│   ├── onnx_backend.py        # ONNX Runtime inference
│   └── tensorrt_backend.py    # TensorRT inference
└── exporters/                 # Export implementations
    ├── onnx_exporter.py       # ONNX export
    └── tensorrt_exporter.py   # TensorRT export
```

### Design Principles

1. **Task-Agnostic Core**: Base classes are independent of specific tasks
2. **Backend Abstraction**: Unified interface across different inference backends (see the sketch below)
3. **Extensibility**: Easy to add new tasks, backends, or exporters
4. **Configuration-Driven**: All settings managed through Python config files
5. **Comprehensive Validation**: Built-in verification at every step
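
The backend abstraction (principle 2) can be pictured as a small shared interface. The sketch below is illustrative only: the `BaseBackend` name comes from `autoware_ml/deployment/backends`, but the `load`/`infer` method names are assumptions made for explanation, not the framework's actual API.

```python
# Illustrative sketch of the backend abstraction. The real BaseBackend
# lives in autoware_ml/deployment/backends; the method names here are
# assumptions for explanation only.
from abc import ABC, abstractmethod
from typing import Dict

import numpy as np


class BaseBackend(ABC):
    """Unified inference interface shared by the PyTorch, ONNX, and TensorRT backends."""

    @abstractmethod
    def load(self, model_path: str) -> None:
        """Load a model artifact (checkpoint, .onnx file, or TensorRT engine)."""

    @abstractmethod
    def infer(self, inputs: Dict[str, np.ndarray]) -> Dict[str, np.ndarray]:
        """Run one forward pass and return named output arrays."""
```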


## Quick Start

Here's a minimal example to export and verify a calibration classification model:

```bash
# Export to both ONNX and TensorRT with verification
python projects/CalibrationStatusClassification/deploy/main.py \
    deploy_config.py \
    model_config.py \
    checkpoint.pth \
    --work-dir work_dirs/deployment
```


## Usage Guide

### Basic Export

Export a model to ONNX format:

**1. Create deployment config** (`deploy_config_onnx.py`):

```python
export = dict(
    mode='onnx',                     # Export mode: 'onnx', 'trt', 'both', 'none'
    verify=False,                    # Skip verification
    device='cuda:0',                 # Device for export
    work_dir='work_dirs/deployment'
)

# Runtime I/O settings
runtime_io = dict(
    info_pkl='path/to/info.pkl',     # Dataset info file
    sample_idx=0                     # Sample index for export
)

# ONNX configuration
onnx_config = dict(
    opset_version=16,
    do_constant_folding=True,
    input_names=['input'],
    output_names=['output'],
    save_file='model.onnx',
    dynamic_axes={
        'input': {0: 'batch_size'},
        'output': {0: 'batch_size'}
    }
)

# Backend configuration
backend_config = dict(
    common_config=dict(
        precision_policy='auto',     # Options: 'auto', 'fp16', 'fp32_tf32', 'int8'
        max_workspace_size=1 << 30   # 1 GB for TensorRT
    )
)
```

**2. Run export**:

```bash
python projects/CalibrationStatusClassification/deploy/main.py \
    deploy_config_onnx.py \
    path/to/model_config.py \
    path/to/checkpoint.pth
```

### Export with Verification

Verify that exported models produce correct outputs:

**Update config**:

```python
export = dict(
    mode='both',       # Export to both ONNX and TensorRT
    verify=True,       # Enable verification
    device='cuda:0',
    work_dir='work_dirs/deployment'
)

# ... rest of config ...
```

**Run with verification**:

```bash
python projects/CalibrationStatusClassification/deploy/main.py \
    deploy_config_verify.py \
    path/to/model_config.py \
    path/to/checkpoint.pth
```
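
Conceptually, verification runs the same sample through each exported backend and compares the outputs numerically against the PyTorch reference; the framework handles this internally in `core/verification.py`. The standalone sketch below only illustrates the idea, with illustrative tolerances and an assumed single input named `input`:

```python
# Minimal sketch of cross-backend verification: run one sample through
# PyTorch and ONNX Runtime and compare outputs within a tolerance.
# The tolerances and single-input signature are illustrative assumptions.
import numpy as np
import onnxruntime as ort
import torch


def verify_onnx(model: torch.nn.Module, onnx_path: str, sample: np.ndarray) -> bool:
    with torch.no_grad():
        torch_out = model(torch.from_numpy(sample)).numpy()
    sess = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    (onnx_out,) = sess.run(None, {"input": sample})
    return np.allclose(torch_out, onnx_out, rtol=1e-3, atol=1e-5)
```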

### Export with Full Evaluation

Perform complete model evaluation on a validation dataset:

**Update config** (`deploy_config_eval.py`):

```python
export = dict(
    mode='both',
    verify=True,
    device='cuda:0',
    work_dir='work_dirs/deployment'
)

# Enable evaluation
evaluation = dict(
    enabled=True,
    num_samples=1000,    # Number of samples to evaluate
    verbose=False,       # Set True for detailed per-sample output
    models_to_evaluate=[
        'pytorch',       # Evaluate PyTorch model
        'onnx',          # Evaluate ONNX model
        'tensorrt'       # Evaluate TensorRT model
    ]
)

# ... rest of config ...
```

**Run with evaluation**:

```bash
python projects/CalibrationStatusClassification/deploy/main.py \
    deploy_config_eval.py \
    path/to/model_config.py \
    path/to/checkpoint.pth
```

**Output includes**:
- Per-model accuracy and performance metrics
- Confusion matrices for each backend
- Latency statistics (min, max, mean, median, p95, p99; see the sketch below)
- Per-class accuracy breakdown
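
The latency statistics are plain order statistics over per-sample inference times. A short sketch of how they can be computed with NumPy (the timing collection itself is assumed):

```python
# Sketch: summarizing per-sample latencies (milliseconds) into the
# statistics listed above. `latencies_ms` is assumed to have been
# collected by timing each inference call.
import numpy as np

latencies_ms = np.array([5.1, 4.8, 5.3, 6.0, 4.9])  # example data

stats = {
    "min": float(latencies_ms.min()),
    "max": float(latencies_ms.max()),
    "mean": float(latencies_ms.mean()),
    "median": float(np.median(latencies_ms)),
    "p95": float(np.percentile(latencies_ms, 95)),
    "p99": float(np.percentile(latencies_ms, 99)),
}
print(stats)
```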

### Evaluation Only Mode

Run evaluation without exporting (useful for testing existing deployments):

**Config** (`deploy_config_eval_only.py`):

```python
export = dict(
    mode='none',       # Skip export
    device='cuda:0',
    work_dir='work_dirs/deployment'
)

evaluation = dict(
    enabled=True,
    num_samples=1000,
    models_to_evaluate=['onnx', 'tensorrt']  # Evaluate existing models
)

runtime_io = dict(
    info_pkl='path/to/info.pkl',
    onnx_file='work_dirs/deployment/model.onnx'  # Path to existing ONNX
)

# ... rest of config ...
```

**Run**:

```bash
# No checkpoint needed in eval-only mode
python projects/CalibrationStatusClassification/deploy/main.py \
    deploy_config_eval_only.py \
    path/to/model_config.py
```

## Configuration Reference

### Export Configuration

```python
export = dict(
    mode='both',         # 'onnx', 'trt', 'both', 'none'
    verify=True,         # Enable cross-backend verification
    device='cuda:0',     # Device for export/inference
    work_dir='work_dirs'derr # Output directory
)
```

### Runtime I/O Configuration

```python
runtime_io = dict(
    info_pkl='path/to/dataset/info.pkl',     # Required: dataset info file
    sample_idx=0,                            # Sample index for export
    onnx_file='path/to/existing/model.onnx'  # Optional: use existing ONNX
)
```

### ONNX Configuration

```python
onnx_config = dict(
    opset_version=16,                  # ONNX opset version
    do_constant_folding=True,          # Enable constant folding optimization
    input_names=['input'],             # Input tensor names
    output_names=['output'],           # Output tensor names
    save_file='model.onnx',            # Output filename
    export_params=True,                # Export model parameters
    dynamic_axes={                     # Dynamic dimensions
        'input': {0: 'batch_size'},
        'output': {0: 'batch_size'}
    },
    keep_initializers_as_inputs=False  # ONNX optimization
)
```
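
Because dimension 0 is exported as a dynamic `batch_size` axis, the same ONNX file accepts variable batch sizes at inference time. A usage sketch with ONNX Runtime; the `(N, 5, 512, 512)` input shape follows the `model_inputs` example below and is an assumption:

```python
# Sketch: the dynamic 'batch_size' axis lets one exported model serve
# batches of different sizes. The input shape is illustrative.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
for batch in (1, 4, 8):
    x = np.random.rand(batch, 5, 512, 512).astype(np.float32)
    (y,) = sess.run(None, {"input": x})
    print(batch, y.shape)
```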

### Backend Configuration

```python
backend_config = dict(
    common_config=dict(
        precision_policy='fp16',    # Precision policy (see below)
        max_workspace_size=1 << 30  # TensorRT workspace size (bytes)
    ),
    model_inputs=[                  # Optional: input specifications
        dict(
            name='input',
            shape=(1, 5, 512, 512),
            dtype='float32'
        )
    ]
)
```

### Precision Policies

| Policy | Description | Use Case |
|--------|-------------|----------|
| `auto` | Let TensorRT decide | Default, balanced performance |
| `fp16` | Half precision (FP16) | ~2x faster, near-identical accuracy |
| `fp32_tf32` | TensorFloat-32 (TF32) | Good balance on Ampere+ GPUs |
| `int8` | 8-bit integer quantization (INT8) | Fastest; requires calibration |
| `strongly_typed` | Strict type enforcement | For debugging |
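
Under the hood, a precision policy maps onto TensorRT builder flags when the engine is built. The sketch below uses the standard `tensorrt` Python API to show one plausible mapping; it is illustrative, not the exporter's verbatim implementation (`strongly_typed`, for instance, is applied at network creation rather than as a builder flag).

```python
# Sketch: one plausible mapping from precision_policy to TensorRT builder
# flags, using the standard tensorrt Python API. Illustrative only.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
config = builder.create_builder_config()

policy = "fp16"  # from backend_config.common_config.precision_policy
if policy == "fp16":
    config.set_flag(trt.BuilderFlag.FP16)
elif policy == "fp32_tf32":
    config.set_flag(trt.BuilderFlag.TF32)
elif policy == "int8":
    config.set_flag(trt.BuilderFlag.INT8)  # also needs a calibrator or Q/DQ model
# 'auto' sets no flag and lets TensorRT choose per-layer precisions;
# 'strongly_typed' would instead use NetworkDefinitionCreationFlag.STRONGLY_TYPED
```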

### Evaluation Configuration

```python
evaluation = dict(
    enabled=True,        # Enable evaluation
    num_samples=1000,    # Number of samples to evaluate
    verbose=False,       # Detailed per-sample output
    models_to_evaluate=[ # Backends to evaluate
        'pytorch',
        'onnx',
        'tensorrt'
    ]
)
```
20 changes: 20 additions & 0 deletions autoware_ml/deployment/__init__.py
@@ -0,0 +1,20 @@
"""
Autoware ML Unified Deployment Framework

This package provides a unified, task-agnostic deployment framework for
exporting, verifying, and evaluating machine learning models across different
tasks (classification, detection, segmentation, etc.) and backends (ONNX,
TensorRT, TorchScript, etc.).
"""

from .core.base_config import BaseDeploymentConfig
from .core.base_data_loader import BaseDataLoader
from .core.base_evaluator import BaseEvaluator

__all__ = [
    "BaseDeploymentConfig",
    "BaseDataLoader",
    "BaseEvaluator",
]

__version__ = "1.0.0"
13 changes: 13 additions & 0 deletions autoware_ml/deployment/backends/__init__.py
@@ -0,0 +1,13 @@
"""Inference backends for different model formats."""

from .base_backend import BaseBackend
from .onnx_backend import ONNXBackend
from .pytorch_backend import PyTorchBackend
from .tensorrt_backend import TensorRTBackend

__all__ = [
    "BaseBackend",
    "PyTorchBackend",
    "ONNXBackend",
    "TensorRTBackend",
]