
AI_Compressor

Neural Network Compressor with Pruning, Knowledge Distillation, and Quantization.

Overview

AI_Compressor is a tool for compressing neural network models using various techniques, including pruning, knowledge distillation (KD), and quantization. It provides two modes of operation to accommodate different use cases, whether training data is available or not.
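For orientation, the sketch below shows what the pruning step might look like. It assumes the torch-pruning library (suggested by the `--pruning_ratio`, `--total_iters`, and `--global_pruning` flags used later in this README); the model choice and all parameter values are illustrative, not the repo's exact code.

```python
import torch
import torch_pruning as tp                      # assumed dependency
from torchvision.models import mobilenet_v2

model = mobilenet_v2(num_classes=100)
example_inputs = torch.randn(1, 3, 32, 32)      # CIFAR-sized input

# L2-magnitude importance, pruned globally over several iterations,
# mirroring flags like --pruning_ratio 0.6 --total_iters 5 --global_pruning.
pruner = tp.pruner.MagnitudePruner(
    model,
    example_inputs,
    importance=tp.importance.MagnitudeImportance(p=2),
    pruning_ratio=0.6,
    iterative_steps=5,
    global_pruning=True,
    ignored_layers=[model.classifier[1]],       # keep the final classifier intact
)
for _ in range(5):
    pruner.step()
    # ...fine-tune (optionally with knowledge distillation) between steps...
```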

Modes of Operation

  • Mode 0 (Post-training Setting): For scenarios where training data is not available and retraining is not feasible. Requires only a calibration dataset to achieve compression.
  • Mode 1 (Training Available): Designed for scenarios where training data is available, allowing further model optimization through knowledge distillation and quantization-aware training (QAT).
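Concretely, the two settings map onto PyTorch's standard quantization flows. A minimal sketch, assuming an eager-mode `torch.ao.quantization` workflow on a quantization-ready model (the loader names are placeholders, not AI_Compressor's API):

```python
import torch
import torch.ao.quantization as tq

def mode0_post_training(model, calib_loader):
    """Mode 0: post-training static quantization with a calibration set."""
    model.eval()
    model.qconfig = tq.get_default_qconfig("fbgemm")
    tq.prepare(model, inplace=True)            # insert observers
    with torch.no_grad():
        for images, _ in calib_loader:         # calibration pass, labels unused
            model(images)
    return tq.convert(model)                   # fold observers into int8 kernels

def mode1_qat(model, train_loader, epochs=10):
    """Mode 1: quantization-aware training (fine-tune with fake-quant ops)."""
    model.train()
    model.qconfig = tq.get_default_qat_qconfig("fbgemm")
    tq.prepare_qat(model, inplace=True)
    opt = torch.optim.SGD(model.parameters(), lr=1e-5)
    for _ in range(epochs):
        for images, labels in train_loader:
            loss = torch.nn.functional.cross_entropy(model(images), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
    model.eval()
    return tq.convert(model)
```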

[Figure: Compression Modes diagram]

How to Run

To run the compressor, follow these steps:

  1. Clone the repository:

    git clone https://github.com/bok3948/AI_Compressor.git
    cd AI_Compressor
  2. Install the necessary dependencies:

    pip install -r requirements.txt
  3. Run the compressor:

    python main.py --device cuda --mode 1 --data-set CIFAR --data_path ./ --model mobilenetv2_x1_0 --pretrained ./cifar100_mobilenetv2_x1_0.pt 

    # Available models: resnet, deit (from timm), and mobilenetv2; other models may not be compatible with this code. Most CNN models work, but transformers require changes to the forward function.
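On the transformer caveat: in some timm versions, distilled DeiT variants return a tuple of (class, distillation) logits during training, which breaks tools that expect a single output tensor. One common workaround is a thin wrapper like the sketch below (illustrative, not the repo's actual modification):

```python
# Hedged sketch: wrapping a distilled DeiT so forward() always returns a
# single logits tensor. The averaging mirrors what DeiT does at eval time.
import torch
import torch.nn as nn

class SingleOutputDeiT(nn.Module):
    def __init__(self, deit: nn.Module):
        super().__init__()
        self.deit = deit

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.deit(x)
        if isinstance(out, tuple):       # (cls_logits, dist_logits) in training
            out = (out[0] + out[1]) / 2  # fuse the two heads
        return out
```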

Results

The compressor has been tested with various models, including MobileNetV2, ResNet18, and DeiT. The results demonstrate a trade-off between latency, size, and accuracy.

[Figure: Model Compression Results]
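For context on how numbers like those in the table below can be measured, here is a hedged sketch of exporting a model to ONNX and timing inference with onnxruntime; the file path, input shape, and run count are illustrative assumptions, not the repo's benchmarking code:

```python
import time
import torch
import onnxruntime as ort

def export_and_time(model, path="compressed.onnx", runs=100):
    model.eval()
    dummy = torch.randn(1, 3, 32, 32)                 # CIFAR-sized input
    torch.onnx.export(model, dummy, path, opset_version=13)

    sess = ort.InferenceSession(path)
    inp = {sess.get_inputs()[0].name: dummy.numpy()}
    sess.run(None, inp)                               # warm-up
    start = time.perf_counter()
    for _ in range(runs):
        sess.run(None, inp)
    print(f"avg latency: {(time.perf_counter() - start) / runs * 1e3:.4f} ms")
```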

Reproduce

  • Download the trained model here: Google Drive Link

  • To reproduce the MobileNetV2 compression run (mode 1; a sketch of the distillation loss enabled by --do_KD follows the results table):

    python main.py --device cuda --mode 1 --data-set CIFAR --data_path ./ --model mobilenetv2_x1_0 --pretrained ./cifar100_mobilenetv2_x1_0.pt --pruning_ratio 0.6 --total_iters 5 --global_pruning --weight_decay 0.05 --lr 1e-5 --qat_lr 1e-5 --qat_epochs 10 --epochs 100 --do_KD

    MobileNetV2 Results

| Model Name                  | Top-1 Acc. (%) | Top-5 Acc. (%) | Latency (ms) | Model Size (MB) |
|-----------------------------|----------------|----------------|--------------|-----------------|
| ONNX MobileNetV2 Compressed | 73.27          | 92.38          | 0.9013       | 1.37            |
| Torch MobileNetV2 Pruned    | 73.83          | 92.69          | 3.5839       | 4.70            |
| Torch MobileNetV2 Original  | 73.51          | 92.38          | 5.1606       | 9.63            |
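Since the command above passes --do_KD, here is a minimal sketch of a standard soft-target distillation loss (Hinton et al.) of the kind such a flag typically enables; the temperature and weighting values are illustrative, not the repo's exact settings:

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-scaled distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```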
