A comprehensive benchmarking suite for comparing the performance of popular image and video augmentation libraries including Albumentations, imgaug, torchvision, Kornia, and Augly.
This benchmark suite measures the throughput and performance characteristics of common augmentation operations across different libraries. It features:
- Benchmarks for both image and video augmentation
- Adaptive warmup to ensure stable measurements (see the sketch after this list)
- Multiple runs for statistical significance
- Detailed performance metrics and system information
- Thread control settings for consistent performance
- Support for multiple image/video formats and loading methods
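As a rough sketch of what adaptive warmup means here (the helper name `adaptive_warmup`, the window size, and the stopping threshold below are illustrative assumptions, not the suite's actual code): the transform is timed repeatedly, and warmup stops once the relative spread of the most recent timings falls below a threshold.

```python
import time
import numpy as np

def adaptive_warmup(transform, images, window=5, rel_std_threshold=0.05, max_iters=100):
    """Hypothetical warmup loop: run until per-iteration time stabilizes."""
    timings = []
    for _ in range(max_iters):
        start = time.perf_counter()
        for img in images:
            transform(img)
        timings.append(time.perf_counter() - start)
        if len(timings) >= window:
            recent = np.array(timings[-window:])
            # Stop once the relative standard deviation of recent timings is small.
            if recent.std() / recent.mean() < rel_std_threshold:
                break
    return timings
```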
The image benchmarks compare the performance of various libraries on standard image transformations. All benchmarks are run on a single CPU thread to ensure consistent and comparable results.
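Single-threaded execution usually has to be enforced explicitly, because OpenCV, PyTorch, and BLAS back ends spawn worker threads by default. A hedged example of the kind of settings involved (the exact values used by the suite may differ):

```python
import os

# Restrict BLAS / OpenMP thread pools before importing numerical libraries.
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["OPENBLAS_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"

import cv2
import torch

cv2.setNumThreads(0)      # disable OpenCV's internal threading
torch.set_num_threads(1)  # single-threaded PyTorch CPU ops
```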
Detailed Image Benchmark Results
The video benchmarks compare CPU-based processing (Albumentations) with GPU-accelerated processing (Kornia) for video transformations. The benchmarks use the UCF101 dataset, which contains realistic videos from 101 action categories.
Detailed Video Benchmark Results
Albumentations is generally the fastest library for image augmentation, with a median speedup of 4.1× compared to other libraries. For some transforms, the speedup can be as high as 119.7× (MedianBlur).
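As an illustration of how such a median speedup can be derived (the aggregation behind the published number may differ): the speedup of a transform against a given library is the ratio of Albumentations' throughput to that library's throughput, and the median is taken across those ratios. The throughput values below are made up for the example.

```python
import statistics

# Hypothetical throughput table in images/second; real numbers come from the benchmark output.
throughput = {
    "MedianBlur": {"albumentations": 1200.0, "torchvision": 10.0},
    "HorizontalFlip": {"albumentations": 9000.0, "torchvision": 2500.0},
}

speedups = [
    row["albumentations"] / value
    for row in throughput.values()
    for lib, value in row.items()
    if lib != "albumentations"
]
print(f"median speedup: {statistics.median(speedups):.1f}x")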
For video processing, the performance comparison between CPU (Albumentations) and GPU (Kornia) shows interesting trade-offs. While GPU acceleration provides significant benefits for complex transformations, CPU processing can be more efficient for simple operations.
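For intuition, a minimal comparison could apply the same flip to a short clip frame by frame on the CPU with Albumentations and as a batched tensor on the GPU with Kornia. The clip shape and per-frame loop below are illustrative, not the benchmark's actual code:

```python
import numpy as np
import torch
import albumentations as A
import kornia.augmentation as K

frames = np.random.randint(0, 256, (32, 256, 256, 3), dtype=np.uint8)  # T x H x W x C clip

# CPU path: Albumentations processes each frame as a NumPy array.
cpu_flip = A.HorizontalFlip(p=1.0)
cpu_out = np.stack([cpu_flip(image=f)["image"] for f in frames])

# GPU path: Kornia treats the clip as a batch of frames on the GPU (if one is available).
device = "cuda" if torch.cuda.is_available() else "cpu"
clip = torch.from_numpy(frames).permute(0, 3, 1, 2).float().div(255).to(device)  # T x C x H x W
gpu_flip = K.RandomHorizontalFlip(p=1.0)
gpu_out = gpu_flip(clip)
```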
The benchmark automatically creates isolated virtual environments for each library and installs the necessary dependencies. Base requirements:
- Python 3.10+
- uv (for fast package installation)
- Disk space for virtual environments
- Image/video dataset in a supported format
Each library's specific dependencies are managed through separate requirements files in the requirements/ directory.
For testing and comparison purposes, you can use standard datasets:
For image benchmarks:
```bash
wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar
tar -xf ILSVRC2012_img_val.tar -C /path/to/your/target/directory
```
For video benchmarks:
```bash
# UCF101 dataset
wget https://www.crcv.ucf.edu/data/UCF101/UCF101.rar
unrar x UCF101.rar /path/to/your/target/directory/
```
We strongly recommend running the benchmarks on your own dataset that matches your use case:
- Use images/videos that are representative of your actual workload
- Consider sizes and formats you typically work with
- Include edge cases specific to your application
This will give you more relevant performance metrics for your specific use case.
To run the image benchmark for a single library:
```bash
./run_single.sh -l albumentations -d /path/to/images -o /path/to/output
```
To run image benchmarks for all supported libraries and generate a comparison:
```bash
./run_all.sh -d /path/to/images -o /path/to/output --update-docs
```
To run the video benchmark for a single library:
```bash
./run_video_single.sh -l albumentations -d /path/to/videos -o /path/to/output
```
To run video benchmarks for all supported libraries and generate a comparison:
```bash
./run_video_all.sh -d /path/to/videos -o /path/to/output --update-docs
```
The benchmark methodology is designed to ensure fair and reproducible comparisons:
- Data Loading: Data is loaded using library-specific loaders to ensure optimal format compatibility
- Warmup Phase: Adaptive warmup until performance variance stabilizes
- Measurement Phase: Multiple runs with statistical analysis (see the sketch after this list)
- Environment Control: Consistent thread settings and hardware utilization
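A hedged sketch of what the measurement phase amounts to (the `benchmark_once` helper and the default of five runs are assumptions for illustration, not the suite's exact implementation):

```python
import statistics
import time

def benchmark_once(transform, images):
    """Time one pass over the dataset and return throughput in images/second."""
    start = time.perf_counter()
    for img in images:
        transform(img)
    return len(images) / (time.perf_counter() - start)

def run_benchmark(transform, images, n_runs=5):
    """Repeat the measurement and summarize it with median and spread."""
    throughputs = [benchmark_once(transform, images) for _ in range(n_runs)]
    return {
        "median_ips": statistics.median(throughputs),
        "std_ips": statistics.stdev(throughputs),
        "runs": throughputs,
    }
```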
For detailed methodology, see the individual benchmark READMEs.
Contributions are welcome! If you'd like to add support for a new library, improve the benchmarking methodology, or fix issues, please submit a pull request.
When contributing, please:
- Follow the existing code style
- Add tests for new functionality
- Update documentation as needed
- Ensure all tests pass