Video Augmentation Benchmarks

This directory contains benchmark results for video augmentation libraries.

Overview

The video benchmarks measure the throughput of augmentation libraries on video transformations. They compare CPU-based processing (Albumentations) with GPU-accelerated processing (Kornia and torchvision).

Dataset

The benchmarks use the UCF101 dataset, which contains 13,320 videos from 101 action categories. The videos are realistic, collected from YouTube, and include a wide variety of camera motion, object appearance, pose, scale, viewpoint, and background. This makes it an excellent dataset for benchmarking video augmentation performance across diverse real-world scenarios.

You can download the dataset from: https://www.crcv.ucf.edu/data/UCF101/UCF101.rar
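
The runs recorded in the metadata sections use `num_videos: 200`, so only a subset of the dataset is needed. The following is a minimal sketch of picking a reproducible subset from an extracted copy of the dataset. The function name `select_videos`, the seeded random sampling, and the `.avi` extension filter are assumptions for illustration; the benchmark scripts may select videos differently.

```python
import random
from pathlib import Path


def select_videos(root, num_videos=200, seed=0, extensions=(".avi",)):
    """Return a sorted, reproducible subset of video files found under `root`.

    Hypothetical helper: searches `root` recursively, keeps files whose
    suffix is in `extensions`, and samples `num_videos` of them with a
    fixed seed so repeated calls return the same files.
    """
    paths = sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )
    if len(paths) <= num_videos:
        # Fewer videos than requested: use everything that was found.
        return paths
    return sorted(random.Random(seed).sample(paths, num_videos))
```

For example, `select_videos("/path/to/videos")` returns at most 200 paths that can be copied or symlinked into the directory passed to the benchmark scripts.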

Methodology

Video Loading: Videos are loaded using library-specific loaders:
- OpenCV for Albumentations
- PyTorch tensors for Kornia

Warmup Phase:
- Performs adaptive warmup until performance variance stabilizes
- Uses configurable parameters for stability detection
- Implements early stopping for slow transforms

Measurement Phase:
- Multiple runs of each transform
- Measures throughput (videos/second)
- Calculates statistical metrics (median, standard deviation)

Environment Control:
- CPU benchmarks are run single-threaded
- GPU benchmarks utilize the specified GPU device
- Thread settings are controlled for consistent results
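
The warmup and measurement phases above can be sketched as follows. This is a minimal illustration, not the benchmark's actual code: the function names (`is_stable`, `warmup`, `measure`) are hypothetical, and the stability rule (the most recent window means all stay within a relative threshold of their average) is an assumption. The default values mirror the `benchmark_params` recorded in the metadata sections (`warmup_window: 5`, `warmup_threshold: 0.05`, `min_warmup_windows: 3`, `max_warmup_iterations: 100`). Early stopping for slow transforms is not covered.

```python
import statistics
import time
from typing import Callable, Sequence


def windowed_means(samples: Sequence[float], window: int) -> list:
    """Mean of each complete, non-overlapping window of `window` samples."""
    full = len(samples) - len(samples) % window
    return [statistics.fmean(samples[i:i + window]) for i in range(0, full, window)]


def is_stable(samples, window=5, threshold=0.05, min_windows=3) -> bool:
    """Assumed rule: the last `min_windows` window means each lie within
    `threshold` (relative) of their common average."""
    means = windowed_means(samples, window)
    if len(means) < min_windows:
        return False
    recent = means[-min_windows:]
    center = statistics.fmean(recent)
    return all(abs(m - center) <= threshold * center for m in recent)


def warmup(run_once: Callable[[], float], max_iterations=100,
           window=5, threshold=0.05, min_windows=3) -> int:
    """Call `run_once` (which returns one timing in seconds) until the
    timings look stable or the iteration cap is reached.
    Returns the number of warmup iterations performed."""
    timings = []
    for iteration in range(1, max_iterations + 1):
        timings.append(run_once())
        if is_stable(timings, window, threshold, min_windows):
            return iteration
    return max_iterations


def measure(transform_videos: Callable[[], int], num_runs=5,
            timer=time.perf_counter):
    """Run `transform_videos` (which returns the number of videos it
    processed) `num_runs` times and return (median, standard deviation)
    of the throughput in videos per second. Needs num_runs >= 2."""
    throughputs = []
    for _ in range(num_runs):
        start = timer()
        processed = transform_videos()
        elapsed = timer() - start
        throughputs.append(processed / elapsed)
    return statistics.median(throughputs), statistics.stdev(throughputs)
```

When timing GPU libraries, the device usually has to be synchronized (for example with `torch.cuda.synchronize()`) before the clock is read; the sketch leaves that to the `transform_videos` callable.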

Hardware Comparison

The benchmarks compare:

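- Albumentations: CPU-based processing (single thread; the results below were recorded on a macOS arm64 machine)
- Kornia: GPU-accelerated processing (NVIDIA GPUs; an NVIDIA GeForce RTX 4090 in the results below)
- torchvision: GPU-accelerated processing (an NVIDIA GeForce RTX 4090 in the results below)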

This provides insights into the trade-offs between CPU and GPU processing for video augmentation.

Running the Benchmarks

To run the video benchmarks:

./run_video_single.sh -l albumentations -d /path/to/videos -o /path/to/output

To run all libraries and generate a comparison:

./run_video_all.sh -d /path/to/videos -o /path/to/output

Benchmark Results

Video Benchmark Results

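Each cell reports throughput in videos per second, with the measured variation after the ± sign. Larger is better. N/A means no result is available for that library. The Speedup column shows how many times faster Albumentations is than the fastest other library for each transform; values below 1x mean another library is faster.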

Transform	albumentations (videos per second) arm (1 core)	kornia (videos per second) NVIDIA GeForce RTX 4090	torchvision (videos per second) NVIDIA GeForce RTX 4090	Speedup (Alb/fastest other)
Affine	4.45 ± 0.06	21.39 ± 0.05	452.58 ± 0.14	0.01x
AutoContrast	20.85 ± 0.10	21.41 ± 0.02	577.72 ± 16.86	0.04x
Blur	49.61 ± 1.95	20.61 ± 0.06	N/A	2.41x
Brightness	56.84 ± 1.94	21.85 ± 0.02	755.52 ± 435.17	0.08x
CLAHE	8.89 ± 0.09	N/A	N/A	N/A
CenterCrop128	733.66 ± 4.03	70.12 ± 1.29	1133.39 ± 234.60	0.65x
ChannelDropout	58.28 ± 2.96	21.81 ± 0.03	N/A	2.67x
ChannelShuffle	46.92 ± 2.29	19.99 ± 0.03	958.35 ± 0.20	0.05x
CoarseDropout	65.62 ± 1.82	N/A	N/A	N/A
ColorJitter	10.67 ± 0.23	18.79 ± 0.03	68.75 ± 0.13	0.16x
Contrast	58.81 ± 1.10	21.69 ± 0.04	546.55 ± 13.23	0.11x
CornerIllumination	4.80 ± 0.47	2.60 ± 0.07	N/A	1.84x
Elastic	4.31 ± 0.07	N/A	126.83 ± 1.28	0.03x
Equalize	13.09 ± 0.22	4.21 ± 0.00	191.55 ± 1.25	0.07x
Erasing	69.44 ± 3.31	N/A	254.59 ± 6.57	0.27x
GaussianBlur	25.63 ± 0.42	21.61 ± 0.05	543.44 ± 11.50	0.05x
GaussianIllumination	7.10 ± 0.15	20.33 ± 0.08	N/A	0.35x
GaussianNoise	8.40 ± 0.19	22.38 ± 0.08	N/A	0.38x
Grayscale	152.01 ± 11.18	22.24 ± 0.04	838.40 ± 466.76	0.18x
HSV	6.48 ± 0.35	N/A	N/A	N/A
HorizontalFlip	8.69 ± 0.21	21.86 ± 0.07	977.87 ± 49.03	0.01x
Hue	14.47 ± 0.33	19.53 ± 0.02	N/A	0.74x
Invert	67.77 ± 2.60	21.91 ± 0.23	843.27 ± 176.00	0.08x
JpegCompression	19.62 ± 0.20	N/A	N/A	N/A
LinearIllumination	4.81 ± 0.25	4.29 ± 0.19	N/A	1.12x
MedianBlur	13.87 ± 0.33	8.39 ± 0.09	N/A	1.65x
MotionBlur	33.49 ± 0.66	N/A	N/A	N/A
Normalize	21.70 ± 0.18	21.82 ± 0.02	460.80 ± 0.18	0.05x
OpticalDistortion	4.29 ± 0.10	N/A	N/A	N/A
Pad	68.10 ± 0.91	N/A	759.68 ± 337.78	0.09x
Perspective	4.37 ± 0.08	N/A	434.75 ± 0.14	0.01x
PlankianJitter	21.29 ± 0.67	10.85 ± 0.01	N/A	1.96x
PlasmaBrightness	3.37 ± 0.03	16.94 ± 0.36	N/A	0.20x
PlasmaContrast	2.64 ± 0.01	16.97 ± 0.03	N/A	0.16x
PlasmaShadow	6.08 ± 0.05	19.03 ± 0.50	N/A	0.32x
Posterize	56.50 ± 2.44	N/A	631.46 ± 14.74	0.09x
RGBShift	31.73 ± 0.71	22.27 ± 0.04	N/A	1.42x
Rain	23.09 ± 1.52	3.77 ± 0.00	N/A	6.12x
RandomCrop128	695.33 ± 29.37	65.33 ± 0.35	1132.79 ± 15.23	0.61x
RandomGamma	183.49 ± 6.45	21.63 ± 0.02	N/A	8.48x
RandomResizedCrop	15.48 ± 1.12	6.29 ± 0.03	182.09 ± 15.75	0.09x
Resize	15.67 ± 0.49	5.87 ± 0.03	139.96 ± 35.04	0.11x
Rotate	28.62 ± 0.76	21.53 ± 0.05	534.18 ± 0.16	0.05x
SaltAndPepper	9.88 ± 0.19	8.82 ± 0.12	N/A	1.12x
Saturation	8.42 ± 0.14	36.56 ± 0.12	N/A	0.23x
Sharpen	25.02 ± 0.30	17.86 ± 0.03	420.09 ± 8.99	0.06x
Shear	4.41 ± 0.08	N/A	N/A	N/A
Snow	12.72 ± 0.21	N/A	N/A	N/A
Solarize	52.02 ± 1.45	20.73 ± 0.02	628.42 ± 5.91	0.08x
ThinPlateSpline	4.30 ± 0.14	44.90 ± 0.67	N/A	0.10x
VerticalFlip	9.57 ± 0.27	21.96 ± 0.24	977.92 ± 5.22	0.01x
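
The Speedup column can be reproduced from the throughput columns. The sketch below assumes the rule stated above the table (Albumentations throughput divided by the highest throughput among the other libraries, with N/A when no other library has a result). The function names `speedup` and `format_speedup` are hypothetical, and the three rows are copied from the table.

```python
from typing import Dict, Optional


def speedup(albumentations: float,
            others: Dict[str, Optional[float]]) -> Optional[float]:
    """Albumentations throughput divided by the fastest other library.

    `others` maps library name to throughput, with None standing for N/A.
    Returns None when no other library has a result.
    """
    available = [value for value in others.values() if value is not None]
    if not available:
        return None
    return albumentations / max(available)


def format_speedup(value: Optional[float]) -> str:
    """Format a speedup the way the table does, for example '2.41x'."""
    return "N/A" if value is None else f"{value:.2f}x"


# Median throughputs copied from the table (videos per second).
rows = {
    "Blur": (49.61, {"kornia": 20.61, "torchvision": None}),
    "Affine": (4.45, {"kornia": 21.39, "torchvision": 452.58}),
    "CLAHE": (8.89, {"kornia": None, "torchvision": None}),
}

for name, (alb, others) in rows.items():
    print(f"{name}: {format_speedup(speedup(alb, others))}")
# Blur: 2.41x
# Affine: 0.01x
# CLAHE: N/A
```

Because the table shows rounded values, a speedup recomputed this way can differ from the published column in the last digit.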

Torchvision Metadata

system_info:
  python_version: 3.12.9 | packaged by Anaconda, Inc. | (main, Feb  6 2025, 18:56:27) [GCC 11.2.0]
  platform: Linux-5.15.0-131-generic-x86_64-with-glibc2.31
  processor: x86_64
  cpu_count: 64
  timestamp: 2025-03-11T11:14:57.765540+00:00
library_versions:
  torchvision: 0.21.0
  numpy: 2.2.3
  pillow: 11.1.0
  opencv-python-headless: not installed
  torch: 2.6.0
  opencv-python: not installed
thread_settings:
  environment: {'OMP_NUM_THREADS': '1', 'OPENBLAS_NUM_THREADS': '1', 'MKL_NUM_THREADS': '1', 'VECLIB_MAXIMUM_THREADS': '1', 'NUMEXPR_NUM_THREADS': '1'}
  opencv: not installed
  pytorch: {'threads': 32, 'gpu_available': True, 'gpu_device': 0, 'gpu_name': 'NVIDIA GeForce RTX 4090', 'gpu_memory_total': 23.55084228515625, 'gpu_memory_allocated': 15.05643081665039}
  pillow: {'threads': 'unknown', 'simd': False}
benchmark_params:
  num_videos: 200
  num_runs: 10
  max_warmup_iterations: 100
  warmup_window: 5
  warmup_threshold: 0.05
  min_warmup_windows: 3
precision: torch.float16

Albumentations Metadata

system_info:
  python_version: 3.12.8 | packaged by Anaconda, Inc. | (main, Dec 11 2024, 10:37:40) [Clang 14.0.6 ]
  platform: macOS-15.1-arm64-arm-64bit
  processor: arm
  cpu_count: 16
  timestamp: 2025-03-11T01:57:36.320659+00:00
library_versions:
  albumentations: 2.0.5
  numpy: 2.2.3
  pillow: 11.1.0
  opencv-python-headless: 4.11.0.86
  torch: 2.6.0
  opencv-python: not installed
thread_settings:
  environment: {'OMP_NUM_THREADS': '1', 'OPENBLAS_NUM_THREADS': '1', 'MKL_NUM_THREADS': '1', 'VECLIB_MAXIMUM_THREADS': '1', 'NUMEXPR_NUM_THREADS': '1'}
  opencv: {'threads': 1, 'opencl': False}
  pytorch: {'threads': 1, 'gpu_available': False, 'gpu_device': None}
  pillow: {'threads': 'unknown', 'simd': False}
benchmark_params:
  num_videos: 200
  num_runs: 5
  max_warmup_iterations: 100
  warmup_window: 5
  warmup_threshold: 0.05
  min_warmup_windows: 3
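
The `thread_settings` recorded above can be approximated in your own scripts as follows. This is a sketch under assumptions rather than the benchmark's actual setup code: the helper names `limit_threads` and `limit_library_threads` are hypothetical. The environment variables are the ones listed in the metadata, and `cv2.setNumThreads`, `cv2.ocl.setUseOpenCL`, and `torch.set_num_threads` are the standard OpenCV and PyTorch calls for these settings.

```python
import os

# Environment variables listed under thread_settings in the metadata.
THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "VECLIB_MAXIMUM_THREADS",
    "NUMEXPR_NUM_THREADS",
)


def limit_threads(num_threads: int = 1) -> dict:
    """Set the threading environment variables and return their values.

    These variables are generally read when the numeric libraries are
    loaded, so call this before importing numpy, cv2, or torch.
    """
    for name in THREAD_ENV_VARS:
        os.environ[name] = str(num_threads)
    return {name: os.environ[name] for name in THREAD_ENV_VARS}


def limit_library_threads(num_threads: int = 1) -> list:
    """Apply in-process thread limits for the libraries that are installed.

    Returns the names of the libraries that were configured.
    """
    configured = []
    try:
        import cv2
        cv2.setNumThreads(num_threads)
        cv2.ocl.setUseOpenCL(False)  # matches 'opencl': False above
        configured.append("opencv")
    except ImportError:
        pass
    try:
        import torch
        torch.set_num_threads(num_threads)
        configured.append("pytorch")
    except ImportError:
        pass
    return configured


settings = limit_threads(1)
```

Calling `limit_library_threads(1)` afterwards applies the same limit inside OpenCV and PyTorch when they are installed.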

Kornia Metadata

system_info:
  python_version: 3.12.9 | packaged by Anaconda, Inc. | (main, Feb  6 2025, 18:56:27) [GCC 11.2.0]
  platform: Linux-5.15.0-131-generic-x86_64-with-glibc2.31
  processor: x86_64
  cpu_count: 64
  timestamp: 2025-03-11T00:46:14.791885+00:00
library_versions:
  kornia: 0.8.0
  numpy: 2.2.3
  pillow: 11.1.0
  opencv-python-headless: not installed
  torch: 2.6.0
  opencv-python: not installed
thread_settings:
  environment: {'OMP_NUM_THREADS': '1', 'OPENBLAS_NUM_THREADS': '1', 'MKL_NUM_THREADS': '1', 'VECLIB_MAXIMUM_THREADS': '1', 'NUMEXPR_NUM_THREADS': '1'}
  opencv: not installed
  pytorch: {'threads': 32, 'gpu_available': True, 'gpu_device': 0, 'gpu_name': 'NVIDIA GeForce RTX 4090', 'gpu_memory_total': 23.55084228515625, 'gpu_memory_allocated': 15.05643081665039}
  pillow: {'threads': 'unknown', 'simd': False}
benchmark_params:
  num_videos: 200
  num_runs: 5
  max_warmup_iterations: 100
  warmup_window: 5
  warmup_threshold: 0.05
  min_warmup_windows: 3
precision: torch.float16

Analysis

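The benchmark results show trade-offs between CPU and GPU processing. In the table above, torchvision on the GPU is the fastest option for every transform it supports, while single-core Albumentations beats Kornia on the GPU for several transforms (for example RandomGamma and Rain) and is the only library with results for others (for example CLAHE and MotionBlur). The general trade-offs are: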

CPU Advantages:
- Better for simple transformations with low computational complexity
- No data transfer overhead between CPU and GPU
- More consistent performance across different transform types

GPU Advantages:
- Significantly faster for complex transformations
- Better scaling with video resolution
- More efficient for batch processing

Recommendations

Based on the benchmark results, we recommend:

- For simple transformations on a small number of videos, CPU processing may be sufficient.
- For complex transformations or batch processing, GPU acceleration provides significant benefits.
- Consider the specific transformations you need and their relative performance on CPU vs GPU.
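
One rough way to weigh the specific transformations you need is to combine their per-transform throughputs into a pipeline estimate. The sketch below assumes that transforms run one after another and that their per-video times simply add, which ignores video loading, data transfer, and any fixed overhead shared between transforms. The function name `estimate_pipeline_throughput` is hypothetical, and the numbers are copied from the results table.

```python
from typing import Mapping, Optional, Sequence


def estimate_pipeline_throughput(rates: Mapping[str, Optional[float]],
                                 pipeline: Sequence[str]) -> Optional[float]:
    """Estimate videos per second for transforms applied in sequence.

    Assumes per-video times add up, so the combined rate is
    1 / sum(1 / rate). Returns None if any transform has no result
    (N/A in the table) or the pipeline is empty.
    """
    total_seconds_per_video = 0.0
    for name in pipeline:
        rate = rates.get(name)
        if rate is None:
            return None
        total_seconds_per_video += 1.0 / rate
    if total_seconds_per_video == 0.0:
        return None
    return 1.0 / total_seconds_per_video


# Median throughputs copied from the table (videos per second).
albumentations = {"HorizontalFlip": 8.69, "RandomGamma": 183.49, "CLAHE": 8.89}
kornia = {"HorizontalFlip": 21.86, "RandomGamma": 21.63, "CLAHE": None}

pipeline = ["HorizontalFlip", "RandomGamma"]
alb_estimate = estimate_pipeline_throughput(albumentations, pipeline)
kornia_estimate = estimate_pipeline_throughput(kornia, pipeline)
```

With these numbers the estimate is roughly 8.3 videos per second for Albumentations and roughly 10.9 for Kornia, and a pipeline that includes CLAHE has no Kornia estimate because the table has no Kornia result for that transform. Treat the output as a starting point for your own measurements, not as a prediction.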
