A wrapper around Ultralytics' YOLO library, used for object detection. Results are output in a custom JSON format.
Installation requires poetry.
To install poetry, use pipx:
sudo apt install pipx
pipx install poetry

To install yolo-sod, from this directory run:
poetry install

This will install all the requirements in their own virtual environment. To
enter the environment, from this directory run poetry shell.
To export a model, run the

yolo-sod-export

command from within the poetry environment. Without any arguments this will
build an INT8 precision TensorRT engine for the large YOLOv11 model. This will
download the necessary files and use the VOC2007 dataset from /import/datasets
for calibration.
Model files will be saved in the current working directory.
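If you want to sanity-check an exported engine outside of yolo-sod, it can generally be loaded back through the underlying Ultralytics API. The sketch below is a minimal example under the assumption that the engine was produced via the standard Ultralytics export path; the engine file name and test image path are placeholders, not names guaranteed by yolo-sod-export.

import json
from ultralytics import YOLO

# Placeholder file name -- use whatever yolo-sod-export actually wrote
# to the current working directory.
model = YOLO("yolo11l.engine")

# Run a single prediction to confirm the engine loads and produces boxes.
results = model("test_frame.jpg")
for box in results[0].boxes:
    print(box.xyxy, box.conf, box.cls)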
Various configuration options can also be set. These are listed with:
yolo-sod-export --help

Note that by default a 2 GiB workspace is used for TensorRT. This program takes
a large amount of GPU memory during the calibration and building process, so
setting a larger workspace may lead to memory allocation errors. For the large
YOLOv11 model a workspace of 2 GiB requires just under 8 GiB of GPU memory to
do the calibration.
To perform inference on a video, run:

yolo-sod --input <path_to_video_file> --output <path_to_json_file> --model-od <path_to_OD_model>

from within the poetry shell. This will decode the video using Nvidia's NVDEC
GPU decoder and perform object detection using the model. The detections will be
written to the JSON output file in the form:
{
"label": {
"timestamp1": [
[[x0, y0, x1, y1], "category", confidence],
[[x0, y0, x1, y1], "category", confidence],
...
],
"timestamp2": [
[[x0, y0, x1, y1], "category", confidence],
[[x0, y0, x1, y1], "category", confidence],
...
],
...
}
}
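As an illustration of how this structure can be consumed, the sketch below reads a detections file back in and prints every box per timestamp. It is a minimal example following the format above; the file name is a placeholder for whatever path was passed to --output.

import json

# Placeholder path for a file produced by `yolo-sod --output ...`.
with open("detections.json") as f:
    data = json.load(f)

# The file maps a label to a dict of timestamps, each holding a list of
# [[x0, y0, x1, y1], category, confidence] entries.
for label, frames in data.items():
    for timestamp, detections in frames.items():
        for (x0, y0, x1, y1), category, confidence in detections:
            print(f"{label} @ {timestamp}: {category} "
                  f"({confidence:.2f}) at [{x0}, {y0}, {x1}, {y1}]")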
Supported video codecs: av1, h264, hevc, mjpeg, mpeg1video, mpeg2video, mpeg4video, vc1, vp8, vp9
Additional inference options can be found by running
yolo-sod --help

To build packages (source and wheel) run:
poetry build

from within this directory. The packages will be produced in the ./dist
directory.