- Docker: Required for the Dazel (Bazel-in-Docker) build system
  - Linux: Docker Engine
  - macOS: Colima with QEMU (see macOS Setup for more information)
- NGC API Key: Required to pull the prebuilt Docker images for the checkers
  - Follow the links to create or log in to an NGC account, then create an API key.
  - Log in to nvcr.io with Docker by following the steps in create an API key.
- NVIDIA GPU with CUDA (Obstacle check only): Required for the Obstacle check's segmentation models. CPU mode is available but significantly slower.
- Git: To clone the repository
- Git LFS: Required if you use the repo’s sample data under checks/sample_data/. Those files are stored as LFS pointers; without git lfs pull, checkers see 133-byte pointer files and fail with opaque errors (e.g. FFmpeg “moov atom not found”).
- Storage Provider Credentials: Checkers require a connection to a storage provider (S3 is recommended) to store video and data inputs. Ensure you have an S3 bucket set up and credentials with read access to it.
- Endpoints for LLM and VLM: The VLM checker and attribute verification checker require connections to a deployed LLM and VLM.
  - See the Setting up your LLM and VLM API key section to get free, limited access to these endpoints for testing through build.nvidia.com.
  - See Deploy Cosmos Reason 2 NIM to deploy Cosmos Reason 2 or any other VLM on your own infrastructure and use those endpoints.
git clone https://github.com/nvidia-cosmos/cosmos-evaluator.git
cd cosmos-evaluator
If you plan to use the sample data in checks/sample_data/, pull the actual files with Git LFS:
git lfs install
git lfs pull
Without this step, files in that directory are LFS pointers and checks will fail (e.g. “moov atom not found” when processing video).
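To catch un-pulled sample data before a checker fails with an opaque error, a small pre-flight scan can help. The sketch below relies on the fact that a Git LFS pointer file is a short text file whose first line starts with `version https://git-lfs.github.com/spec/v1`. The helper names `is_lfs_pointer` and `find_lfs_pointers` are illustrative and are not part of the repository.

```python
from pathlib import Path

# First bytes of every Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file still looks like an un-pulled Git LFS pointer."""
    with Path(path).open("rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(root):
    """Return a sorted list of files under root that are still LFS pointers."""
    return sorted(
        candidate
        for candidate in Path(root).rglob("*")
        if candidate.is_file() and is_lfs_pointer(candidate)
    )
```

Calling `find_lfs_pointers("checks/sample_data")` from the repo root should return an empty list once `git lfs pull` has completed; any path it returns is still a pointer file.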
| Requirement | Minimum Version | Notes |
|---|---|---|
| Python 3 | 3.5+ | Must be available as python3 |
| Docker | 28.1.1+ | Docker Engine on Linux, Colima on macOS |
| Git | 2.49.0+ | For cloning and LFS support |
| Git LFS | 3.0+ | Required for checks/sample_data/ (video and other binaries) |
| NVIDIA GPU Driver | 570.124.06+ | Required for GPU-accelerated checks (Obstacle) |
| NVIDIA Container Toolkit | 1.16.2+ | Required for --gpus flag in Docker |
GPU-related requirements are only needed for checks that use GPU inference (e.g., Obstacle). The VLM, Hallucination, and Attribute Verification checks can run without a GPU.
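The minimum versions in the table can be compared numerically rather than as strings, which matters for values such as 2.9 versus 2.49. The following is a minimal sketch of such a comparison, assuming the installed version is taken from the tool's `--version` output. It is not the logic used by the repo's own setup script, and the function names are illustrative.

```python
import re


def parse_version(text):
    """Extract the first dotted numeric version in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError("no version number found in {!r}".format(text))
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Compare two version strings component by component."""
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    # Pad the shorter tuple with zeros so that 3.0 and 3.0.0 compare as equal.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need
```

For example, `meets_minimum("Docker version 28.1.1, build 4eba377", "28.1.1")` is true, while `meets_minimum("git version 2.39.5", "2.49.0")` is false.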
Cosmos Evaluator uses Dazel (Bazel-in-Docker) for hermetic, reproducible builds. All build commands (dazel build, dazel test, dazel run) execute Bazel inside a Docker container, so you don't need to install Bazel locally.
To set up the build environment, source the setup script from the repo root:
. build/env_setup.sh
This script:
- Validates that all required tools are installed and meet minimum version requirements
- Registers the dazel command in your shell (with tab completion for bash/zsh)
- Sets the COSMOS_EVALUATOR_ROOT environment variable
You need to re-source this script in each new terminal session. Verify the setup by running:
dazel info
For macOS-specific setup (Colima, user.dazelrc), see the Customization Guide.
The Obstacle check requires the CitySemsegFormer ONNX model for semantic segmentation. Download it from the NGC Catalog and place it in the repo:
unzip deployable_onnx_v1.0.zip -d /tmp/citysemsegformer
cp /tmp/citysemsegformer/*.onnx checks/utils/citysemsegformer.onnx
Without this file, the Obstacle check's dynamic processor will raise a FileNotFoundError at runtime. All other checks and services are unaffected.
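The copy step assumes the archive contains exactly one .onnx file. A hedged Python alternative that makes this assumption explicit is sketched below; the function name `install_onnx_model` is hypothetical, and unlike the shell glob it searches the extraction directory recursively.

```python
import shutil
from pathlib import Path


def install_onnx_model(extract_dir, destination):
    """Copy the single .onnx file found under extract_dir to destination."""
    candidates = sorted(Path(extract_dir).rglob("*.onnx"))
    if not candidates:
        raise FileNotFoundError("no .onnx file found under {}".format(extract_dir))
    if len(candidates) > 1:
        # Refuse to guess which model is meant when several are present.
        raise ValueError("expected one .onnx file, found {}".format(len(candidates)))
    destination = Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(str(candidates[0]), str(destination))
    return destination
```

From the repo root, this would be called as `install_onnx_model("/tmp/citysemsegformer", "checks/utils/citysemsegformer.onnx")` after unzipping the archive.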
Build all targets in the repo:
dazel build //...
Run the full test suite:
dazel test //...
dazel is a drop-in replacement for bazel: any bazel command you see in documentation or tutorials works the same way by substituting dazel. For more information on available commands and target syntax, see the Bazel documentation.
This section explains how to configure and run the prebuilt Docker images for each of the checkers in the project.
All services provide a Swagger UI at /docs (e.g., http://localhost:8000/docs) for browsing the API documentation and testing endpoints from a browser.
The Hallucination check requires the NVIDIA Multistorage configuration to be set. Multistorage connects the checker to different storage backends through a single, unified configuration.
To set up the environment variable, first copy the .env.example template from the hallucination check:
cp checks/hallucination/.env.example checks/hallucination/.env
Then update the file with your Multistorage configuration. See the Multistorage Configuration Example section for an example of how this can be done for an S3 bucket, or the Official Multistorage Documentation for the full configuration reference.
When setting the Multistorage environment variable in the .env file, compact the JSON onto a single line to avoid parsing issues. The value should also be wrapped in single quotes (e.g. '{/* your config json here */}').
docker run --env-file=checks/hallucination/.env -p 8080:8080 nvcr.io/nvidia/cosmos/hallucination-checker:1.0.0
Verify that the service is running:
curl http://localhost:8080/health
Note: Before running the hallucination checker, upload your input to a storage provider and add the Multistorage configuration.
curl -X POST "http://localhost:8080/process" \
-H "Content-Type: application/json" \
-d '{
"clip_id": "my_clip",
"original_video_path": "s3://<bucket-name>/path/to/original.mp4",
"augmented_video_path": "s3://<bucket-name>/path/to/augmented.mp4"
}'
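The same request can be built from Python using only the standard library. This is a minimal sketch that mirrors the curl call above; the helper name `build_process_request` is illustrative, and the response is left unparsed because its schema is not described in this section.

```python
import json
import urllib.request


def build_process_request(base_url, clip_id, original_video_path, augmented_video_path):
    """Build a POST request for the hallucination checker's /process endpoint."""
    payload = {
        "clip_id": clip_id,
        "original_video_path": original_video_path,
        "augmented_video_path": augmented_video_path,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen(...)` and read the response body, for example with `base_url="http://localhost:8080"` when the container is running as shown above.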
This check has no prebuilt container. It requires a local build via dazel, a separate ONNX model download from NGC, and an NVIDIA GPU driver >= 570. See the steps below before attempting to run.
Note: The Obstacle check's dynamic processor requires the CitySemsegFormer ONNX model, so it is not available as a prebuilt container. Download the deployable_onnx_v1.0 zip file from the NGC Catalog, extract the .onnx file, and save it as checks/utils/citysemsegformer.onnx.
See the Obstacle Check docs for more details on building and running this container.
The VLM check also uses the StorageProvider framework to download the input video from the URL provided in the /process/preset request. Storage is configured through the following environment variables:
| Variable | Required | Default | Description |
|---|---|---|---|
| COSMOS_EVALUATOR_STORAGE_TYPE | No | s3 | Storage backend: s3 or local |
| COSMOS_EVALUATOR_STORAGE_BUCKET | Yes (S3) | - | S3 bucket name |
| COSMOS_EVALUATOR_STORAGE_REGION | Yes (S3) | - | AWS region (also reads AWS_DEFAULT_REGION) |
| COSMOS_EVALUATOR_STORAGE_ACCESS_KEY | No | - | AWS access key (also reads AWS_ACCESS_KEY_ID) |
| COSMOS_EVALUATOR_STORAGE_SECRET_KEY | No | - | AWS secret key (also reads AWS_SECRET_ACCESS_KEY) |
The storage type determines how the augmented_video_url in the request is resolved. With s3, it can be an s3:// URI, a presigned URL, or any HTTPS URL. With local, it is the mounted filesystem path (e.g. /data/video.mp4).
If you already have AWS credentials in a ~/.aws/.env file, you can pass that file via --env-file in the command below; the storage settings fall back to AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_DEFAULT_REGION.
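One way to read the storage variables and their fallbacks is sketched below. It is an illustration of the documented behavior (default type s3, bucket and region required for S3, AWS_* names used as fallbacks), not the StorageProvider implementation; the function name `resolve_storage_settings` and the returned dictionary keys are assumptions.

```python
def resolve_storage_settings(env):
    """Resolve storage settings from a mapping of environment variables."""

    def first(*names):
        # Return the first non-empty value among the given variable names.
        for name in names:
            value = env.get(name)
            if value:
                return value
        return None

    storage_type = first("COSMOS_EVALUATOR_STORAGE_TYPE") or "s3"
    if storage_type not in ("s3", "local"):
        raise ValueError("unsupported storage type: {}".format(storage_type))

    settings = {
        "type": storage_type,
        "bucket": first("COSMOS_EVALUATOR_STORAGE_BUCKET"),
        "region": first("COSMOS_EVALUATOR_STORAGE_REGION", "AWS_DEFAULT_REGION"),
        "access_key": first("COSMOS_EVALUATOR_STORAGE_ACCESS_KEY", "AWS_ACCESS_KEY_ID"),
        "secret_key": first("COSMOS_EVALUATOR_STORAGE_SECRET_KEY", "AWS_SECRET_ACCESS_KEY"),
    }
    if storage_type == "s3":
        missing = [key for key in ("bucket", "region") if not settings[key]]
        if missing:
            raise ValueError("missing required S3 settings: " + ", ".join(missing))
    return settings
```

Passing `os.environ` checks the current shell; passing a plain dictionary makes it easy to verify an env file before handing it to `docker run`.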
The VLM check requires access to a Vision-Language Model endpoint. See Setting up your LLM and VLM API key for how to generate a key that works for inference.
Add ~/.cosmos_evaluator/.env with your API key environment variable:
mkdir -p ~/.cosmos_evaluator
cat > ~/.cosmos_evaluator/.env << 'EOF'
BUILD_NVIDIA_API_KEY=your_api_key_here
EOF
See the VLM Preset Check guide for details on configuring endpoints and setting up custom VLM providers.
The commands below use the repo sample data (run git lfs pull first). To use your own videos, mount your directory into the container with -v.
docker run --env-file ~/.cosmos_evaluator/.env \
-e COSMOS_EVALUATOR_STORAGE_TYPE=local \
-v "$(pwd)/checks/sample_data/cosmos_public:/data" \
-p 8000:8000 \
nvcr.io/nvidia/cosmos/vlm:1.15.0
Verify that the service is running:
curl http://localhost:8000/health
Send a preset check request:
curl -X POST http://localhost:8000/process/preset \
-H "Content-Type: application/json" \
-d '{
"augmented_video_url": "/data/01ce78ad-9e9a-4df9-95d1-1d50e41a04ce_764657799000_764677799000_0_Morning.30fps.mp4",
"preset_conditions": {
"name": "environment",
"weather": "Clear Sky",
"time_of_day_illumination": "Morning",
"region_geography": "Dense City Center",
"road_surface_conditions": "Dry"
}
}'
The attribute verification check requires the NVIDIA Multistorage configuration to be set, and also requires an API key for the VLM and LLM connections.
To set up the environment variable, first copy the .env.example template from the attribute verification check:
cp checks/attribute_verification/.env.example checks/attribute_verification/.env
Then update the file with your Multistorage configuration and API key.
To set the Multistorage configuration, see the Multistorage Configuration Example section for an example of how this can be done for an S3 bucket, or the Official Multistorage Documentation for the full configuration reference.
When setting the Multistorage environment variable in the .env file, compact the JSON onto a single line to avoid parsing issues. The value should also be wrapped in single quotes (e.g. '{/* your config json here */}').
To set the API key, see Setting up your LLM and VLM API key. The same key is shared between LLM and VLM calls. You can also use an API key for any endpoint with an OpenAI-compatible interface. If an API key is not needed, this variable can be left empty or unset.
The LLM and VLM endpoint and model can be configured in the default config for the services, or overridden in the checker /process request. See LLM and VLM model setup for more information on setting the LLM and VLM endpoints and model.
docker run --env-file=checks/attribute_verification/.env -p 8080:8080 nvcr.io/nvidia/cosmos/attribute-verification-checker:1.0.0
Verify that the service is running:
curl http://localhost:8080/health
Note: Before running the attribute verification checker, upload your input to a storage provider and add the Multistorage configuration.
# Verify whether the augmented video shows a scene with a sunny morning, by checking attributes for weather and time of day
curl -X POST "http://localhost:8080/process" \
-H "Content-Type: application/json" \
-d '{
"clip_id": "my_clip",
"augmented_video_path": "s3://<bucket-name>/path/to/augmented.mp4",
"config": {
"selected_variables": {
"weather": "sunny",
"time_of_day": "morning"
},
"variable_options": {
"weather": [
"sunny",
"cloudy",
"rainy",
"snowy"
],
"time_of_day": [
"morning",
"night"
]
},
"question_generation": {
"llm": {
"endpoint": "https://integrate.api.nvidia.com/v1",
"model": "qwen/qwen3.5-397b-a17b"
}
},
"vlm_verification": {
"vlm": {
"endpoint": "https://integrate.api.nvidia.com/v1",
"model": "qwen/qwen3.5-397b-a17b"
}
}
}
}'
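In the request above, every value under selected_variables also appears in the matching variable_options list. A client-side check for that relationship might look like the sketch below. Treating it as a requirement is an assumption drawn from the example, since the service's own validation rules are not described here, and the function name is illustrative.

```python
def find_invalid_selections(config):
    """Return messages for selected values missing from variable_options."""
    selected = config.get("selected_variables", {})
    options = config.get("variable_options", {})
    problems = []
    for name, value in selected.items():
        if name not in options:
            problems.append("{}: no entry in variable_options".format(name))
        elif value not in options[name]:
            problems.append(
                "{}: {!r} is not one of {}".format(name, value, options[name])
            )
    return problems
```

An empty list means the selections are consistent with the options; otherwise each message names the variable that needs fixing before the request is sent.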
- Visit the Deployment Guide for instructions on building, tagging, pushing, and running containers in production.
- Visit the Customization Guide for instructions on getting started with local development.
To create a Multistorage configuration for an S3 bucket, you will need:
- Bucket name
- AWS access key ID
- AWS secret access key
The Multistorage configuration should be constructed like this:
{
"profiles":{
"<bucket-name>":{
"storage_provider":{
"type":"s3",
"options":{
"base_path":"<bucket-name>"
}
},
"credentials_provider":{
"type":"S3Credentials",
"options":{
"access_key":"<access-key>",
"secret_key":"<secret-key>"
}
}
}
},
"path_mapping":{
"s3://<bucket-name>/":"msc://<bucket-name>/"
}
}
When setting the Multistorage environment variable in the .env file, compact the JSON onto a single line to avoid parsing issues. The value should also be wrapped in single quotes (e.g. '{/* your config json here */}').
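Compacting the JSON by hand is error-prone, so the line can be generated instead. The sketch below builds the S3 configuration shown above and serializes it onto one single-quoted line. The variable name passed to `to_env_line` is a placeholder: use the name given in the checker's .env.example, which is not reproduced in this section. Both function names are illustrative.

```python
import json


def build_s3_multistorage_config(bucket, access_key, secret_key):
    """Build the S3 Multistorage configuration in the shape shown above."""
    return {
        "profiles": {
            bucket: {
                "storage_provider": {"type": "s3", "options": {"base_path": bucket}},
                "credentials_provider": {
                    "type": "S3Credentials",
                    "options": {"access_key": access_key, "secret_key": secret_key},
                },
            }
        },
        "path_mapping": {"s3://{}/".format(bucket): "msc://{}/".format(bucket)},
    }


def to_env_line(var_name, config):
    """Serialize config as compact one-line JSON wrapped in single quotes."""
    compact = json.dumps(config, separators=(",", ":"))
    if "'" in compact:
        # A single quote inside the JSON would end the quoted value early.
        raise ValueError("configuration must not contain single quotes")
    return "{}='{}'".format(var_name, compact)
```

For example, `print(to_env_line("EXAMPLE_VAR", build_s3_multistorage_config("my-bucket", "<access-key>", "<secret-key>")))` emits a line that can be pasted into the .env file after replacing EXAMPLE_VAR with the real variable name.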
All checkers use OpenAI-compatible endpoints for the LLM and VLM. By default, the checkers point to build.nvidia.com for these endpoints. This is a convenient way to get started, but the build.nvidia.com endpoints are not recommended for production use. For production, either download the LLM and VLM NIMs and run them locally, or connect to a stable external endpoint of your choice.
Important: The key must be generated from the specific model's page, not from your NGC account settings. A key from account settings will authenticate to GET /v1/models (health checks pass) but return 403 on inference calls (POST /v1/chat/completions).
To generate a key:
- Go to the model page on build.nvidia.com (e.g. Qwen 3.5-397B)
- Click View Code, then click Generate API Key
- Copy the key (it starts with nvapi-) and set it as BUILD_NVIDIA_API_KEY
Verify the key works for inference (not just model listing):
curl -s -o /dev/null -w "%{http_code}" \
-X POST https://integrate.api.nvidia.com/v1/chat/completions \
-H "Authorization: Bearer $BUILD_NVIDIA_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "qwen/qwen3.5-397b-a17b", "messages": [{"role": "user", "content": "say hello"}], "max_tokens": 10}'
A 200 response confirms the key has inference entitlement. A 403 means the key was generated from account settings; regenerate it from the model page.
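The same verification can be scripted in Python with the standard library. The sketch below mirrors the curl command and maps the two status codes discussed here to a message; other codes are reported as unexpected because this guide does not cover them. The function names are illustrative.

```python
import json
import urllib.error
import urllib.request

INFERENCE_URL = "https://integrate.api.nvidia.com/v1/chat/completions"


def build_inference_probe(api_key, model="qwen/qwen3.5-397b-a17b"):
    """Build the same minimal chat completion request as the curl command."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": "say hello"}],
        "max_tokens": 10,
    }
    return urllib.request.Request(
        INFERENCE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def probe_status(request, timeout=30):
    """Send the request and return the HTTP status code, including error codes."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def explain_status(status):
    """Translate the status code into the guidance given in this section."""
    if status == 200:
        return "key has inference entitlement"
    if status == 403:
        return "key lacks inference entitlement; regenerate it from the model page"
    return "unexpected status {}; not covered by this guide".format(status)
```

A typical call is `explain_status(probe_status(build_inference_probe(os.environ["BUILD_NVIDIA_API_KEY"])))`, after importing `os`.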