autonomousvision
diff --git a/README.md b/README.md
Lines changed: 11 additions & 8 deletions
diff --git a/docs/install.md b/docs/install.md
Lines changed: 8 additions & 4 deletions
diff --git a/docs/splits.md b/docs/splits.md
Lines changed: 6 additions & 6 deletions
diff --git a/docs/submission.md b/docs/submission.md
Lines changed: 12 additions & 9 deletions
diff --git a/docs/traffic_agents.md b/docs/traffic_agents.md
Lines changed: 11 additions & 5 deletions
diff --git a/download/download_warmup_synthetic_scenes.sh b/download/download_warmup_synthetic_scenes.sh
Lines changed: 0 additions & 3 deletions
diff --git a/download/download_warmup_two_stage.sh b/download/download_warmup_two_stage.sh
Lines changed: 3 additions & 0 deletions
diff --git a/navsim/common/dataclasses.py b/navsim/common/dataclasses.py
Lines changed: 6 additions & 0 deletions
@@ -3,14 +3,13 @@
 <p align="center">
   <img src="assets/navsim_transparent.png" width="600">
   <h2 align="center">Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking</h2>
-  <h3 align="center"><a href="https://arxiv.org/abs/2406.15349">Paper</a> | <a href="https://danieldauner.github.io/assets/pdf/Dauner2024NIPS_supplementary.pdf">Supplementary</a> | <a href="https://www.youtube.com/watch?v=Qe76HRmPDe0">Talk</a> | <a href="https://opendrivelab.com/challenge2024/#end_to_end_driving_at_scale">2024 Challenge</a> | <a href="https://huggingface.co/spaces/AGC2024-P/e2e-driving-navsim">Leaderboard v1.1</a> | <a href="https://huggingface.co/spaces/AGC2025/e2e-driving-warmup">Warmup Leaderboard v2.0</a> </h3>
+  <h3 align="center"><a href="https://arxiv.org/abs/2406.15349">Paper</a> | <a href="https://danieldauner.github.io/assets/pdf/Dauner2024NIPS_supplementary.pdf">Supplementary</a> | <a href="https://www.youtube.com/watch?v=Qe76HRmPDe0">Talk</a> | <a href="https://opendrivelab.com/challenge2024/#end_to_end_driving_at_scale">2024 Challenge</a> | <a href="https://huggingface.co/spaces/AGC2024-P/e2e-driving-navsim">Leaderboard v1.1</a> | <a href="https://huggingface.co/spaces/AGC2025/e2e-driving-warmup">Warmup Leaderboard v2.1</a> </h3>
 </p>
 
 </div>
 
 <br/>
 
-
 > [**NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking**](https://arxiv.org/abs/2406.15349)
 >
 > [Daniel Dauner](https://danieldauner.github.io/)<sup>1,2</sup>, [Marcel Hallgarten](https://mh0797.github.io/)<sup>1,5</sup>, [Tianyu Li](https://github.com/sephyli)<sup>3</sup>, [Xinshuo Weng](https://xinshuoweng.com/)<sup>4</sup>, [Zhiyu Huang](https://mczhi.github.io/)<sup>4,6</sup>, [Zetong Yang](https://scholar.google.com/citations?user=oPiZSVYAAAAJ)<sup>3</sup>\
@@ -33,13 +32,13 @@
 </p>
 
 ## Table of Contents
+
 1. [Highlights](#highlight)
 2. [Getting started](#gettingstarted)
 3. [Changelog](#changelog)
 4. [License and citation](#licenseandcitation)
 5. [Other resources](#otherresources)
 
-
 ## Getting started <a name="gettingstarted"></a>
 
 - [Download and installation](docs/install.md)
@@ -52,9 +51,15 @@
 
 <p align="right">(<a href="#top">back to top</a>)</p>
 
-
 ## Changelog <a name="changelog"></a>
-- **`[2025/02/28]`** NAVSIM v2.0 release (official devkit version for 2025 warm-up phase)
+
+- **`[2025/04/09]`** NAVSIM v2.1 release (official devkit version for 2025 warm-up phase)
+  - Added new dataset for the [Hugging Face Warmup leaderboard](https://huggingface.co/spaces/AGC2025/e2e-driving-warmup) (see [submission](docs/submission.md))
+
+    - ⚠️ **IMPORTANT**: To submit to the updated leaderboard, you need to re-download the synthetic dataset (see [download](docs/install.md))
+  - Introduced support for two-stage reactive traffic agents (see [traffic simulation](docs/metrics.md))
+  - Code refactoring and bug fixes
+- **`[2025/02/28]`** NAVSIM v2.0 release
   - Extends the PDM Score with more metrics and penalties (see [metrics](docs/metrics.md))
   - Adds a new two-stage pseudo closed-loop simulation (see [metrics](docs/metrics.md))
   - Adds support for reactive traffic agent policies (see [traffic simulation](docs/metrics.md))
@@ -82,11 +87,10 @@
   - OpenScene-mini sensor blobs and annotation logs
   - Naive `ConstantVelocity` agent
 
-
 <p align="right">(<a href="#top">back to top</a>)</p>
 
-
 ## License and citation <a name="licenseandcitation"></a>
+
 All assets and code in this repository are under the [Apache 2.0 license](./LICENSE) unless specified otherwise. The datasets (including nuPlan and OpenScene) inherit their own distribution licenses. Please consider citing our paper and project if they help your research.
 
 ```BibTeX
@@ -109,7 +113,6 @@ All assets and code in this repository are under the [Apache 2.0 license](./LICE
 
 <p align="right">(<a href="#top">back to top</a>)</p>
 
-
 ## Other resources <a name="otherresources"></a>
 
 - [SLEDGE](https://github.com/autonomousvision/sledge) | [tuPlan garage](https://github.com/autonomousvision/tuplan_garage) | [CARLA garage](https://github.com/autonomousvision/carla_garage) | [Survey on E2EAD](https://github.com/OpenDriveLab/End-to-end-Autonomous-Driving)
 
@@ -34,7 +34,7 @@ You can download the mini, trainval, test, private_test_e2e and warmup_synthetic
 ./download_trainval
 ./download_test
 ./download_private_test_e2e
-./download_warmup_synthetic_scenes
+./download_warmup_two_stage
 ```
 
 Also, the script `./download_navtrain` can be used to download a small portion of the  `trainval` dataset split which is needed for the `navtrain` training split.
@@ -57,12 +57,16 @@ This will download the splits into the download directory. From there, move it t
     |    ├── trainval
     |    ├── private_test_e2e
     |    └── mini
-    └── synthetic_scenes
-         ├── scene_pickles
-         └── sythetic_sensor
+    └── warmup_two_stage
+         ├── openscene_meta_datas
+         ├── sensor_blobs
+         ├── synthetic_scene_pickles
+         └── synthetic_scenes_attributes.csv
 
 ```
 
+⚠️ **IMPORTANT:** If you already downloaded the data for NAVSIM v2.0.1 and used it with the Hugging Face leaderboard, replace the old `synthetic_scenes` folder with the new `warmup_two_stage` folder. NAVSIM v2.1 updates the traffic agent policy, so the old data is no longer compatible.
+
 Set the required environment variables, by adding the following to your `~/.bashrc` file
 Based on the structure above, the environment variables need to be defined as:
 
 
@@ -1,7 +1,7 @@
 # Dataset splits vs. filtered training / test splits
 
 The NAVSIM framework utilizes several dataset splits for standardized training and evaluating agents.
-All of them use the OpenScene dataset that is divided into the dataset splits `mini`,`trainval`,`test`,`private_test_e2e`, which can all be downloaded separately.
+All of them use the OpenScene dataset that is divided into the dataset splits `mini`, `trainval`, `test`, `warmup_two_stage`, `private_test_e2e`, which can all be downloaded separately.
 
 It is possible to run trainings and evaluations directly on these sets (see `Standard` in table below).
 Alternatively, you can run trainings and evaluations on training and validation splits that were filtered for challenging scenarios (see `NAVSIM` in table below), which is the recommended option for producing comparable and competitive results efficiently.
@@ -34,7 +34,7 @@ In Navsim-v1.1, the training/test split can bet set with a single config paramet
         <td>trainval</td>
         <td>Large split for training and validating agents with regular driving recordings. Corresponds to nuPlan and downsampled to 2HZ.</td>
         <td>14GB</td>
-        <td>&gt;2000GB</td>
+        <td>>2000GB</td>
         <td>
         train_test_split=trainval
         </td>
@@ -78,18 +78,18 @@ In Navsim-v1.1, the training/test split can bet set with a single config paramet
     </tr>
     <tr>
         <td rowspan="2">Competition</td>
-        <td>warmup_test_e2e</td>
+        <td>warmup_two_stage</td>
         <td>Warmup test split to validate submission on hugging face. Available as a filter for test split.</td>
         <td>-</td>
         <td>-</td>
         <td>
-        train_test_split=warmup_test_e2e
+        train_test_split=warmup_two_stage
         </td>
     </tr>
     <tr>
         <td>private_test_e2e</td>
         <td>Private test split for the challenge leaderboard on hugging face.</td>
-        <td>&lt;1GB</td>
+        <td><1GB</td>
         <td>25GB</td>
         <td>
         train_test_split=private_test_e2e
@@ -105,7 +105,7 @@ The standard splits `trainval`, `test`, and `mini` are from the OpenScene datase
 
 NAVSIM provides a subset and filter of the `trainval` split, called `navtrain`. The `navtrain` split facilitates a standardized training scheme and requires significantly less sensor data storage than `trainval` (445GB vs. 2100GB). If your agents don't need historical sensor inputs, you can download `navtrain` without history, which requires 300GB of storage. Note that `navtrain` can be downloaded separately via [download_navtrain.sh](https://github.com/autonomousvision/navsim/blob/main/download/download_navtrain.sh) but still requires access to the `trainval` logs. Similarly, the `navtest` split enables a standardized set for testing agents with a provided scene filter. Both `navtrain` and `navtest` are filtered to increase interesting samples in the sets.
 
-For the challenge on Hugging Face, we provide the `warmup_test_e2e` and `private_test_e2e` for the warm-up and challenge track, respectively. Note that `private_test_e2e` requires you to download the data, while `warmup_test_e2e` is a scene filter for the `mini` split.
+For the challenge on Hugging Face, we provide the `warmup_two_stage` and `private_test_e2e` for the warm-up and challenge track, respectively. Note that `private_test_e2e` requires you to download the data, while `warmup_two_stage` is a scene filter for the `test` split.
 
 ## Troubleshooting
 
 
@@ -7,24 +7,27 @@ For the [NAVSIM challenge 2025 warmup](https://huggingface.co/spaces/AGC2025/e2e
 ### Rules
 
 - **Open-source code and models**:
+
   - We will periodically (~every 6 months) be removing all entries on the leaderboard which **do not provide associated open-source training and inference code with the corresponding pre-trained checkpoints**. Even if removed for not having this information, an entry can be resubmitted once the code needed for reproducibility is made publicly available.
   - Code must be provided by setting the `TEAM_NAME` variable of the submission file as `"<a href=Link/to/repository>Method name</a>"`. Note that this can also be edited on the leaderboard for an existing submission, if the repo is created (or updated) after the initial submission.
-- **Multi-seed submissions**:
-  - Driving policies often differ significantly in performance when re-trained with different network initialization seeds.
-  - Therefore, the leaderboard now supports (1) regular single-seed submissions and (2) multi-seed submission, which we **strongly encourage** (with a minimum of 3 training seeds).
-  - The maximum, mean and standard deviations of our evaluation metrics will be displayed for multi-seed submissions.
+
+### Run Score Locally
+
+You can reproduce your test results locally in the [NAVSIM](https://github.com/autonomousvision/navsim/blob/main/docs/install.md) repository, and they should match the results you obtain on Hugging Face. Follow the steps below:
+
+1. **Download the dataset**: see the [installation instructions](install.md).
+2. **Cache the data**: follow the script `scripts/evaluation/run_metric_caching.sh` and set `TRAIN_TEST_SPLIT=warmup_two_stage`.
+3. **Run the evaluation**: follow the script `scripts/evaluation/run_cv_pdm_score_evaluation.sh` with your own model.
+   * If you specified `metric_cache_path` during caching, make sure to use the same path during evaluation.
+   * Set `TRAIN_TEST_SPLIT=warmup_two_stage` to ensure that the score matches the one returned by Hugging Face.
 
 ### Regular warmup submission
 
 To submit to a leaderboard you need to create a pickle file that contains a trajectory for each test scenario. NAVSIM provides a script to create such a pickle file.
 
 Have a look at `run_cv_create_submission_pickle.sh` in the [NAVSIM](https://github.com/autonomousvision/navsim/blob/main/docs/install.md) repository: this file creates the pickle file for the ConstantVelocity agent. You can run it for your own agent by replacing the `agent` override. **Note that you have to set the variables `TEAM_NAME`, `AUTHORS`, `EMAIL`, `INSTITUTION`, and `COUNTRY` in `run_create_submission_pickle.sh` to generate a valid submission file**
 
-You should be able to obtain the same evaluation results as on the server by running the evaluation locally. To do so, use the override `train_test_split=warmup_navsafe_two_stage_extended` when executing the script to run the PDM scoring.
-
-### Multi-seed warmup submission
-
-For a multi-seed submission, you first have to create individual agents, i.e. trained on different seeds. Consequently, you can merge your entries to a single submission file with the `run_merge_submission_pickles.sh` bash script. Please set the override `train_test_split=warmup_navsafe_two_stage_extended ` to ensure all individual entries contain trajectories for the evaluation.
+You should be able to obtain the same evaluation results as on the server by running the evaluation locally.
 
 ## Submission Description
 
 
@@ -5,21 +5,27 @@ NAVSIM v2 introduces support for **reactive traffic agents**, allowing surroundi
 ### Available Traffic Agent Policies
 
 1. **Log-Replay** (Non-Reactive)
-   - Identical to NAVSIM v1, traffic agents strictly follow recorded trajectories without reacting to the ego-vehicle.
 
+   - Identical to NAVSIM v1, traffic agents strictly follow recorded trajectories without reacting to the ego-vehicle.
 2. **Constant-Velocity** (Debugging Only)
+
    - Traffic agents move in a straight line at a fixed velocity, providing a simple baseline for debugging.
+3. **IDM (Intelligent Driver Model)** (Reactive)
 
-3. **IDM (Intelligent Driver Model)**
    - Similar to nuPlan, this model simulates traffic agents with more realistic behavior, adjusting speed and spacing based on road conditions.
    - Pedestrians, static objects, and other non-vehicle agents still follow pre-recorded log data.
 
 ### Selecting a Traffic Agents Policy
-Traffic agent policies can be selected by specifying an override when running the evaluation. For example:
 
-`traffic_agents_policy=navsim_IDM_traffic_agents`
+For single-stage simulation, you can specify the traffic agent policy by providing an override when running the evaluation script `navsim/planning/script/run_pdm_score_one_stage.py`.
+
+An example can be found in the commented section of the script `run_cv_pdm_score_evaluation.sh`. For instance:
+
+`traffic_agents=non_reactive` or `traffic_agents=reactive`
+
+This makes it easy to switch between different traffic agent policies depending on your evaluation requirements.
 
-This allows to easily switch between different policies depending on your evaluation needs.
+In two-stage simulations (e.g., for Hugging Face submissions), reactive traffic agents are used by default.
 
 All available traffic agents policies can be found [here](navsim/planning/script/config/common/traffic_agents_policy.md)
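The single-stage override described in this hunk can be sketched as a small helper that assembles the command line. This is only an illustration: the script path and the two override values come from the text above, while the helper name `build_one_stage_command`, the `python` interpreter name, and the idea that no further overrides are needed are assumptions. A real run will likely require additional overrides (agent, split, paths) that this diff does not show.

```python
import shlex
from typing import List, Sequence

# Override values quoted in docs/traffic_agents.md for single-stage simulation.
KNOWN_TRAFFIC_AGENTS = ("non_reactive", "reactive")

# Script path quoted in docs/traffic_agents.md.
ONE_STAGE_SCRIPT = "navsim/planning/script/run_pdm_score_one_stage.py"


def build_one_stage_command(
    traffic_agents: str, extra_overrides: Sequence[str] = ()
) -> List[str]:
    """Return an argv list for the one-stage evaluation (hypothetical helper)."""
    if traffic_agents not in KNOWN_TRAFFIC_AGENTS:
        raise ValueError(
            f"unknown traffic_agents value {traffic_agents!r}; "
            f"expected one of {KNOWN_TRAFFIC_AGENTS}"
        )
    # "python" is an assumption; use whichever interpreter your environment provides.
    return [
        "python",
        ONE_STAGE_SCRIPT,
        f"traffic_agents={traffic_agents}",
        *extra_overrides,
    ]


# Render the command as a copy-pasteable shell string.
print(shlex.join(build_one_stage_command("reactive")))
```

Validating the value before launching catches a misspelled policy name early instead of after the evaluation has started loading data.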
 
 
@@ -0,0 +1,3 @@
+wget https://huggingface.co/datasets/OpenDriveLab/OpenScene/resolve/main/navsim-v2/navsim_v2.1_warmup_two_stage.tar.gz
+tar -xzvf navsim_v2.1_warmup_two_stage.tar.gz
+rm navsim_v2.1_warmup_two_stage.tar.gz
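After running the download script, it can help to confirm that the extracted data matches the `warmup_two_stage` layout listed in `docs/install.md`. The check below is a minimal sketch: the four entry names are taken from that directory tree, but the helper name `missing_warmup_entries` is hypothetical, and it is an assumption that the archive unpacks to exactly these top-level entries.

```python
from pathlib import Path
from typing import List

# Entries listed under `warmup_two_stage` in docs/install.md.
EXPECTED_ENTRIES = (
    "openscene_meta_datas",
    "sensor_blobs",
    "synthetic_scene_pickles",
    "synthetic_scenes_attributes.csv",
)


def missing_warmup_entries(root: Path) -> List[str]:
    """Return the expected entries that do not exist under `root` (hypothetical helper)."""
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

For example, `missing_warmup_entries(Path("warmup_two_stage"))` returns an empty list when all four entries are present, and the names of the absent ones otherwise. Adjust the path to wherever you moved the folder.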
@@ -633,7 +633,13 @@ class SceneFilter:
     log_names: Optional[List[str]] = None
     tokens: Optional[List[str]] = None
     include_synthetic_scenes: bool = False
+    all_mapping: Optional[Dict[Tuple[str, str], List[Tuple[str, str]]]] = None
     synthetic_scene_tokens: Optional[List[str]] = None
+
+    # for reactive and non_reactive
+    reactive_synthetic_initial_tokens: Optional[List[str]] = None
+    non_reactive_synthetic_initial_tokens: Optional[List[str]] = None
+
     # TODO: expand filter options
 
     def __post_init__(self):
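To illustrate the effect of the new `SceneFilter` fields, here is a standalone sketch that mirrors only the fields visible in this hunk. The class name `SceneFilterSketch` and the token value are hypothetical, the fields above `log_names` in the real class are omitted, and the meaning of `all_mapping` is not stated in the diff, so it is left unset here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class SceneFilterSketch:
    """Stand-in for the fields of SceneFilter shown in the hunk (not the real class)."""

    log_names: Optional[List[str]] = None
    tokens: Optional[List[str]] = None
    include_synthetic_scenes: bool = False
    all_mapping: Optional[Dict[Tuple[str, str], List[Tuple[str, str]]]] = None
    synthetic_scene_tokens: Optional[List[str]] = None

    # for reactive and non_reactive
    reactive_synthetic_initial_tokens: Optional[List[str]] = None
    non_reactive_synthetic_initial_tokens: Optional[List[str]] = None


# All new fields default to None, so keyword-based construction keeps working.
default_filter = SceneFilterSketch()

# "token_a" is a made-up placeholder, not a real scene token.
reactive_filter = SceneFilterSketch(
    include_synthetic_scenes=True,
    reactive_synthetic_initial_tokens=["token_a"],
)
```

Because `all_mapping` is inserted before `synthetic_scene_tokens`, any caller that passes these fields positionally would see its arguments shift by one. Keyword arguments, as used above, avoid that.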