Merge pull request #22 from mh0797/main

VVeiCao · web-flow · commit 74602a832e30 · 2025-04-13T11:38:27.000+02:00
version2.1.1
diff --git a/README.md b/README.md
@@ -52,13 +52,14 @@
 <p align="right">(<a href="#top">back to top</a>)</p>
 
 ## Changelog <a name="changelog"></a>
+- **`[2025/04/13]`** NAVSIM v2.1.1 release (official devkit version for 2025 warm-up phase)
+  - Updated dataset for the [Hugging Face Warmup leaderboard](https://huggingface.co/spaces/AGC2025/e2e-driving-warmup) with minor fixes
 
-- **`[2025/04/09]`** NAVSIM v2.1 release (official devkit version for 2025 warm-up phase)
-  - Added new dataset for the [Hugging Face Warmup leaderboard](https://huggingface.co/spaces/AGC2025/e2e-driving-warmup) (see [submission](docs/submission.md))
+    - ⚠️ **IMPORTANT**: To submit to the updated leaderboard, you need to re-download the synthetic dataset v2.1.1 (see [download](docs/install.md))
 
-    - ⚠️ **IMPORTANT**: To submit to the updated leaderboard, you need to re-download the synthetic dataset (see [download](docs/install.md))
+- **`[2025/04/08]`** NAVSIM v2.1 release
+  - Added new dataset for the [Hugging Face Warmup leaderboard](https://huggingface.co/spaces/AGC2025/e2e-driving-warmup) (see [submission](docs/submission.md))
   - Introduced support for two-stage reactive traffic agents (see [traffic simulation](docs/metrics.md))
-  - Code refactoring and bug fixes
 - **`[2025/02/28]`** NAVSIM v2.0 release
   - Extends the PDM Score with more metrics and penalties (see [metrics](docs/metrics.md))
   - Adds a new two-stage pseudo closed-loop simulation (see [metrics](docs/metrics.md))
@@ -67,7 +68,6 @@
   - Leaderboard for `navtest` on [Hugging Face](https://huggingface.co/spaces/AGC2024-P/e2e-driving-navsim)
   - Release of baseline checkpoints on [Hugging Face](https://huggingface.co/autonomousvision/navsim_baselines)
   - Updated docs for [submission](docs/submission.md) and [paper](https://arxiv.org/abs/2406.15349)
-  - Code refactoring, formatting, minor fixes
 - **`[2024/04/21]`** NAVSIM v1.0 release (official devkit version for [AGC 2024](https://opendrivelab.com/challenge2024/#end_to_end_driving_at_scale))
   - Parallelization of metric caching / evaluation
   - Adds [Transfuser](https://arxiv.org/abs/2205.15997) baseline (see [agents](docs/agents.md#Baselines))
diff --git a/docs/agents.md b/docs/agents.md
@@ -76,7 +76,7 @@ You can configure the set of sensor modalities to use and how much history you n
 
 **Why LiDAR?** Recent literature on open-loop planning has opted away from LiDAR in favor of using surround-view high-resolution cameras. This has significantly strained the compute requirements for training and testing SoTA planners. We hope that the availability of the LiDAR modality enables more computationally efficient submissions that use fewer (or low-resolution) camera inputs.
 
-**Ego Status.** Besides the sensor data, an agent also receives the ego pose, velocity and acceleration information in local coordinates. Finally, to disambiguate driver intention, we provide a discrete driving command, indicating whether the intended route is towards the left, straight or right direction. Importantly, the driving command in NAVSIM is based solely on the desired route, and does not entangle information regarding obstacles and traffic signs (as was prevalent on prior benchmarks such as nuScenes). Note that the left and right driving commands cover turns, lane changes and sharp curves.
+**Ego Status.** Besides the sensor data, an agent also receives the ego pose, velocity and acceleration information in local coordinates. Finally, to disambiguate driver intention, we provide a discrete driving command, indicating whether the intended route is towards the left, straight or right direction. There is also a fourth command, representing 'unknown', which can be used to filter out data during training. Importantly, the driving command in NAVSIM is based solely on the desired route, and does not entangle information regarding obstacles and traffic signs (as was prevalent on prior benchmarks such as nuScenes). Note that the left and right driving commands cover turns, lane changes and sharp curves.
 
 ## Output
 
diff --git a/docs/metrics.md b/docs/metrics.md
@@ -62,15 +62,14 @@ The new NAVSIM v2 evaluation uses a two-stage aggregation process to approximate
 
 2. **Second Stage Scoring:**
    - In addition to the initial scene, multiple potential follow-up scenes to this initial scene are included in the test set to be evaluated.
-   - The follow-up scenes were pre-computed by rolling out several simulations starting from the initial scene, each with a different 3-second plan.
+   - The follow-up scenes were pre-computed by rolling out several simulations starting from the initial scene, each with a different 4-second plan.
    - Therefore, each of these follow-up scenes starts from a state different to the endpoint of the initial scene, e.g. with a lateral offset or different speed.
    - We evaluate the submitted planner on each follow-up scene over a fixed horizon (4 seconds) using the EPDMS metric.
-   - For these scenes, we compute the EPDMS with [reactive background traffic](traffic_agents.md) instead of log-replay agents.
 
 3. **Weighting and Aggregation:**
    - To emulate the effects of closed-loop simulation, the relevance of each follow-up scene to the overall score depends on how close its starting position is to where the submitted planner actually ended in the first stage.
    - We assign higher weights to follow-up scenes that start closer to the submitted planner's end position.
-   - Finally, we compute a weighted average of all scores second-stage to produce the aggregated metric.
+   - We first compute a weighted aggregation of all second-stage scores. Finally, we aggregate the scores of the first and second stage to produce the aggregated metric.
 
 # Run an evaluation
 To evaluate the PDM score for an agent you can run:
diff --git a/docs/splits.md b/docs/splits.md
@@ -80,8 +80,8 @@ In Navsim-v1.1, the training/test split can bet set with a single config paramet
         <td rowspan="2">Competition</td>
         <td>warmup_two_stage</td>
         <td>Warmup test split to validate submission on hugging face. Available as a filter for test split.</td>
-        <td>-</td>
-        <td>-</td>
+        <td>27M</td>
+        <td>1.2G</td>
         <td>
         train_test_split=warmup_two_stage
         </td>
diff --git a/docs/submission.md b/docs/submission.md
@@ -1,12 +1,12 @@
 # Submitting to the leaderboard
 
-NAVSIM comes with an official leaderboard ([Leaderboard 2024](https://huggingface.co/spaces/AGC2024-P/e2e-driving-navsim), [Leaderboard 2025 Warmup](https://huggingface.co/spaces/AGC2025/e2e-driving-warmup)) on HuggingFace. The leaderboard prevents ambiguity in metric definitions between different projects, as all evaluation is performed on the server with the official evaluation script.
+NAVSIM comes with official leaderboards ([Leaderboard 2024](https://huggingface.co/spaces/AGC2024-P/e2e-driving-navsim), [Leaderboard 2025 Warmup](https://huggingface.co/spaces/AGC2025/e2e-driving-warmup)) on Hugging Face. The leaderboards prevent ambiguity in metric definitions between different projects, as all evaluation is performed on the server with the official evaluation script.
 
-For the [NAVSIM challenge 2025 warmup](https://huggingface.co/spaces/AGC2025/e2e-driving-warmup), we now open the leaderboard with the `warmup_navsafe_two_stage_extended` split. In this guide, we describe how to create a valid submission and what rules apply for the new leaderboard.
+For the [NAVSIM challenge 2025 warmup](https://huggingface.co/spaces/AGC2025/e2e-driving-warmup), we now open the leaderboard with the `warmup_two_stage` split. In this guide, we describe how to create a valid submission.
 
 ### Rules
-
-- **Open-source code and models**:
+- [**General rules (for Leaderboard 2025)**](https://opendrivelab.com/challenge2025/#rule)
+- **Open-source code and models (for Leaderboard 2024)**:
 
   - We will periodically (~every 6 months) be removing all entries on the leaderboard which **do not provide associated open-source training and inference code with the corresponding pre-trained checkpoints**. Even if removed for not having this information, an entry can be resubmitted once the code needed for reproducibility is made publicly available.
   - Code must be provided by setting the `TEAM_NAME` variable of the submission file as `"<a href=Link/to/repository>Method name</a>"`. Note that this can also be edited on the leaderboard for an existing submission, if the repo is created (or updated) after the initial submission.
diff --git a/download/download_warmup_two_stage.sh b/download/download_warmup_two_stage.sh
@@ -1,3 +1,3 @@
-wget https://huggingface.co/datasets/OpenDriveLab/OpenScene/resolve/main/navsim-v2/navsim_v2.1_warmup_two_stage.tar.gz
-tar -xzvf navsim_v2.1_warmup_two_stage.tar.gz
-rm navsim_v2.1_warmup_two_stage.tar.gz
+wget https://huggingface.co/datasets/OpenDriveLab/OpenScene/resolve/main/navsim-v2/navsim_v2.1.1_warmup_two_stage.tar.gz
+tar -xzvf navsim_v2.1.1_warmup_two_stage.tar.gz
+rm navsim_v2.1.1_warmup_two_stage.tar.gz
diff --git a/navsim/planning/script/config/common/train_test_split/scene_filter/warmup_two_stage.yaml b/navsim/planning/script/config/common/train_test_split/scene_filter/warmup_two_stage.yaml
diff --git a/navsim/planning/script/config/common/train_test_split/warmup_two_stage.yaml b/navsim/planning/script/config/common/train_test_split/warmup_two_stage.yaml
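
Note on the `docs/metrics.md` change: the reworded "Weighting and Aggregation" step can be illustrated with a minimal sketch. This is not the devkit's implementation. The function names, the Gaussian distance kernel, the `sigma` parameter, and the choice to combine the two stages by multiplication are all assumptions made for illustration; the docs only state that follow-up scenes starting closer to the planner's end position receive higher weights, and that first- and second-stage scores are then aggregated.

```python
import math
from typing import Sequence


def follow_up_weights(distances: Sequence[float], sigma: float = 1.0) -> list[float]:
    """Return normalized weights that decrease with distance.

    ASSUMPTION: a Gaussian kernel over the distance between each follow-up
    scene's start position and the planner's first-stage end position.
    """
    exponents = [-(d ** 2) / (2.0 * sigma ** 2) for d in distances]
    # Shift by the largest exponent so at least one raw weight equals 1.0,
    # which avoids a zero sum when all distances are large.
    shift = max(exponents)
    raw = [math.exp(e - shift) for e in exponents]
    total = sum(raw)
    return [w / total for w in raw]


def aggregate_two_stage(
    first_stage_score: float,
    second_stage_scores: Sequence[float],
    distances: Sequence[float],
    sigma: float = 1.0,
) -> float:
    """Combine a first-stage score with weighted second-stage scores.

    ASSUMPTION: the stages are combined by multiplication; the docs only say
    that the two stages are aggregated.
    """
    if not distances or len(second_stage_scores) != len(distances):
        raise ValueError("need one distance per second-stage score")
    weights = follow_up_weights(distances, sigma)
    second_stage = sum(w * s for w, s in zip(weights, second_stage_scores))
    return first_stage_score * second_stage
```

With two follow-up scenes at equal distance, each gets half the weight, so the second-stage term is the plain mean of their scores. As one scene moves farther away, its contribution shrinks toward zero.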