Add Cosmos3-Nano LIBERO-10 action-policy SFT recipe, config, eval harness, and doc by fwd4 · Pull Request #61 · NVIDIA/cosmos-framework

fwd4 · 2026-06-26T12:59:55Z

What

Adds the Cosmos3-Nano LIBERO-10 action-policy SFT surface, mirroring the existing DROID counterpart (action_policy_droid_nano + toml + launcher + doc).

Feature (net-new)

- Experiment configs action_policy_libero_nano (libero_10-only) and action_policy_libero_all_nano (equal 4-suite mix): gen + action heads from the public Cosmos3-Nano base.
- Dataset LIBEROLeRobotDataset + get_action_libero_sft_dataset: frame_wise_relative rot6d, quantile_rot, concat_view (third-person + wrist), 20 fps.
  - base_dataset tasks.parquet fallback for community LIBERO layouts.
  - Resample-on-decode-failure guard so one undecodable packed-mp4 frame can't crash a multi-node run (matches i4 behavior).
- Closed-loop eval harness with vectorized sim, batched /predict_batch, single-rank no_dist checkpoint load.
- Structured-prompt serving in the policy server (--format-prompt-as-json), so eval matches the training prompt format; the recipe defaults to it.
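To illustrate the resample-on-decode-failure guard listed above, here is a minimal sketch of the general pattern. It is not the code in this PR: the wrapper class, its parameters, and the FlakyFrames stand-in are all hypothetical names, and the set of exceptions a real video decoder raises is an assumption.

```python
import random
from typing import Any, Sequence, Tuple, Type


class ResampleOnDecodeFailure:
    """Wrap an indexable dataset; if a sample fails to decode, draw another index.

    Hypothetical sketch: the real guard may live inside the dataset class itself.
    """

    def __init__(
        self,
        dataset: Sequence[Any],
        max_retries: int = 10,
        seed: int = 0,
        retry_on: Tuple[Type[BaseException], ...] = (RuntimeError, OSError, ValueError),
    ) -> None:
        self.dataset = dataset
        self.max_retries = max_retries
        self.retry_on = retry_on  # assumed decode-error types
        self.rng = random.Random(seed)

    def __len__(self) -> int:
        return len(self.dataset)

    def __getitem__(self, index: int) -> Any:
        current = index
        last_err = None
        # One initial attempt plus up to max_retries resampled attempts.
        for _ in range(self.max_retries + 1):
            try:
                return self.dataset[current]
            except self.retry_on as err:
                last_err = err
                current = self.rng.randrange(len(self.dataset))
        raise RuntimeError(
            f"no decodable sample after {self.max_retries} retries"
        ) from last_err


class FlakyFrames:
    """Stand-in dataset where index 3 simulates an undecodable frame."""

    def __len__(self) -> int:
        return 8

    def __getitem__(self, i: int) -> dict:
        if i == 3:
            raise RuntimeError("undecodable frame")
        return {"index": i}


guarded = ResampleOnDecodeFailure(FlakyFrames(), max_retries=10, seed=0)
sample = guarded[3]  # silently replaced by a different, decodable sample
```

The retry budget is bounded so that a systematically broken dataset still fails loudly instead of looping forever.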

Recipe + doc — two presets (to match the Cosmos3 LIBERO-10 result)

Both lr 5e-5, warmup 500, cycle 16000, global batch 2048 (HSDP 2x8):

(A) libero_10-only — action_policy_libero_repro.toml + launch_sft_action_policy_libero.sh (max_iter 2000).
(B) libero-all (4-suite equal mix) — action_policy_libero_all_repro.toml + launch_sft_action_policy_libero_all.sh (max_iter 5000; LIBERO_ROOT = LIBERO_LeRobot_v3 parent dir).
docs/action_policy_libero_sft.md documents both.
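As a sanity check on the batch numbers above, the following sketch works out the per-GPU batch implied by a global batch of 2048 on HSDP 2x8 with grad_accum 1. It assumes "2x8" means 2 replicas of an 8-way shard group and that every rank consumes distinct data; BatchLayout and per_gpu_batch_for are hypothetical helpers, not part of the recipe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchLayout:
    """Hypothetical description of an HSDP run's batch geometry."""

    replicate: int      # number of replica groups (assumed first factor of "2x8")
    shard: int          # ranks per shard group (assumed second factor)
    per_gpu_batch: int  # samples per rank per micro-step
    grad_accum: int = 1

    @property
    def world_size(self) -> int:
        return self.replicate * self.shard

    @property
    def global_batch(self) -> int:
        return self.world_size * self.per_gpu_batch * self.grad_accum


def per_gpu_batch_for(global_batch: int, replicate: int, shard: int, grad_accum: int = 1) -> int:
    """Solve for the per-rank batch that yields the requested global batch."""
    denom = replicate * shard * grad_accum
    if global_batch % denom != 0:
        raise ValueError(f"global batch {global_batch} is not divisible by {denom}")
    return global_batch // denom


canonical = BatchLayout(
    replicate=2,
    shard=8,
    per_gpu_batch=per_gpu_batch_for(2048, replicate=2, shard=8),
)
```

Under these assumptions the run spans 16 ranks at 128 samples each per step.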

Notes

Scoped to LIBERO only; broader action-dataloader/model changes are intentionally not included here.
Based on main.

🤖 Generated with Claude Code

…ness, and doc

Mirrors the DROID action-policy counterpart (action_policy_droid_nano + repro toml + launch + doc).

Net-new LIBERO feature:
- experiment config: action_policy_libero_nano
- dataset: LIBEROLeRobotDataset + get_action_libero_sft_dataset (frame_wise_relative rot6d, quantile_rot, concat_view, 20fps); base_dataset tasks.parquet fallback for community LIBERO layouts; resample-on-decode-failure guard (matches i4 behavior)
- closed-loop eval harness (vectorized sim) + batched /predict_batch inference path + single-rank no_dist checkpoint load for the policy server
- canonical recipe action_policy_libero_repro.toml + launch_sft_action_policy_libero.sh (lr 5e-5, warmup 500, cycle 16000, global batch 2048; ~95% libero_10 500-ep eval)
- docs/action_policy_libero_sft.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- action_sft_dataset.py: rebuild as origin/main + libero-only (drop the speedup-era ShardedDROIDLeRobotDataset import that broke config load on a clean main).
- remove dataset_reply_action_server.py (GT-replay debug tool, not part of the recipe).
- drop DROID/LoRA references from libero docstrings/comments/doc/launch.

…on training Reuse ActionPromptJsonFormatter at serve time so checkpoints trained with format_prompt_as_json=True receive the same structured JSON prompt at eval. Adds a --format-prompt-as-json CLI override (None reads the experiment config). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
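The "None reads the experiment config" behavior described in this commit is a tri-state override. A minimal sketch of that pattern with argparse follows; it is an illustration, not the policy server's actual parser. In particular, the --no-format-prompt-as-json negation is a side effect of argparse.BooleanOptionalAction (Python 3.9+) in this sketch and is not confirmed to exist on the real server, and resolve_format_prompt_as_json is a hypothetical helper.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of a tri-state prompt-format flag")
    parser.add_argument(
        "--format-prompt-as-json",
        action=argparse.BooleanOptionalAction,
        default=None,  # None means "not specified on the command line"
        help="Override the prompt format; omit to use the experiment config value.",
    )
    return parser


def resolve_format_prompt_as_json(cli_value: Optional[bool], config_value: bool) -> bool:
    """CLI wins when given explicitly; otherwise fall back to the experiment config."""
    return config_value if cli_value is None else cli_value


parser = build_parser()
unset = parser.parse_args([]).format_prompt_as_json
forced_on = parser.parse_args(["--format-prompt-as-json"]).format_prompt_as_json
```

Keeping None distinct from False is what lets a checkpoint trained with JSON prompts be served correctly without any extra flag.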

Plumb format_prompt_as_json through get_action_libero_sft_dataset to the ActionTransformPipeline, and default the action_policy_libero_nano recipe to structured JSON prompts (set False for plain text). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Equal 1:1:1:1 mix over libero_spatial/object/goal/10 on the public Cosmos3-Nano base with JSON prompts; LIBERO_ROOT is the LIBERO_LeRobot_v3 parent dir. Same recipe as action_policy_libero_nano otherwise. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
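One way to read "equal 1:1:1:1 mix" is that each suite receives the same total sampling probability regardless of how many samples it holds. The sketch below computes per-sample weights under that reading. The suite sizes are made-up numbers for illustration, equal_mix_weights is a hypothetical helper, and the PR may implement the mix differently.

```python
from typing import Dict


def equal_mix_weights(suite_sizes: Dict[str, int]) -> Dict[str, float]:
    """Per-sample weight such that every suite gets total mass 1 / num_suites."""
    if not suite_sizes:
        raise ValueError("need at least one suite")
    if any(size <= 0 for size in suite_sizes.values()):
        raise ValueError("every suite must contain at least one sample")
    num_suites = len(suite_sizes)
    return {name: 1.0 / (num_suites * size) for name, size in suite_sizes.items()}


# Hypothetical sample counts; the real suite sizes are not stated in this PR.
sizes = {
    "libero_spatial": 400,
    "libero_object": 500,
    "libero_goal": 450,
    "libero_10": 380,
}
weights = equal_mix_weights(sizes)

# Total probability mass per suite: per-sample weight times number of samples.
suite_mass = {name: weights[name] * sizes[name] for name in sizes}
```

With these weights a smaller suite is simply revisited more often per sample, which keeps the four suites balanced in expectation.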

… (B) Both lr 5e-5 / warmup 500 / cycle 16000 / gbs 2048 / JSON prompts. Preset A (libero_10-only, max_iter 2000) peaks ~1500. Preset B (libero-all equal 4-suite, max_iter 5000) is coverage-limited but reaches the best libero_10 SR (~95.6%). Adds the libero-all TOML + launcher; doc documents both. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Fixes pre-commit check-shebang-scripts-are-executable. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…policy-sft

# Conflicts:
# cosmos_framework/data/vfm/action/datasets/__init__.py
# cosmos_framework/data/vfm/action/datasets/base_dataset.py
# cosmos_framework/data/vfm/action/normalizer_stats/libero_native_frame_wise_relative_rot6d.json

Merged origin/main (i4 dataset port renamed stats/ -> normalizer_stats/ and rewrote base_dataset). Point libero's normalizer dir + doc at normalizer_stats/; ruff import cleanup; realign doc table. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

fwd4 and others added 9 commits June 26, 2026 20:59

libero(doc): align markdown tables (rumdl-fmt / MD060)

3ecd0a7

libero: trim recipe/doc comments to essentials; HSDP 2x8 ga1 canonical

6b22dd5

Lean the toml/config/launch/doc comments (drop SR numbers and experimental detail), and set the canonical recipe to HSDP 2x8 with grad_accum=1 (global batch 2048) instead of single-node grad_accum=2.

libero: model_loader = origin/main + no_dist only (drop unrelated del…

dd78c68

…etions); EGL setup optional in doc

libero: canonical recipe = HSDP 8x8 (replicate 8, max_samples 32 in l…

5f1847e

…aunch -> gbs 2048)

libero: recipe = minimum HSDP 2x8 (gbs 2048, grad_accum 1); doc/launc…

21d34ca

…h synced

libero: move lower-mem caveat to Heads-up section; drop all-suites me…

82a5a84

…ntion

libero: lint launch headers (drop GPU counts), drop sweep mention

4d351dd

fwd4 force-pushed the haolia/libero-action-policy-sft branch from bd18351 to 4d351dd Compare June 29, 2026 03:18

xlu451 previously approved these changes Jun 30, 2026

fwd4 and others added 5 commits July 1, 2026 08:47

libero: ruff format + import organization

bf8c523

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

fwd4 dismissed xlu451’s stale review via b294123 July 1, 2026 09:39

fwd4 requested review from lfengad, mli0603 and ychao-nvidia July 1, 2026 09:40

libero: mark launch_sft_action_policy_libero_all.sh executable

884bb1c

Fixes pre-commit check-shebang-scripts-are-executable. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

fwd4 requested a review from pengcuo July 2, 2026 06:09

pengcuo previously approved these changes Jul 2, 2026

lfengad previously approved these changes Jul 2, 2026

fwd4 and others added 2 commits July 2, 2026 15:22

fwd4 dismissed stale reviews from lfengad and pengcuo via 8e9f8f3 July 2, 2026 07:23

Dinghow approved these changes Jul 2, 2026

Xuanmeng-Zhang approved these changes Jul 2, 2026

fwd4 merged commit 33ea6a7 into NVIDIA:main Jul 2, 2026
8 checks passed

fwd4 merged 17 commits into NVIDIA:main from fwd4:haolia/libero-action-policy-sft

6 participants
