Add Cosmos3-Nano LIBERO-10 action-policy SFT recipe, config, eval harness, and doc #61
Merged
…ness, and doc

Mirrors the DROID action-policy counterpart (action_policy_droid_nano + repro toml + launch + doc).

Net-new LIBERO feature:
- experiment config: action_policy_libero_nano
- dataset: LIBEROLeRobotDataset + get_action_libero_sft_dataset (frame_wise_relative rot6d, quantile_rot, concat_view, 20fps); base_dataset tasks.parquet fallback for community LIBERO layouts; resample-on-decode-failure guard (matches i4 behavior)
- closed-loop eval harness (vectorized sim) + batched /predict_batch inference path + single-rank no_dist checkpoint load for the policy server
- canonical recipe action_policy_libero_repro.toml + launch_sft_action_policy_libero.sh (lr 5e-5, warmup 500, cycle 16000, global batch 2048; ~95% libero_10 500-ep eval)
- docs/action_policy_libero_sft.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
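For readers unfamiliar with the "frame_wise_relative rot6d" action encoding mentioned above, here is a minimal NumPy sketch. It assumes "rot6d" means the common continuous 6D rotation representation (the first two columns of the rotation matrix) and that "frame-wise relative" means the rotation from one frame to the next expressed in the earlier frame. Both conventions, and all function names, are illustrative assumptions and are not taken from LIBEROLeRobotDataset.

```python
import numpy as np


def matrix_to_rot6d(R: np.ndarray) -> np.ndarray:
    """Flatten the first two columns of a 3x3 rotation matrix into 6 numbers."""
    return np.concatenate([R[:, 0], R[:, 1]])


def rot6d_to_matrix(d6: np.ndarray) -> np.ndarray:
    """Recover a rotation matrix from 6 numbers via Gram-Schmidt."""
    a1, a2 = d6[:3], d6[3:]
    b1 = a1 / np.linalg.norm(a1)
    a2_orth = a2 - np.dot(b1, a2) * b1  # remove the component along b1
    b2 = a2_orth / np.linalg.norm(a2_orth)
    b3 = np.cross(b1, b2)  # third column completes a right-handed frame
    return np.stack([b1, b2, b3], axis=1)


def relative_rotation(R_prev: np.ndarray, R_next: np.ndarray) -> np.ndarray:
    """Rotation from frame t to t+1, expressed in frame t (assumed convention)."""
    return R_prev.T @ R_next


def rot_z(theta: float) -> np.ndarray:
    """Helper for the example: rotation about the z axis by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Encode the relative rotation between two consecutive poses as 6 numbers.
delta = relative_rotation(rot_z(0.1), rot_z(0.4))
action_rot6d = matrix_to_rot6d(delta)
```

The Gram-Schmidt step is what makes the representation tolerant of small prediction errors: any 6 numbers with non-degenerate halves map back to a valid rotation.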
Lean the toml/config/launch/doc comments (drop SR numbers and experimental detail), and set the canonical recipe to HSDP 2x8 with grad_accum=1 (global batch 2048) instead of single-node grad_accum=2.
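The batch-size bookkeeping behind this change can be checked with a few lines of arithmetic. The sketch below assumes "HSDP 2x8" means 16 data-parallel ranks (2 replica groups of 8 shards) and that the earlier single-node setup used 8 ranks; the helper name is hypothetical.

```python
def per_rank_batch(global_batch: int, world_size: int, grad_accum: int) -> int:
    """Per-rank micro-batch size implied by a global batch size."""
    samples_per_unit = world_size * grad_accum
    if global_batch % samples_per_unit != 0:
        raise ValueError("global batch must divide evenly across ranks and accumulation steps")
    return global_batch // samples_per_unit


# Canonical recipe: 2 x 8 ranks, no gradient accumulation (assumed layout).
hsdp_2x8 = per_rank_batch(2048, world_size=2 * 8, grad_accum=1)

# Previous setup: one node of 8 ranks (assumed), two accumulation steps.
single_node = per_rank_batch(2048, world_size=8, grad_accum=2)

print(hsdp_2x8, single_node)
```

Under these assumptions both layouts use the same per-rank micro-batch, so the switch changes wall-clock time per optimizer step rather than the effective batch.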
- action_sft_dataset.py: rebuild as origin/main + libero-only (drop the speedup-era ShardedDROIDLeRobotDataset import that broke config load on a clean main).
- remove dataset_reply_action_server.py (GT-replay debug tool, not part of the recipe).
- drop DROID/LoRA references from libero docstrings/comments/doc/launch.
…etions); EGL setup optional in doc
…aunch -> gbs 2048)
bd18351 to
4d351dd
xlu451
previously approved these changes
Jun 30, 2026
…on training Reuse ActionPromptJsonFormatter at serve time so checkpoints trained with format_prompt_as_json=True receive the same structured JSON prompt at eval. Adds a --format-prompt-as-json CLI override (None reads the experiment config). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
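The "None reads the experiment config" behavior is a tri-state override. A minimal sketch with the standard library follows. Only the flag name `--format-prompt-as-json` comes from the commit message; the `--no-` negation form produced by `argparse.BooleanOptionalAction` and the helper names are assumptions about how such an override could be wired, not the harness's actual code.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # default=None lets us distinguish "flag not given" from an explicit True/False.
    parser.add_argument(
        "--format-prompt-as-json",
        action=argparse.BooleanOptionalAction,
        default=None,
        help="Override the prompt format; omit to follow the experiment config.",
    )
    return parser


def resolve_format_prompt_as_json(cli_value: Optional[bool], config_value: bool) -> bool:
    """An explicit CLI value wins; otherwise fall back to the training config."""
    return config_value if cli_value is None else cli_value


args = build_parser().parse_args([])  # flag omitted
use_json = resolve_format_prompt_as_json(args.format_prompt_as_json, config_value=True)
```

Falling back to the config by default keeps the serve-time prompt format aligned with whatever the checkpoint was trained on, which is the point of the commit.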
Plumb format_prompt_as_json through get_action_libero_sft_dataset to the ActionTransformPipeline, and default the action_policy_libero_nano recipe to structured JSON prompts (set False for plain text). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Equal 1:1:1:1 mix over libero_spatial/object/goal/10 on the public Cosmos3-Nano base with JSON prompts; LIBERO_ROOT is the LIBERO_LeRobot_v3 parent dir. Same recipe as action_policy_libero_nano otherwise. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
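One way to read "equal 1:1:1:1 mix" is that each training sample picks a suite uniformly at random, regardless of how many episodes the suite contains. The sketch below illustrates that reading only. The suite sizes are made-up placeholders and the function is hypothetical; the actual dataset code may implement the ratio differently.

```python
import random
from collections import Counter
from typing import Dict, List, Tuple


def sample_equal_mix(
    suite_sizes: Dict[str, int], num_samples: int, seed: int = 0
) -> List[Tuple[str, int]]:
    """Draw (suite, index) pairs with equal probability per suite."""
    rng = random.Random(seed)
    suites = sorted(suite_sizes)  # fixed order for reproducibility
    samples = []
    for _ in range(num_samples):
        suite = rng.choice(suites)  # 1:1:1:1 over suites
        index = rng.randrange(suite_sizes[suite])  # uniform within the suite
        samples.append((suite, index))
    return samples


# Placeholder sizes, chosen only to show that unequal suites still mix equally.
sizes = {"libero_spatial": 400, "libero_object": 450, "libero_goal": 430, "libero_10": 380}
draws = sample_equal_mix(sizes, num_samples=4000)
counts = Counter(suite for suite, _ in draws)
```

With this scheme a smaller suite is revisited more often per episode than a larger one, which is the usual trade-off of ratio-based mixing versus simple concatenation.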
… (B) Both lr 5e-5 / warmup 500 / cycle 16000 / gbs 2048 / JSON prompts. Preset A (libero_10-only, max_iter 2000) peaks ~1500. Preset B (libero-all equal 4-suite, max_iter 5000) is coverage-limited but reaches the best libero_10 SR (~95.6%). Adds the libero-all TOML + launcher; doc documents both. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fixes pre-commit check-shebang-scripts-are-executable. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
pengcuo
previously approved these changes
Jul 2, 2026
lfengad
previously approved these changes
Jul 2, 2026
…policy-sft

# Conflicts:
# cosmos_framework/data/vfm/action/datasets/__init__.py
# cosmos_framework/data/vfm/action/datasets/base_dataset.py
# cosmos_framework/data/vfm/action/normalizer_stats/libero_native_frame_wise_relative_rot6d.json
Merged origin/main (i4 dataset port renamed stats/ -> normalizer_stats/ and rewrote base_dataset). Point libero's normalizer dir + doc at normalizer_stats/; ruff import cleanup; realign doc table. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Dinghow
approved these changes
Jul 2, 2026
Xuanmeng-Zhang
approved these changes
Jul 2, 2026
What
Adds the Cosmos3-Nano LIBERO-10 action-policy SFT surface, mirroring the existing DROID counterpart (action_policy_droid_nano + toml + launcher + doc).

Feature (net-new)
- action_policy_libero_nano (libero_10-only) and action_policy_libero_all_nano (equal 4-suite mix): gen + action heads from the public Cosmos3-Nano base.
- LIBEROLeRobotDataset + get_action_libero_sft_dataset: frame_wise_relative rot6d, quantile_rot, concat_view (third-person + wrist), 20 fps.
- base_dataset tasks.parquet fallback for community LIBERO layouts.
- …/predict_batch, single-rank no_dist checkpoint load.
- …--format-prompt-as-json), so eval matches the training prompt format; the recipe defaults to it.

Recipe + doc: two presets (to match the Cosmos3 LIBERO-10 result)
Both lr 5e-5, warmup 500, cycle 16000, global batch 2048 (HSDP 2x8):
- action_policy_libero_repro.toml + launch_sft_action_policy_libero.sh (max_iter 2000).
- action_policy_libero_all_repro.toml + launch_sft_action_policy_libero_all.sh (max_iter 5000; LIBERO_ROOT = LIBERO_LeRobot_v3 parent dir).
- docs/action_policy_libero_sft.md documents both.

Notes
- …main.

🤖 Generated with Claude Code