29 changes: 1 addition & 28 deletions examples/gr00t_n1_5/README.md
@@ -28,7 +28,7 @@ cd FlagScale/
pip install ".[cuda-train]" --verbose
```

-Install additional dependencies for downloading models/datasets:
+Install additional dependencies for downloading datasets:

```sh
# For HuggingFace Hub
@@ -38,33 +38,6 @@ pip install huggingface_hub
pip install modelscope
```

-## Download Models
-
-Download the pretrained GR00T N1.5 model using the provided script. Choose either HuggingFace Hub or ModelScope:
-
-**Using HuggingFace Hub:**
-
-```sh
-cd FlagScale/
-python examples/pi0/download.py \
-    --repo_id nvidia/GR00T-N1.5-3B \
-    --output_dir /workspace/models \
-    --source huggingface
-```
-
-**Using ModelScope:**
-
-```sh
-cd FlagScale/
-python examples/pi0/download.py \
-    --repo_id nvidia/GR00T-N1.5-3B \
-    --output_dir /workspace/models \
-    --source modelscope
-```
-
-The model will be downloaded to (example with `/workspace/models`):
-- `/workspace/models/nvidia/GR00T-N1.5-3B`
-
## Training

### Prepare Dataset
1 change: 1 addition & 0 deletions examples/gr00t_n1_5/conf/train.yaml
@@ -26,6 +26,7 @@ experiment:
CUDA_DEVICE_MAX_CONNECTIONS: 1
WANDB_MODE: offline
OTEL_SDK_DISABLED: true
+    HF_ENDPOINT: "https://hf-mirror.com"

action: run

2 changes: 1 addition & 1 deletion examples/gr00t_n1_5/conf/train/gr00t_n1_5.yaml
@@ -17,7 +17,7 @@ system:
model:
model_name: gr00t_n1_5
# Path or HuggingFace model ID for the pretrained GR00T N1.5 model
-  checkpoint_dir: /workspace/models/nvidia/GR00T-N1.5-3B
+  checkpoint_dir: nvidia/GR00T-N1.5-3B

# Fine-tuning control
tune_llm: true
41 changes: 1 addition & 40 deletions examples/pi0/README.md
@@ -30,7 +30,7 @@ pip install ".[cuda]" --verbose
pip install git+https://github.com/huggingface/transformers.git@fix/lerobot_openpi
```

-Install additional dependencies for downloading models/datasets:
+Install additional dependencies for downloading datasets:

```sh
# For HuggingFace Hub
@@ -40,45 +40,6 @@ pip install huggingface_hub
pip install modelscope
```

-## Download Models and Tokenizers
-
-Download models and tokenizers using the provided script. Choose either HuggingFace Hub or ModelScope based on your preference:
-
-**Using HuggingFace Hub:**
-
-```sh
-cd FlagScale/
-python examples/pi0/download.py \
-    --repo_id lerobot/pi0_base \
-    --output_dir /workspace/models \
-    --source huggingface
-
-python examples/pi0/download.py \
-    --repo_id google/paligemma-3b-pt-224 \
-    --output_dir /workspace/models \
-    --source huggingface
-```
-
-**Using ModelScope:**
-
-```sh
-cd FlagScale/
-python examples/pi0/download.py \
-    --repo_id lerobot/pi0_base \
-    --output_dir /workspace/models \
-    --source modelscope
-
-python examples/pi0/download.py \
-    --repo_id google/paligemma-3b-pt-224 \
-    --output_dir /workspace/models \
-    --source modelscope
-```
-
-The models will be downloaded to (example with `/workspace/models`):
-- `/workspace/models/lerobot/pi0_base`
-- `/workspace/models/google/paligemma-3b-pt-224`
-
-
## Training

### Prepare Dataset
1 change: 1 addition & 0 deletions examples/pi0/conf/train.yaml
@@ -26,6 +26,7 @@ experiment:
CUDA_DEVICE_MAX_CONNECTIONS: 1
WANDB_MODE: offline
OTEL_SDK_DISABLED: true
+    FLAGSCALE_USE_MODELSCOPE: true

action: run

4 changes: 2 additions & 2 deletions examples/pi0/conf/train/pi0.yaml
@@ -18,9 +18,9 @@ system:
model:
model_name: pi0
# Path to the pretrained pi0_base model checkpoint
-  checkpoint_dir: /workspace/models/lerobot/pi0_base
+  checkpoint_dir: lerobot/pi0_base
  # Path to paligemma tokenizer
-  tokenizer_path: /workspace/models/google/paligemma-3b-pt-224
+  tokenizer_path: google/paligemma-3b-pt-224
tokenizer_max_length: 48

optimizer:
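Reviewer note: with `checkpoint_dir` and `tokenizer_path` now accepting hub repo IDs as well as local paths, everything hinges on the directory check at the top of `resolve_model_path`. A standalone sketch of that branch (`is_local_path` is a hypothetical helper mirroring the `Path(...).is_dir()` test in `hub_utils.py`):

```python
from pathlib import Path
import tempfile

def is_local_path(model_name_or_path: str) -> bool:
    # Mirrors the first branch of resolve_model_path: an existing
    # directory is returned as-is; anything else is treated as a repo ID.
    return Path(model_name_or_path).is_dir()

with tempfile.TemporaryDirectory() as local_dir:
    print(is_local_path(local_dir))       # True: the directory is used directly
print(is_local_path("lerobot/pi0_base"))  # False (assuming no such relative dir): triggers a hub download
```

One consequence worth flagging: a repo ID that happens to collide with an existing relative directory name would silently resolve locally instead of downloading.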
40 changes: 1 addition & 39 deletions examples/pi0_5/README.md
@@ -30,7 +30,7 @@ pip install ".[cuda]" --verbose
pip install git+https://github.com/huggingface/transformers.git@fix/lerobot_openpi
```

-Install additional dependencies for downloading models/datasets:
+Install additional dependencies for downloading datasets:

```sh
# For HuggingFace Hub
@@ -40,44 +40,6 @@ pip install huggingface_hub
pip install modelscope
```

-## Download Models and Tokenizers
-
-Download models and tokenizers using the provided script. Choose either HuggingFace Hub or ModelScope based on your preference:
-
-**Using HuggingFace Hub:**
-
-```sh
-cd FlagScale/
-python examples/pi0/download.py \
-    --repo_id lerobot/pi05_base \
-    --output_dir /workspace/models \
-    --source huggingface
-
-python examples/pi0/download.py \
-    --repo_id google/paligemma-3b-pt-224 \
-    --output_dir /workspace/models \
-    --source huggingface
-```
-
-**Using ModelScope:**
-
-```sh
-cd FlagScale/
-python examples/pi0/download.py \
-    --repo_id lerobot/pi05_base \
-    --output_dir /workspace/models \
-    --source modelscope
-
-python examples/pi0/download.py \
-    --repo_id google/paligemma-3b-pt-224 \
-    --output_dir /workspace/models \
-    --source modelscope
-```
-
-The models will be downloaded to (example with `/workspace/models`):
-- `/workspace/models/lerobot/pi05_base`
-- `/workspace/models/google/paligemma-3b-pt-224`
-
## Training

### Prepare Dataset
1 change: 1 addition & 0 deletions examples/pi0_5/conf/train.yaml
@@ -26,6 +26,7 @@ experiment:
CUDA_DEVICE_MAX_CONNECTIONS: 1
WANDB_MODE: offline
OTEL_SDK_DISABLED: true
+    FLAGSCALE_USE_MODELSCOPE: true

action: run

4 changes: 2 additions & 2 deletions examples/pi0_5/conf/train/pi0_5.yaml
@@ -18,9 +18,9 @@ system:
model:
model_name: pi0.5
# Path to the pretrained pi05_base model checkpoint
-  checkpoint_dir: /workspace/models/lerobot/pi05_libero_base
+  checkpoint_dir: lerobot/pi05_libero_base
  # Path to paligemma tokenizer
-  tokenizer_path: /workspace/models/google/paligemma-3b-pt-224
+  tokenizer_path: google/paligemma-3b-pt-224
tokenizer_max_length: 200
gradient_checkpointing: true
freeze_vision_encoder: false
30 changes: 1 addition & 29 deletions examples/qwen_gr00t/README.md
@@ -28,7 +28,7 @@ cd FlagScale/
pip install ".[cuda-train]" --verbose
```

-Install additional dependencies for downloading models/datasets:
+Install additional dependencies for downloading datasets:

```sh
# For HuggingFace Hub
@@ -38,34 +38,6 @@ pip install huggingface_hub
pip install modelscope
```

-## Download Models
-
-Download the base VLM model. Qwen-GR00T supports Qwen3-VL and Qwen2.5-VL as the VLM backbone:
-
-**Using HuggingFace Hub:**
-
-```sh
-cd FlagScale/
-python examples/pi0/download.py \
-    --repo_id Qwen/Qwen3-VL-4B-Instruct \
-    --output_dir /workspace/models \
-    --source huggingface
-```
-
-**Using ModelScope:**
-
-```sh
-cd FlagScale/
-python examples/pi0/download.py \
-    --repo_id Qwen/Qwen3-VL-4B-Instruct \
-    --output_dir /workspace/models \
-    --source modelscope
-```
-
-The model will be downloaded to (example with `/workspace/models`):
-- `/workspace/models/Qwen/Qwen3-VL-4B-Instruct`
-
-
## Training

### Prepare Dataset
1 change: 1 addition & 0 deletions examples/qwen_gr00t/conf/train.yaml
@@ -26,6 +26,7 @@ experiment:
CUDA_DEVICE_MAX_CONNECTIONS: 1
WANDB_MODE: offline
OTEL_SDK_DISABLED: true
+    FLAGSCALE_USE_MODELSCOPE: true

action: run

2 changes: 1 addition & 1 deletion examples/qwen_gr00t/conf/train/qwen_gr00t.yaml
@@ -23,7 +23,7 @@ model:
model_name: qwen_gr00t
vlm:
type: qwen3-vl
-    base_vlm: /workspace/models/Qwen/Qwen3-VL-4B-Instruct/
+    base_vlm: Qwen/Qwen3-VL-4B-Instruct
attn_implementation: flash_attention_2
action_model:
# Whether to condition the action model on proprioceptive state (observation.state)
2 changes: 2 additions & 0 deletions flagscale/models/pi0/configuration_pi0.py
@@ -24,6 +24,7 @@

from flagscale.models.configs.types import FeatureType, NormalizationMode, PolicyFeature
from flagscale.models.utils.constants import OBS_IMAGES
+from flagscale.models.utils.hub_utils import resolve_model_path
from flagscale.models.vla.pretrained_config import PreTrainedConfig

DEFAULT_IMAGE_SIZE = 224
@@ -185,6 +186,7 @@ def _from_dict(cls, data: dict[str, Any]) -> "PI0Config":

@classmethod
def from_pretrained(cls, config_dir: str, **kwargs: Any) -> "PI0Config":
+        config_dir = resolve_model_path(config_dir)
config_path = os.path.join(config_dir, "config.json")
if not os.path.exists(config_path):
raise ValueError(f"config.json not found in {config_dir}")
2 changes: 2 additions & 0 deletions flagscale/models/pi05/configuration_pi05.py
@@ -25,6 +25,7 @@
import draccus

from flagscale.models.configs.types import FeatureType, NormalizationMode, PolicyFeature
+from flagscale.models.utils.hub_utils import resolve_model_path
from flagscale.models.vla.pretrained_config import PreTrainedConfig

DEFAULT_IMAGE_SIZE = 224
@@ -170,6 +171,7 @@ def _from_dict(cls, data: dict[str, Any]) -> "PI05Config":

@classmethod
def from_pretrained(cls, config_dir: str, **kwargs: Any) -> "PI05Config":
+        config_dir = resolve_model_path(config_dir)
config_path = os.path.join(config_dir, "config.json")
if not os.path.exists(config_path):
raise ValueError(f"config.json not found in {config_dir}")
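Reviewer note: the `from_pretrained` change keeps the rest of the loader untouched: after `resolve_model_path` returns, the method still expects a `config.json` inside the resolved directory and fails fast otherwise. A self-contained sketch of that contract (function name is illustrative, not the actual class method):

```python
import json
import os
import tempfile

def load_config_dict(config_dir: str) -> dict:
    # After resolution, config_dir is always a local directory;
    # the loader raises immediately if config.json is missing.
    config_path = os.path.join(config_dir, "config.json")
    if not os.path.exists(config_path):
        raise ValueError(f"config.json not found in {config_dir}")
    with open(config_path) as f:
        return json.load(f)

with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "config.json"), "w") as f:
        json.dump({"model_name": "pi0.5"}, f)
    print(load_config_dict(d)["model_name"])  # pi0.5
```

Because the resolve happens before the existence check, a repo that simply lacks `config.json` still fails with the same clear error as before, just after the download.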
88 changes: 88 additions & 0 deletions flagscale/models/utils/hub_utils.py
@@ -0,0 +1,88 @@
import hashlib
import os
import tempfile
from pathlib import Path

import filelock

from flagscale.logger import logger

_lock_dir = tempfile.gettempdir()


def use_modelscope() -> bool:
return os.environ.get("FLAGSCALE_USE_MODELSCOPE", "false").lower() == "true"


# Copied from https://github.com/vllm-project/vllm/blob/1fc69f59bb0838c2ff6efc416dd8875c3e210d04/vllm/model_executor/model_loader/weight_utils.py
def _get_lock(model_name_or_path: str, cache_dir: str | None = None) -> filelock.FileLock:
lock_dir = cache_dir or _lock_dir
os.makedirs(lock_dir, exist_ok=True)
model_name = str(model_name_or_path).replace("/", "-")
hash_name = hashlib.sha256(model_name.encode()).hexdigest()[:16]
# add hash to avoid conflict with old users' lock files
lock_file = os.path.join(lock_dir, f"{hash_name}-{model_name}.lock")
# mode 0o666 is required for the filelock to be shared across users
return filelock.FileLock(lock_file, mode=0o666)


def resolve_model_path(
model_name_or_path: str,
revision: str | None = None,
cache_dir: str | None = None,
allow_patterns: list[str] | str | None = None,
ignore_patterns: list[str] | str | None = None,
) -> str:
"""Resolve a model name or HF/ModelScope repo ID to a local directory path.

If ``model_name_or_path`` is already a local directory, returns it as-is.
Otherwise downloads the model repo and returns the local cache path.

The download backend is selected by the ``FLAGSCALE_USE_MODELSCOPE`` env var:
- ``false`` (default): uses ``huggingface_hub.snapshot_download``
- ``true``: uses ``modelscope.hub.snapshot_download``

When ModelScope is enabled, ``HF_HUB_OFFLINE=1`` is set automatically so that
downstream HuggingFace calls (AutoTokenizer, cached_file, etc.) do not attempt
to reach huggingface.co.
"""
if use_modelscope():
os.environ.setdefault("HF_HUB_OFFLINE", "1")

if Path(model_name_or_path).is_dir():
logger.info(f"Model path is local directory: {model_name_or_path}")
return model_name_or_path

with _get_lock(model_name_or_path, cache_dir):
if use_modelscope():
logger.info(f"Downloading model from ModelScope: {model_name_or_path}")
from modelscope.hub.snapshot_download import snapshot_download

local_path = snapshot_download(
model_id=model_name_or_path,
cache_dir=cache_dir,
revision=revision,
ignore_file_pattern=ignore_patterns,
allow_patterns=allow_patterns,
)
else:
logger.info(f"Downloading model from HuggingFace Hub: {model_name_or_path}")
from huggingface_hub import snapshot_download

local_path = snapshot_download(
model_name_or_path,
repo_type="model",
revision=revision,
cache_dir=cache_dir,
allow_patterns=allow_patterns,
ignore_patterns=ignore_patterns,
)

if not local_path:
raise RuntimeError(
f"Failed to download model '{model_name_or_path}': "
"snapshot_download returned an empty path"
)

logger.info(f"Model resolved to: {local_path}")
return local_path
1 change: 1 addition & 0 deletions flagscale/models/vla/base_policy.py
@@ -86,6 +86,7 @@ def from_config(cls, config: PreTrainedConfig) -> TrainablePolicy:
f"No policy registered for config type '{type_name}'. "
f"Known policies: {list(cls._registry.keys())}"
)
+        config.resolve_pretrained_paths()
return policy_cls(config=config)

def save_pretrained(self, save_directory, *, state_dict=None) -> None: