This guide walks through how to add a new Megatron model config to Primus for a real small model on Hugging Face that Primus does not configure yet.
As a concrete example, we will use:
- Hugging Face repo:
TinyLlama/TinyLlama-1.1B-Chat-v1.0 - New Primus model config:
tinyllama_1.1B.yaml
The goal is that you can follow this doc step by step and end up with a config
that works end‑to‑end with primus-cli direct.
For Megatron, there are three layers of YAML config involved:
-
Experiment config (entry point):
# examples/megatron/configs/MI300X/llama3.1_8B-BF16-pretrain.yaml modules: pre_trainer: framework: megatron config: pre_trainer.yaml # module-level trainer config # model to run model: llama3.1_8B.yaml # model config name
-
Module config (trainer-level defaults):
# primus/configs/modules/megatron/pre_trainer.yaml extends: - trainer_base.yaml # ...
-
Model config (architecture + tokenizer):
# primus/configs/models/megatron/llama3.1_8B.yaml extends: - llama3_8B.yaml tokenizer_type: HuggingFaceTokenizer tokenizer_model: meta-llama/Llama-3.1-8B max_position_embeddings: 131072
At runtime:
modules.pre_trainer.model: llama3.1_8B.yamlis resolved asprimus/configs/models/megatron/llama3.1_8B.yaml.extends: - llama3_8B.yamlfurther chains intoprimus/configs/models/megatron/llama3_8B.yaml, which in turn extendsllama3_base.yamland so on.
Your new model config will fit into the same resolution chain.
Because TinyLlama/TinyLlama-1.1B-Chat-v1.0 is not shipped in Primus,
there is no existing Megatron model YAML to extend from. You have two options:
- Option A (recommended): extend from a generic base like
language_model.yamland explicitly set all TinyLlama architecture fields. - Option B: extend from the closest existing model and override fields.
For clarity, this guide uses Option A.
You will need to look up the following values from the TinyLlama config on
Hugging Face (usually in config.json or the model card):
hidden_sizeffn_hidden_sizenum_attention_headsnum_layersnum_query_groups(if applicable)max_position_embeddings
You will copy those values into the new Megatron YAML.
Create a new file (locally) under:
primus/configs/models/megatron/tinyllama_1.1B.yaml
with the following template:
extends:
- language_model.yaml # generic Megatron language model base
tokenizer_type: HuggingFaceTokenizer
tokenizer_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
hidden_size: 2048
ffn_hidden_size: 5632 # intermediate_size in HF config.json
num_attention_heads: 32
num_layers: 22 # num_hidden_layers in HF config.json
num_query_groups: 8 # num_attention_heads / num_key_value_heads = 32 / 4
max_position_embeddings: 2048
position_embedding_type: ropeField meanings:
- extends:
- Re-uses generic language-model defaults.
- If you prefer, you can instead extend from the closest existing model
(e.g.
llama3_8B.yaml) and then override all differing fields.
- tokenizer_type:
- Case where you want to use a Hugging Face tokenizer implementation.
- tokenizer_model:
- Hugging Face model repository to load tokenizer from.
- Example:
TinyLlama/TinyLlama-1.1B-Chat-v1.0or your own fine-tuned fork.
Note: This file is not added to the repo in this PR. It is meant as a local example you can create in your workspace.
Pick an existing Megatron experiment config (or create one) under
examples/megatron/configs/... and change only the model field.
For example, starting from
examples/megatron/configs/MI300X/llama3.1_8B-BF16-pretrain.yaml, you can
create a new experiment config like this:
work_group: ${PRIMUS_TEAM:amd}
user_name: ${PRIMUS_USER:root}
exp_name: ${PRIMUS_EXP_NAME:tinyllama_1.1B-pretrain}
workspace: ${PRIMUS_WORKSPACE:./output}
modules:
pre_trainer:
framework: megatron
config: pre_trainer.yaml
# model to run
model: tinyllama_1.1B.yaml
overrides:
save: null
disable_last_saving: true
stderr_sink_level: DEBUG
mock_data: true
train_iters: 50
micro_batch_size: 2
global_batch_size: 128
seq_length: 2048All other overrides (batch size, seq length, LR, etc.) can stay the same initially; you can tune them later for your new model.
Use primus-cli direct to validate that the config chain resolves correctly
and the training loop can start.
./primus-cli direct -- \
train pretrain \
--config examples/megatron/configs/MI300X/tinyllama_1.1B-pretrain.yamlYou should see in the printed configuration and logs that:
- The framework is
megatron. - The model config being used is
tinyllama_1.1B.yaml. - The tokenizer matches your new file.
- Choose an appropriate base model under
primus/configs/models/megatron/(e.g.language_model.yaml, or a close existing LLaMA-style model). - Create
primus/configs/models/megatron/tinyllama_1.1B.yaml: -extends: - <base_model>.yaml- settokenizer_type,tokenizer_model- copy TinyLlama architecture fields (hidden_size,num_layers, etc.) - Update / create an experiment YAML under
examples/megatron/configs/...to referencemodel: tinyllama_1.1B.yaml. - Run
./primus-cli direct -- train pretrain --config ...to validate the config chain and run a short training job.
Once these steps are done, your new tinyllama_1.1B config behaves like any other
Megatron model in Primus and can be used in experiments, sweeps, and CI.