Commit be64f6b

Deprecate examples/megatron-lm in favor of M-LM repo examples folder (#567)
- Deprecate `examples/megatron-lm` and move missing readme sections to https://github.com/NVIDIA/Megatron-LM/tree/main/examples/post_training/modelopt to keep only one copy of this doc.
- Linked PR: NVIDIA/Megatron-LM#2273

Signed-off-by: Keval Morabia <[email protected]>
1 parent 6abded4 commit be64f6b

File tree: 13 files changed, +20 / -445 lines

.github/CODEOWNERS

Lines changed: 0 additions & 1 deletion
@@ -44,7 +44,6 @@ modelopt/torch/utils @NVIDIA/modelopt-torch-utils-codeowners
 /examples/llm_ptq @NVIDIA/modelopt-examples-llm_ptq-codeowners
 /examples/llm_qat @NVIDIA/modelopt-examples-llm_qat-codeowners
 /examples/llm_sparsity @NVIDIA/modelopt-torch-sparsity-codeowners
-/examples/megatron-lm @NVIDIA/modelopt-examples-megatron-codeowners
 /examples/model_hub @NVIDIA/modelopt-examples-model_hub-codeowners
 /examples/nemo_run @NVIDIA/modelopt-examples-megatron-codeowners
 /examples/onnx_ptq @NVIDIA/modelopt-onnx-codeowners

CHANGELOG.rst

Lines changed: 5 additions & 1 deletion
@@ -9,6 +9,10 @@ Model Optimizer Changelog (Linux)
 
 - Add support for PyTorch Geometric quantization.
 
+**Documentation**
+
+- Deprecate ``examples/megatron-lm`` in favor of more detailed documentation in `Megatron-LM/examples/post_training/modelopt <https://github.com/NVIDIA/Megatron-LM/tree/main/examples/post_training/modelopt>`_.
+
 **Misc**
 
 - Bump minimum recommended transformers version to 4.53.

@@ -75,7 +79,7 @@ Model Optimizer Changelog (Linux)
 - Upgrade TensorRT-LLM dependency to 1.1.0rc2.
 - Support Phi-4-multimodal and Qwen2.5-VL quantized HF checkpoint export in ``examples/vlm_ptq``.
 - Support storing and restoring Minitron pruning activations and scores for re-pruning without running the forward loop again.
-- Add Minitron pruning example for Megatron-LM framework. See ``examples/megatron-lm`` for more details.
+- Add Minitron pruning example for Megatron-LM framework. See `Megatron-LM/examples/post_training/modelopt <https://github.com/NVIDIA/Megatron-LM/tree/main/examples/post_training/modelopt>`_ for more details.
 
 0.35 (2025-09-04)
 ^^^^^^^^^^^^^^^^^

examples/llm_distill/README.md

Lines changed: 9 additions & 4 deletions
@@ -13,7 +13,8 @@ This section focuses on demonstrating how to apply Model Optimizer to perform kn
 | Pre-Requisites | Required & optional packages to use this technique | \[[Link](#pre-requisites)\] | |
 | Getting Started | Learn how to optimize your models using distillation to produce more intellegant smaller models | \[[Link](#getting-started)\] | \[[docs](https://nvidia.github.io/TensorRT-Model-Optimizer/guides/4_distillation.html)\] |
 | Support Matrix | View the support matrix to see compatibility and feature availability across different models | \[[Link](#support-matrix)\] | |
-| Distillation with NeMo | Learn how to distill your models with NeMo Framework | \[[Link](#knowledge-distillation-kd-for-nvidia-nemo-models)\] | \[[docs](https://nvidia.github.io/TensorRT-Model-Optimizer/guides/4_distillation.html)\] |
+| Distillation with Megatron-LM | Learn how to distill your models with Megatron-LM Framework | \[[Link](#knowledge-distillation-kd-in-nvidia-megatron-lm-framework)\] | |
+| Distillation with NeMo | Learn how to distill your models with NeMo Framework | \[[Link](#knowledge-distillation-kd-in-nvidia-nemo-framework)\] | \[[docs](https://nvidia.github.io/TensorRT-Model-Optimizer/guides/4_distillation.html)\] |
 | Distillation with Huggingface | Learn how to distill your models with Hugging Face | \[[Link](#knowledge-distillation-kd-for-huggingface-models)\] | \[[docs](https://nvidia.github.io/TensorRT-Model-Optimizer/guides/4_distillation.html)\] |
 | Resources | Extra links to relevant resources | \[[Link](#resources)\] | |
 | NeMo Prune + Distill Simplified Flow | Example script demonstrating end-to-end pruning plus distillation in NeMo | \[[Link](../nemo_run/prune_distill/README.md)\] | |

@@ -25,7 +26,7 @@ This section focuses on demonstrating how to apply Model Optimizer to perform kn
 ### Docker
 
 For Hugging Face models, please use the PyTorch docker image (e.g., `nvcr.io/nvidia/pytorch:25.06-py3`).
-For NeMo models, use the NeMo container (e.g., `nvcr.io/nvidia/nemo:25.07`) which has all the dependencies installed.
+For NeMo models, use the NeMo container (e.g., `nvcr.io/nvidia/nemo:25.09`) which has all the dependencies installed.
 Visit our [installation docs](https://nvidia.github.io/TensorRT-Model-Optimizer/getting_started/2_installation.html) for more information.
 
 Also follow the installation steps below to upgrade to the latest version of Model Optimizer and install example-specific dependencies.

@@ -141,9 +142,13 @@ Loss balancers:
 | Qwen 3 | qwen3 ||
 | Mamba | mamba ||
 
-## Knowledge Distillation (KD) for NVIDIA NeMo Models
+## Knowledge Distillation (KD) in NVIDIA Megatron-LM Framework
 
-Checkout the stand-alone distillation script in the [NVIDIA NeMo repository](https://docs.nvidia.com/nemo-framework/user-guide/latest/model-optimization/distillation/distillation.html).
+Checkout the Knowledge Distillation example in the [Megatron-LM repository](https://github.com/NVIDIA/Megatron-LM/tree/main/examples/post_training/modelopt).
+
+## Knowledge Distillation (KD) in NVIDIA NeMo Framework
+
+Checkout the stand-alone distillation script in the [NeMo documentation](https://docs.nvidia.com/nemo-framework/user-guide/latest/model-optimization/distillation/distillation.html).
 
 You can also look at the NeMo tutorial notebooks [here](https://github.com/NVIDIA-NeMo/NeMo/tree/main/tutorials/llm/qwen/pruning-distillation) which showcase the usage of Minitron pruning followed by distillation for Qwen 3 8B step-by-step in NeMo framework. Hugging Face models can also be converted to NeMo format and used subsequently as shown in the tutorial.

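As background for the knowledge-distillation sections touched by this diff: KD trains a small student model to match a teacher's temperature-softened output distribution. A minimal framework-agnostic sketch in plain Python (illustrative only, not the Model Optimizer or Megatron-LM API; the logits and temperature values are made up):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature flattens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2
    (the classic Hinton-style distillation loss)."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(pt * (math.log(pt) - math.log(ps)) for pt, ps in zip(p_t, p_s))
    return kl * temperature ** 2

# A student that matches the teacher incurs zero loss; a diverging one does not.
matched = kd_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
diverged = kd_loss([-1.0, 0.5, 2.0], [2.0, 0.5, -1.0])
```

In real trainers this term is typically mixed with the ordinary cross-entropy loss on ground-truth labels, which is what the loss balancers mentioned in the README above control.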
examples/llm_ptq/README.md

Lines changed: 2 additions & 2 deletions
@@ -28,7 +28,7 @@ This section focuses on Post-training quantization, a technique that reduces mod
 ### Docker
 
 For Hugging Face models, please use the TensorRT-LLM docker image (e.g., `nvcr.io/nvidia/tensorrt-llm/release:1.1.0rc2.post2`).
-For NeMo models, use the NeMo container (e.g., `nvcr.io/nvidia/nemo:25.07`).
+For NeMo models, use the NeMo container (e.g., `nvcr.io/nvidia/nemo:25.09`).
 Visit our [installation docs](https://nvidia.github.io/TensorRT-Model-Optimizer/getting_started/2_installation.html) for more information.
 
 Also follow the installation steps below to upgrade to the latest version of Model Optimizer and install example-specific dependencies.

@@ -260,7 +260,7 @@ accelerate launch --config_file fsdp2.yaml \
 --calib_size <num_calib_samples> \
 --dataset <dataset> \
 --export_path <export_path> \
---trust_remote_code
+--trust_remote_code
 ```
 
 The exported checkpoint can be deployed using TensorRT-LLM/ vLLM/ SGLang. For more details refer to the [deployment section](#deployment) of this document.

(The second hunk is a whitespace-only change: trailing whitespace was removed from the `--trust_remote_code` line.)
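For background on what the PTQ recipe above produces: the essence of weight quantization is mapping floating-point values onto a small integer grid plus a scale factor. A toy symmetric per-tensor INT8 sketch in plain Python (illustrative only; Model Optimizer additionally handles calibration, per-channel scales, and activation quantization):

```python
def int8_quantize(weights):
    """Symmetric per-tensor INT8: the scale maps the largest magnitude to 127."""
    amax = max(abs(w) for w in weights)
    scale = amax / 127.0 if amax else 1.0
    return [max(-127, min(127, round(w / scale))) for w in weights], scale

def int8_dequantize(qweights, scale):
    """Recover approximate floating-point values from the integer grid."""
    return [q * scale for q in qweights]

q, scale = int8_quantize([0.02, -1.27, 0.5])   # -> [2, -127, 50]
approx = int8_dequantize(q, scale)             # close to the original weights
```

The quantization error is bounded by half the scale per element, which is why calibration of the maximum (here simply `amax`) matters so much in real PTQ flows.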

examples/megatron-lm/ADVANCED.md

Lines changed: 0 additions & 50 deletions
This file was deleted.

examples/megatron-lm/Dockerfile

Lines changed: 0 additions & 20 deletions
This file was deleted.

examples/megatron-lm/README.md

Lines changed: 0 additions & 180 deletions
This file was deleted.

examples/megatron-lm/config/moonshotai/kimi_k2_instruct.sh

Lines changed: 0 additions & 21 deletions
This file was deleted.

examples/megatron-lm/config/moonshotai/kimi_k2_instruct_export.sh

Lines changed: 0 additions & 29 deletions
This file was deleted.

0 commit comments
