Replies: 2 comments
Thank you for posting this. I'll move this post to our Discussions section for follow-up, but here are a few notes to consider. The distillation setup itself is almost correct; an exploding behavior loss usually comes from a mismatch between what the student sees and what the teacher sees, or from loss/normalization settings, rather than from the obs group setup alone.

1. Correct obs group setup

Given that your env exposes those groups, the runner cfg for distillation should map them as:

# in the runner cfg (distillation)
obs_groups = {
"policy": ["policy"], # student inputs
"teacher": ["policy", "privileged_info"], # teacher inputs (same as teacher training)
}

That part matches what you already did and is conceptually correct.

2. Other things you must check

Because your behavior loss goes to ~100 and stays high, it is likely due to one or more of the following rather than the obs_groups themselves:
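To make the obs_groups mapping concrete, here is a minimal, self-contained sketch of what that dict expresses: each network input is the concatenation of the listed env observation groups, so the teacher sees everything the student sees plus the privileged group. The `assemble` helper and the example values are illustrative, not Isaac Lab or rsl_rl API.

```python
# Illustrative sketch of the obs_groups mapping used during distillation.
# Names and helper are hypothetical; only the mapping idea matches rsl_rl.

obs_groups = {
    "policy": ["policy"],                      # student inputs
    "teacher": ["policy", "privileged_info"],  # teacher inputs
}

# Fake per-group observations from the env (one flat sample each).
env_obs = {
    "policy": [0.1, 0.2, 0.3],       # e.g. proprioceptive state
    "privileged_info": [0.9, -0.4],  # e.g. contact forces, friction
}

def assemble(group: str) -> list[float]:
    """Concatenate the env obs groups that feed the given network input."""
    out: list[float] = []
    for name in obs_groups[group]:
        out.extend(env_obs[name])
    return out

student_in = assemble("policy")   # 3 values: policy only
teacher_in = assemble("teacher")  # 5 values: policy + privileged_info
```

The key invariant is that the teacher's input layout here must match exactly what the teacher network saw during its own training; any reordering or missing group silently corrupts its actions.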
In my own RL and distillation environments, I have confirmed that normalising the actions eliminates the behaviour-loss overshooting problem. However, I ran into another problem, which I have posted here.
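Why normalising helps can be seen from the loss itself: the behavior loss is essentially a mean-squared error between student and teacher actions, so if actions live on a large physical scale (e.g. raw joint torques), the MSE naturally sits in the hundreds even for a reasonable student. A toy sketch, with made-up torque values and an assumed action limit of 150:

```python
# Toy illustration (not rsl_rl internals): MSE between student and
# teacher actions on raw vs normalised scales.

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

teacher_actions = [50.0, -80.0, 120.0]  # raw torques, illustrative
student_actions = [40.0, -60.0, 100.0]

raw_loss = mse(student_actions, teacher_actions)   # 300.0

# Rescale by the (assumed) action limit before comparing.
limit = 150.0
norm_loss = mse([a / limit for a in student_actions],
                [a / limit for a in teacher_actions])  # 300 / 150**2
```

The same student error drops from 300 to about 0.013 after rescaling, which matches the observation that a "huge" behavior loss can be a units problem rather than a learning failure.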
Hi there,
I have a question about using the distillation cfg.
I have trained my teacher network with the observation group:
Then, in rsl_rl_distillation_cfg.py, I updated the observation group accordingly. However, the behavior loss, as I see on wandb, grows huge (up to 100) and doesn't go down.
What is the correct way to set up Teacher-Student Training in Isaac Lab?