Replies: 2 comments
Thank you for posting this. I'll move this post to our Discussions section for follow-up, but here are a few notes to consider. The distillation setup itself is almost correct; an exploding behavior loss usually comes from a mismatch between what the student sees and what the teacher sees, or from loss/normalization settings, rather than from the obs group setup alone.

1. Correct obs group setup

Given that your env exposes those groups, the runner cfg for distillation should map them as:

# in the runner cfg (distillation)
obs_groups = {
"policy": ["policy"], # student inputs
"teacher": ["policy", "privileged_info"], # teacher inputs (same as teacher training)
}

That part matches what you already did and is conceptually correct.

2. Other things you must check

Because your behavior loss goes to ~100 and stays high, it is likely due to one or more of the following rather than the obs_groups themselves:
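To make the obs_groups mapping concrete, here is a minimal, self-contained sketch of what that dict expresses: each network input is the concatenation of the listed env observation groups, so the teacher sees everything the student sees plus the privileged group. The `assemble` helper and the example values are illustrative, not Isaac Lab or rsl_rl API.

```python
# Illustrative sketch of the obs_groups mapping used during distillation.
# Names and helper are hypothetical; only the mapping idea matches rsl_rl.

obs_groups = {
    "policy": ["policy"],                      # student inputs
    "teacher": ["policy", "privileged_info"],  # teacher inputs
}

# Fake per-group observations from the env (one flat sample each).
env_obs = {
    "policy": [0.1, 0.2, 0.3],       # e.g. proprioceptive state
    "privileged_info": [0.9, -0.4],  # e.g. contact forces, friction
}

def assemble(group: str) -> list[float]:
    """Concatenate the env obs groups that feed the given network input."""
    out: list[float] = []
    for name in obs_groups[group]:
        out.extend(env_obs[name])
    return out

student_in = assemble("policy")   # 3 values: policy only
teacher_in = assemble("teacher")  # 5 values: policy + privileged_info
```

The key invariant is that the teacher's input layout here must match exactly what the teacher network saw during its own training; any reordering or missing group silently corrupts its actions.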
In my own RL and distillation environments, I have confirmed that normalising the actions eliminates the behaviour-loss overshooting problem. However, I ran into another problem, which I have posted here.
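Why normalising helps can be seen from the loss itself: the behavior loss is essentially a mean-squared error between student and teacher actions, so if actions live on a large physical scale (e.g. raw joint torques), the MSE naturally sits in the hundreds even for a reasonable student. A toy sketch, with made-up torque values and an assumed action limit of 150:

```python
# Toy illustration (not rsl_rl internals): MSE between student and
# teacher actions on raw vs normalised scales.

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

teacher_actions = [50.0, -80.0, 120.0]  # raw torques, illustrative
student_actions = [40.0, -60.0, 100.0]

raw_loss = mse(student_actions, teacher_actions)   # 300.0

# Rescale by the (assumed) action limit before comparing.
limit = 150.0
norm_loss = mse([a / limit for a in student_actions],
                [a / limit for a in teacher_actions])  # 300 / 150**2
```

The same student error drops from 300 to about 0.013 after rescaling, which matches the observation that a "huge" behavior loss can be a units problem rather than a learning failure.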
Hi there,
I have a question about using the distillation cfg.
I have trained my teacher network with the observation group:
Then, in rsl_rl_distillation_cfg.py, I updated the observation group accordingly. However, the behavior loss, as I see on wandb, grows huge (up to 100) and doesn't go down.
What is the correct way to set up Teacher-Student Training in Isaac Lab?