Hi, I am currently working on reproducing the results presented in Fig. 12 for the ICM method and have encountered some challenges. According to Fig. 12, the ICM reward appears to converge to roughly 10 (or slightly above) after 1e7 steps. However, when running the notebook provided at https://github.com/RLE-Foundation/RLeXplore/blob/main/1%20rlexplore_with_rllte.ipynb with intrinsic rewards only (instead of the combined intrinsic and extrinsic rewards), I observed a reward of 30 at 5e6 steps.
Specifically, I changed `self.storage.rewards += intrinsic_rewards.to(self.device)` to `self.storage.rewards = intrinsic_rewards.to(self.device)` at https://github.com/RLE-Foundation/rllte/blob/eeefdedb2ceee3ae1abfe88896cae3b8b62b4c05/rllte/common/prototype/on_policy_agent.py#L168, as sketched below.
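For clarity, here is a minimal sketch of that one-line modification; the surrounding context (class and method names) is omitted and should be taken from the linked commit rather than from this snippet:

```python
# Inside the reward-shaping step of on_policy_agent.py (see the linked
# commit for the exact surrounding code).

# Original behavior: intrinsic rewards are ADDED to the extrinsic rewards
# already stored in the rollout buffer:
# self.storage.rewards += intrinsic_rewards.to(self.device)

# My change: OVERWRITE the stored rewards, so the agent is trained on the
# intrinsic signal alone and the extrinsic rewards are discarded:
self.storage.rewards = intrinsic_rewards.to(self.device)
```

In other words, the only difference is replacing `+=` with `=`, which drops the extrinsic rewards from the optimization target while leaving the rest of the training loop untouched.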
This discrepancy has led me to question whether my understanding of Fig. 12 is correct. Could you kindly clarify the methodology or provide guidance on how to replicate the ICM results as depicted in Fig. 12?
Thank you for your time and for making such valuable resources available to the community. I appreciate any insights or suggestions you may offer. Have a nice day.