Reproducing Fig. 12 #26

Open
Acedorkz opened this issue Dec 18, 2024 · 3 comments

Comments

@Acedorkz

Hi, I am trying to reproduce the ICM results shown in Fig. 12 and have run into a discrepancy. According to Fig. 12, the ICM reward appears to converge to roughly 10 or a little higher after 1e7 steps. However, when I run the notebook at https://github.com/RLE-Foundation/RLeXplore/blob/main/1%20rlexplore_with_rllte.ipynb with the agent trained on intrinsic rewards only (instead of the sum of intrinsic and extrinsic rewards), I observe a reward of 30 at 5e6 steps.
To train on intrinsic rewards only, I changed 'self.storage.rewards += intrinsic_rewards.to(self.device)' to 'self.storage.rewards = intrinsic_rewards.to(self.device)' at https://github.com/RLE-Foundation/rllte/blob/eeefdedb2ceee3ae1abfe88896cae3b8b62b4c05/rllte/common/prototype/on_policy_agent.py#L168.
This makes me question whether I am reading Fig. 12 correctly. Could you clarify the methodology, or explain how to replicate the ICM results as shown in Fig. 12?
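
For anyone following along, here is a minimal, framework-free sketch of the difference between the two reward settings described above. It uses plain Python lists rather than rllte or PyTorch tensors, and `select_rewards` is a hypothetical helper written only for illustration; it is not part of rllte, and the numbers are made up.

```python
from typing import List


def select_rewards(
    extrinsic: List[float],
    intrinsic: List[float],
    intrinsic_only: bool,
) -> List[float]:
    """Return the per-step rewards the policy would be trained on.

    Hypothetical helper for illustration only; not part of rllte.
    """
    if len(extrinsic) != len(intrinsic):
        raise ValueError("reward sequences must have the same length")
    if intrinsic_only:
        # Analogous to `rewards = intrinsic_rewards`:
        # the extrinsic (environment) signal is discarded.
        return list(intrinsic)
    # Analogous to `rewards += intrinsic_rewards`:
    # the two signals are summed element-wise.
    return [e + i for e, i in zip(extrinsic, intrinsic)]


# Made-up example values, chosen only to show the two modes.
extrinsic = [1.0, 0.0, 2.0]
intrinsic = [0.5, 0.25, 0.125]

combined = select_rewards(extrinsic, intrinsic, intrinsic_only=False)
intrinsic_rewards_only = select_rewards(extrinsic, intrinsic, intrinsic_only=True)
```

Note that this only changes what the policy is trained on. Whether the curve in Fig. 12 plots the intrinsic reward, the extrinsic return, or something else is exactly the open question in this issue.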

Thank you for your time and for making such valuable resources available to the community. I appreciate any insights or suggestions you may offer. Have a nice day.

@yuanmingqi
Collaborator

Share your email with me so I can send the experiment code to you.

@Acedorkz
Author

> Share your email with me so I can send the experiment code to you.

Thanks a lot.
[email protected]

@yuanmingqi
Collaborator

sent via email.
