Hello, thank you for your outstanding work! I noticed that in the Disagreement paper, the intrinsic reward differs from the environment reward, and the agent is not trained with reinforcement learning. Instead, the policy is optimized directly through gradient backpropagation. Where specifically is this implemented in rlexplore?