Hi, this looks like a really interesting set of algorithms. I wanted to try some of them out with the SB3-zoo and was hoping for a plug-and-play approach. I wondered whether I could integrate rlexplore using callbacks, so I came up with the following:
Then I include it in my list of callbacks, and it seems to run. However, I'm still poking around without fully understanding what I'm doing (dangerous!), so does the above look correct? If it is correct, maybe it could be added as an example for others.
Second question: did I do this bit right: `time_steps=self.num_timesteps`?
Third question: the sample in the examples directory uses `rollout_buffer`. Is the same approach valid for off-policy algorithms like DQN, switching to `replay_buffer` instead?
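To make the third question concrete, here is a minimal sketch of the distinction I have in mind. It assumes that on-policy SB3 models expose a `rollout_buffer` attribute, that off-policy ones expose a `replay_buffer`, and that the replay buffer is circular with a `pos` pointer to the next write slot. The helper names `select_buffer` and `newest_indices` are hypothetical (they are not part of SB3 or rlexplore), and the models here are plain stand-in objects rather than real algorithms.

```python
from types import SimpleNamespace


def select_buffer(model):
    """Return (attribute_name, buffer) for whichever experience buffer the model has.

    Hypothetical helper: tries `rollout_buffer` first (on-policy), then
    `replay_buffer` (off-policy).
    """
    for name in ("rollout_buffer", "replay_buffer"):
        buffer = getattr(model, name, None)
        if buffer is not None:
            return name, buffer
    raise AttributeError("model exposes neither rollout_buffer nor replay_buffer")


def newest_indices(pos, n_new, buffer_size):
    """Indices of the last `n_new` transitions in a circular buffer.

    Assumes `pos` is the slot the NEXT transition will be written to, so the
    most recent writes sit just before it (wrapping around at 0).
    """
    return [(pos - n_new + i) % buffer_size for i in range(n_new)]


# Stand-in objects, not real SB3 models.
ppo_like = SimpleNamespace(rollout_buffer=object())
dqn_like = SimpleNamespace(replay_buffer=object())

print(select_buffer(ppo_like)[0])  # rollout_buffer
print(select_buffer(dqn_like)[0])  # replay_buffer
print(newest_indices(pos=2, n_new=4, buffer_size=10))  # [8, 9, 0, 1]
```

My worry with the off-policy case is the second helper: a rollout buffer is refilled on every rollout, but a replay buffer keeps old transitions around, so I guess the callback would need to touch only the newly written slots rather than the whole buffer.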