Hello, I am very interested in this project. The work is almost perfect, but I think one small amendment is still needed.
Please refer to the line in train.py:
if terminal and (len(episode_rewards) % arglist.save_rate == 0):
I think your intention is this: if the current episode has ended and the number of episodes is divisible by arglist.save_rate, then save the trained model and display the information.
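However, terminal should be replaced by (terminal or done), since an episode can also end when the agents are done. With the current condition, an episode that ends that way does not trigger the save, even if the episode count is divisible by arglist.save_rate.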
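
To make the suggestion concrete, here is a minimal sketch of the condition I have in mind. The helper name should_save and its parameters are my own and are not part of train.py; I am assuming that terminal and done are plain booleans and that num_episodes stands in for len(episode_rewards).

```python
def should_save(terminal: bool, done: bool, num_episodes: int, save_rate: int) -> bool:
    """Hypothetical helper: True when the episode has ended and a save is due."""
    # An episode counts as ended if either flag is set.
    episode_ended = terminal or done
    return episode_ended and (num_episodes % save_rate == 0)


def should_save_original(terminal: bool, num_episodes: int, save_rate: int) -> bool:
    """The condition as currently written, which looks only at terminal."""
    return terminal and (num_episodes % save_rate == 0)


if __name__ == "__main__":
    # Episode ended because the agents are done, while terminal is False.
    print(should_save(terminal=False, done=True, num_episodes=1000, save_rate=1000))
    print(should_save_original(terminal=False, num_episodes=1000, save_rate=1000))
```

In train.py itself the change would only be to the if condition; the helper above is just a way to compare the two versions side by side.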