While studying your Mario PPO code, https://github.com/uvipen/Super-mario-bros-PPO-pytorch/blob/master/train.py, I found it hard to understand the following code:
################################################################################
values = torch.cat(values).detach() # torch.Size([4096])
states = torch.cat(states)
gae = 0
R = []
for value, reward, done in list(zip(values, rewards, dones))[::-1]: # len(list(zip(values, rewards, dones))[::-1]) is 512
    gae = gae * opt.gamma * opt.tau
    gae = gae + reward + opt.gamma * next_value.detach() * (1 - done) - value.detach()
    next_value = value
    R.append(gae + value)
##################################################################################
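If I read the loop correctly, it walks backwards over the rollout and accumulates what looks like the GAE recursion, i.e. for each step t:

delta_t = reward_t + gamma * V(s_{t+1}) * (1 - done_t) - V(s_t)
gae_t = delta_t + gamma * tau * gae_{t+1}
R_t = gae_t + V(s_t)

so my question is only about which entries of "values" actually take part in this recursion.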
Question: with --num_local_steps=512 and --num_processes=8, after "values = torch.cat(values).detach()" the shape of values is torch.Size([4096]). But the list "list(zip(values, rewards, dones))[::-1]" has length 512, which means only the first 512 items of "values" are used in the for loop; the rest are discarded.
So, in every 512 local_steps, only the values of the first 64 (= 512/8) steps are used to calculate GAE and R. Is this a problem, or do I have a misunderstanding? The sketch below shows the shape mismatch I mean.
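To make the concern concrete, here is a minimal, self-contained sketch (this is not your code; the per-step shape [8] for the value output is just my assumption based on --num_processes=8) showing that zipping the flattened 4096-element tensor with the 512-element lists only ever yields 512 pairs:

################################################################################
import torch

num_local_steps, num_processes = 512, 8

# dummy rollout buffers standing in for what train.py collects
values = [torch.randn(num_processes) for _ in range(num_local_steps)]   # 512 tensors of shape [8]
rewards = [torch.randn(num_processes) for _ in range(num_local_steps)]  # list of length 512
dones = [torch.zeros(num_processes) for _ in range(num_local_steps)]    # list of length 512

values = torch.cat(values).detach()
print(values.shape)                                  # torch.Size([4096])
print(len(list(zip(values, rewards, dones))[::-1]))  # 512: zip stops at the shortest input
##################################################################################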
Looking forward to your answer, thanks!