Hi,
Do you think the algorithm can be easily adapted to environments with a continuous action space? If so, what changes would be required? One thing I think would need to change is the summation term in the denominator of eq. 11 of the paper (https://arxiv.org/pdf/2102.06483.pdf). Do you have any ideas or pointers on how to do that? Also, would adapting it to continuous action spaces require any modifications beyond that one?
Thank you!