BendeguzToth/torchagents

torchagents

Implementations of RL algorithms using PyTorch.

  • Deep Q-Learning
  • Double Deep Q-Learning
  • Deep Deterministic Policy Gradient
  • Advantage Actor-Critic
  • Proximal Policy Optimization
  • Twin Delayed Deep Deterministic Policy Gradient

[dqn] Deep Q-Learning

[Playing Atari with Deep Reinforcement Learning]
Simple implementation of the deep Q-learning agent with experience replay and a target network that is periodically updated to match the value network.
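
The target computation and the periodic target-network sync described above can be sketched as follows. This is a minimal, dependency-free illustration in plain Python rather than the repository's code; the function names, the dict-of-parameters representation, `gamma`, and `sync_every` are all assumptions made for the example.

```python
def dqn_targets(rewards, next_q_target, dones, gamma=0.99):
    """One-step TD targets: r + gamma * max_a Q_target(s', a).

    next_q_target[i] holds the target network's Q-values for every action
    in the next state of transition i. Terminal transitions do not bootstrap.
    """
    targets = []
    for r, q_next, done in zip(rewards, next_q_target, dones):
        bootstrap = 0.0 if done else gamma * max(q_next)
        targets.append(r + bootstrap)
    return targets


def maybe_sync(step, online_params, target_params, sync_every=1000):
    """Hard update: every `sync_every` steps the target becomes a copy of the online parameters."""
    if step % sync_every == 0:
        return dict(online_params)
    return target_params
```

With PyTorch modules, the hard update is usually written as `target_net.load_state_dict(online_net.state_dict())`.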

[ddqn] Double Deep Q-Learning

[Deep Reinforcement Learning with Double Q-learning]
Same as DQN, except that when computing the TD target, the online network selects the greedy next action and the target network evaluates it.
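
The difference from the DQN target can be sketched in a few lines. In this hypothetical plain-Python helper (the name and arguments are assumptions, not the repository's API), the online network's Q-values pick the next action and the target network's Q-values supply its value.

```python
def double_dqn_targets(rewards, next_q_online, next_q_target, dones, gamma=0.99):
    """Double DQN targets: r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    targets = []
    for r, q_on, q_tg, done in zip(rewards, next_q_online, next_q_target, dones):
        if done:
            # Terminal transition: no bootstrapping.
            targets.append(r)
            continue
        # Action selection uses the online network ...
        best_action = max(range(len(q_on)), key=lambda a: q_on[a])
        # ... while evaluation uses the target network.
        targets.append(r + gamma * q_tg[best_action])
    return targets
```

Decoupling selection from evaluation is what reduces the overestimation that comes from taking a max over noisy target-network values.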

[ddpg] Deep Deterministic Policy Gradient

[Continuous control with deep reinforcement learning]
Implementation of the deep deterministic policy gradient algorithm for continuous action spaces.
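
As a rough illustration of two pieces of the DDPG paper, the sketch below shows the critic's bootstrapped target and the soft (Polyak) target-network update. It is plain Python with scalar states and actions; the function names, the `tau` default, and the callables standing in for networks are assumptions, and the repository may organize this differently.

```python
def ddpg_critic_target(reward, next_state, done, target_actor, target_critic, gamma=0.99):
    """Critic target: r + gamma * Q'(s', mu'(s')), using the target actor and target critic."""
    if done:
        return reward
    next_action = target_actor(next_state)
    return reward + gamma * target_critic(next_state, next_action)


def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return {
        name: tau * online_params[name] + (1.0 - tau) * target_params[name]
        for name in target_params
    }
```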

[a2c] Advantage Actor-Critic

[Asynchronous Methods for Deep Reinforcement Learning]
Advantage actor-critic with eligibility traces. The value function is trained towards the λ-weighted sum of n-step TD targets (the λ-return).
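
One common way to compute the λ-weighted sum of n-step targets is the backward recursion G_t = r_t + γ[(1 − λ)V(s_{t+1}) + λG_{t+1}]. The sketch below is a forward-view, plain-Python illustration under that assumption; `lambda_returns` and its arguments are hypothetical names, not the repository's API.

```python
def lambda_returns(rewards, values, bootstrap_value, gamma=0.99, lam=0.95):
    """Lambda-returns for one trajectory segment.

    values[t] is V(s_t) for t = 0..T-1; bootstrap_value is V(s_T),
    which should be 0.0 if the segment ended in a terminal state.
    """
    next_values = list(values[1:]) + [bootstrap_value]
    returns = [0.0] * len(rewards)
    next_return = bootstrap_value
    for t in reversed(range(len(rewards))):
        # Blend the one-step bootstrap with the longer lambda-return.
        next_return = rewards[t] + gamma * (
            (1.0 - lam) * next_values[t] + lam * next_return
        )
        returns[t] = next_return
    return returns
```

Setting `lam=0` recovers one-step TD targets, and `lam=1` recovers the full discounted return with a bootstrap at the end of the segment. Subtracting `values[t]` from each return gives an advantage estimate for the actor.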

[ppo] Proximal Policy Optimization

[Proximal Policy Optimization Algorithms]
Implementation of the clipping variant of PPO. Supports weight sharing between the policy and value functions. The value function is trained towards the λ-weighted sum of n-step TD targets. Advantages are computed with generalized advantage estimation, truncated at the end of an episode or the end of a batch.
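
The two ingredients named above, generalized advantage estimation and the clipped surrogate objective, can be sketched as follows. This is a plain-Python illustration under stated assumptions: the function names, `last_value` (the value estimate used to bootstrap when a batch ends mid-episode), and the default hyperparameters are hypothetical and not taken from the repository.

```python
import math


def gae_advantages(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """GAE over one batch; the recursion is cut at episode ends and bootstraps at the batch end."""
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value  # V(s_T) for the state after the final step of the batch
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages


def clipped_surrogate(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean of min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A); this quantity is maximized."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Adding `values[t]` back to each advantage yields the λ-weighted value targets, so one backward pass serves both the policy and the value loss.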

[td3] Twin Delayed Deep Deterministic Policy Gradient

[Addressing Function Approximation Error in Actor-Critic Methods]
Implementation of TD3 as described in the paper.
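
For orientation, the paper's critic target combines clipped double Q-learning with target policy smoothing, and the actor and target networks are updated less often than the critics. The sketch below covers only the target computation, for a scalar action in plain Python; the function name, the callables standing in for networks, and the defaults are assumptions for illustration, not the repository's code.

```python
import random


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """TD3 critic target with target policy smoothing and a min over two target critics."""
    if done:
        return reward
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    next_action = target_actor(next_state) + noise
    next_action = max(action_low, min(action_high, next_action))
    # Clipped double Q-learning: bootstrap from the smaller of the two estimates.
    q_min = min(target_q1(next_state, next_action),
                target_q2(next_state, next_action))
    return reward + gamma * q_min
```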
