torchagents

PyTorch implementations of reinforcement learning (RL) algorithms.

Deep Q-Learning
Double Deep Q-Learning
Deep Deterministic Policy Gradient
Advantage Actor-Critic
Proximal Policy Optimization
Twin Delayed Deep Deterministic Policy Gradient

[dqn] Deep Q-Learning

[Playing Atari with Deep Reinforcement Learning]
Simple implementation of the deep Q-learning agent with experience replay and a target network that is periodically updated to match the value network.
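The three pieces named above (experience replay, a TD target computed from the target network, and periodic synchronisation) can be sketched in plain Python. This is a minimal, dependency-free illustration and not the repository's code: `ReplayBuffer`, `dqn_target`, `should_sync`, and the `sync_every` parameter are hypothetical names, and Q-values are plain lists where the actual agent would presumably use PyTorch tensors.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size FIFO store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.storage = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement from the stored transitions.
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


def dqn_target(reward, done, next_q_target, gamma=0.99):
    """TD target r + gamma * max_a Q_target(s', a); no bootstrap on terminal steps."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def should_sync(step, sync_every):
    """True on the steps where the target network is overwritten with the online weights."""
    return step % sync_every == 0
```

With real networks, the synchronisation step would typically copy the online network's parameters into the target network whenever `should_sync` returns true.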

[ddqn] Double Deep Q-Learning

[Deep Reinforcement Learning with Double Q-learning]
Same as DQN, except that when the TD target is formed, the online network selects the greedy next action and the target network evaluates it.
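The difference from the DQN target is easiest to see side by side. The sketch below assumes the next-state Q-values from both networks are already available as plain lists; `double_q_target` and `max_q_target` are hypothetical helper names, not functions from this repository.

```python
def max_q_target(reward, done, next_q_target, gamma=0.99):
    """DQN-style target: the target network both selects and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_q_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Double DQN target: online network selects, target network evaluates."""
    if done:
        return reward
    # Index of the greedy action according to the online network.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# The target network overestimates action 0; the online network prefers action 1.
next_q_online = [1.0, 2.0]
next_q_target = [5.0, 3.0]
print(max_q_target(1.0, False, next_q_target, gamma=0.5))                    # 3.5
print(double_q_target(1.0, False, next_q_online, next_q_target, gamma=0.5))  # 2.5
```

Decoupling selection from evaluation is what reduces the overestimation that a single `max` over noisy Q-values produces.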

[ddpg] Deep Deterministic Policy Gradient

[Continuous control with deep reinforcement learning]
Implementation of the deep deterministic policy gradient algorithm for continuous action spaces.
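The paper cited above pairs the deterministic actor with slowly tracking target networks. The sketch below shows those two ingredients, the critic target and the soft (Polyak) target update, under stated assumptions: parameters are flat lists of floats, `soft_update` and `ddpg_target` are hypothetical names, and the value of `tau` is illustrative. Whether this repository uses the same update rule or constants is not stated here.

```python
def ddpg_target(reward, done, next_q, gamma=0.99):
    """Critic target r + gamma * Q_target(s', mu_target(s')).

    `next_q` is assumed to be the target critic's value for the action that
    the target actor picks in the next state.
    """
    if done:
        return reward
    return reward + gamma * next_q


def soft_update(target_params, online_params, tau):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [
        tau * w_online + (1.0 - tau) * w_target
        for w_online, w_target in zip(online_params, target_params)
    ]


# Each call moves the target parameters a fraction tau towards the online ones.
target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5)
print(target)  # [0.5, 1.0]
```

A small `tau` makes the targets change slowly, which plays the same stabilising role as the periodic hard copy in DQN.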

[a2c] Advantage Actor-Critic

[Asynchronous Methods for Deep Reinforcement Learning]
Advantage actor-critic with eligibility traces. The value function is trained towards the λ-weighted sum of n-step TD targets.
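The λ-weighted sum of n-step targets does not have to be computed term by term; it satisfies the backward recursion G_t = r_t + γ[(1 − λ)V(s_{t+1}) + λG_{t+1}]. The sketch below uses that forward-view form on plain lists. It is an assumption about how such targets can be computed, not a copy of the repository's code; `lambda_returns` and `bootstrap_value` are hypothetical names.

```python
def lambda_returns(rewards, values, bootstrap_value, gamma=0.99, lam=0.95):
    """Lambda-returns for one trajectory segment of length T.

    rewards[t]      reward received after acting in state s_t
    values[t]       V(s_t) for t = 0..T-1
    bootstrap_value V(s_T), or 0.0 if s_T is terminal
    """
    next_values = list(values[1:]) + [bootstrap_value]
    returns = [0.0] * len(rewards)
    next_return = bootstrap_value
    # Walk backwards so that G_{t+1} is known when G_t is computed.
    for t in reversed(range(len(rewards))):
        returns[t] = rewards[t] + gamma * (
            (1.0 - lam) * next_values[t] + lam * next_return
        )
        next_return = returns[t]
    return returns


rewards, values = [1.0, 1.0], [0.5, 0.5]
print(lambda_returns(rewards, values, 0.0, gamma=1.0, lam=0.0))  # [1.5, 1.0]
print(lambda_returns(rewards, values, 0.0, gamma=1.0, lam=1.0))  # [2.0, 1.0]
```

Setting `lam=0.0` recovers the one-step TD target and `lam=1.0` the full bootstrapped return; the advantage for the policy update is then `returns[t] - values[t]`.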

[ppo] Proximal Policy Optimization

[Proximal Policy Optimization Algorithms]
Implementation of the clipping variant of PPO. Supports weight sharing between the policy and value functions. The value function is trained towards the λ-weighted sum of n-step TD targets. Generalized advantage estimation is used, truncated at the end of an episode or the end of a batch.
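Two parts of this description lend themselves to a short sketch: generalized advantage estimation with truncation at episode and batch boundaries, and the clipped surrogate objective. The code below works on plain lists and single samples. The names `gae`, `clipped_surrogate`, `last_value`, and `clip_eps` are hypothetical, and the default constants are illustrative rather than taken from this repository.

```python
def gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimates for one batch of consecutive steps.

    dones[t] is True if the episode ended after step t, which cuts both the
    bootstrap and the accumulated trace. last_value is V of the state that
    follows the batch, so an unfinished episode is bootstrapped at batch end.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    next_value = last_value
    for t in reversed(range(len(rewards))):
        mask = 0.0 if dones[t] else 1.0
        delta = rewards[t] + gamma * next_value * mask - values[t]
        running = delta + gamma * lam * mask * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def clipped_surrogate(ratio, advantage, clip_eps=0.2):
    """Per-sample PPO objective: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


adv = gae([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], [False, True, False], 2.0,
          gamma=1.0, lam=1.0)
print(adv)  # [2.0, 1.0, 3.0]
```

Adding `values[t]` to each advantage gives the λ-weighted value targets, and the policy loss is the negative mean of `clipped_surrogate` over the batch.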

[td3] Twin Delayed Deep Deterministic Policy Gradient

[Addressing Function Approximation Error in Actor-Critic Methods]
As described in the paper.
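For readers without the paper at hand, its three changes to DDPG are clipped double Q-learning (the minimum of two target critics), target policy smoothing (clipped noise added to the target action), and delayed policy updates. The sketch below illustrates them for a scalar action. All names (`td3_target`, `is_policy_update_step`, the callables passed in) are hypothetical, the default constants are illustrative, and nothing here is taken from the repository's code.

```python
import random


def td3_target(reward, done, next_state, actor_target, critic1_target,
               critic2_target, gamma=0.99, noise_std=0.2, noise_clip=0.5,
               max_action=1.0, rng=random):
    """Critic target with target policy smoothing and clipped double Q-learning."""
    if done:
        return reward
    # Target policy smoothing: perturb the target action with clipped Gaussian noise.
    noise = max(-noise_clip, min(rng.gauss(0.0, noise_std), noise_clip))
    action = max(-max_action, min(actor_target(next_state) + noise, max_action))
    # Clipped double Q-learning: bootstrap from the smaller of the two estimates.
    next_q = min(critic1_target(next_state, action),
                 critic2_target(next_state, action))
    return reward + gamma * next_q


def is_policy_update_step(step, policy_delay=2):
    """Delayed updates: the actor and targets are updated every policy_delay critic steps."""
    return step % policy_delay == 0


# Toy callables standing in for the target networks.
target = td3_target(
    1.0, False, next_state=None,
    actor_target=lambda s: 0.5,
    critic1_target=lambda s, a: 2.0 * a,
    critic2_target=lambda s, a: 4.0,
    gamma=0.5, noise_std=0.0,
)
print(target)  # 1.5
```

Passing `noise_std=0.0` disables the smoothing noise, which makes the toy example deterministic.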


BendeguzToth/torchagents
