As an RL enthusiast, I've decided to implement many of the algorithms I found in books, courses, or papers. To me, it is the best way to truly understand them.
In this repo, you will find implementations of many of the best-known RL algorithms.
You will find the lists of algorithms in the subdirectories. I've decided to separate them into 3 classes:
- Value Based Method: Algorithms that try to find the optimal policy by estimating the associated value function $V^*(s)$
- Policy Based Method: Algorithms that directly try to find the optimal policy $\pi^*(a|s)$
- Actor-Critic Method: Algorithms that optimize both the value and the policy functions to find the optimal policy
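To make the value-based idea concrete, here is a minimal sketch of value iteration on a toy problem. It is not taken from the code in this repo: the 3-state MDP, the `next_state` and `reward` tables, the `value_iteration` helper and the discount factor are all made up for illustration. It estimates $V^*(s)$ first and then reads the policy off the values.

```python
import numpy as np

# Hypothetical 3-state, 2-action deterministic MDP (illustration only).
# next_state[s, a] is the state reached by taking action a in state s.
next_state = np.array([[0, 1],
                       [0, 2],
                       [2, 2]])
# reward[s, a] is the immediate reward; only the move 1 -> 2 pays off.
reward = np.array([[0.0, 0.0],
                   [0.0, 1.0],
                   [0.0, 0.0]])
gamma = 0.9  # discount factor (assumed value)


def value_iteration(next_state, reward, gamma, tol=1e-8):
    """Estimate V* by repeatedly applying the Bellman optimality backup."""
    v = np.zeros(next_state.shape[0])
    while True:
        # Q(s, a) = r(s, a) + gamma * V(s') for every state-action pair.
        q = reward + gamma * v[next_state]
        v_new = q.max(axis=1)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new, q
        v = v_new


v_star, q_star = value_iteration(next_state, reward, gamma)
# Value-based: the policy is not learned directly, it is the greedy
# action with respect to the estimated values.
greedy_policy = q_star.argmax(axis=1)
print(v_star, greedy_policy)
```

A policy-based method would skip the value table and adjust the parameters of $\pi(a|s)$ directly, while an actor-critic method would keep both a policy and a value estimate.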
The whole code is in `python3`.
Here are the major libraries I used:
- Environment
  - Gym (https://gym.openai.com/)
- Agent
  - Numpy (https://numpy.org/)
  - PyTorch (https://pytorch.org/)
- Visualization
  - Matplotlib (https://matplotlib.org/)