Proximal Policy Optimization (PPO) for CartPole-v1

Overview

This repository contains an implementation of PPO for solving the CartPole-v1 environment using PyTorch. The code is adapted from a tutorial with significant enhancements.
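For readers new to PPO, the core idea is the clipped surrogate objective, which limits how far one update can move the policy away from the one that collected the data. The sketch below is a minimal, dependency-free illustration of that loss, not the code in this repository: the function name `ppo_clip_loss`, its parameters, and the default `clip_eps=0.2` are assumptions chosen for illustration, and a real implementation would compute the same quantity on PyTorch tensors.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Return the negative clipped surrogate objective, averaged over the batch.

    All names and the default clip_eps are illustrative assumptions,
    not values taken from this repository.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    # Negate so that minimizing the loss maximizes the surrogate objective.
    return -total / len(advantages)


if __name__ == "__main__":
    # Ratio is 2.0 with a positive advantage, so the clipped term (1.2) is used.
    print(ppo_clip_loss([math.log(2.0)], [0.0], [1.0]))
```

With a positive advantage the clip caps the gain from raising an action's probability; with a negative advantage the unclipped term is kept whenever it is worse, so the penalty is never softened.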

Key Features

Modern implementation using Gymnasium (successor to OpenAI Gym)
Performance monitoring with TensorBoard
Experiment tracking via Weights & Biases
Optimized hyperparameters for CartPole-v1

Demo

rl-video-step-400.mp4

Some findings

As a fellow neanderthal, I needed to make notes for myself:

Here is a Notion link with some of my findings: Notion Link

Setup

Environment dependencies are managed through Conda. To get started:

conda env create -f environment.yml
