Proximal Policy Optimization (PPO) for CartPole-v1

Overview

This repository contains a PyTorch implementation of PPO that solves the CartPole-v1 environment. The code is adapted from a tutorial, with the significant enhancements listed under Key Features.
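For readers new to PPO, the core idea is the clipped surrogate objective, which limits how far a single update can move the policy away from the one that collected the data. The snippet below is a minimal sketch of that loss in plain Python, not the code from this repository (which uses PyTorch tensors). The function name `ppo_clip_loss` and the `clip_eps=0.2` default are illustrative assumptions; 0.2 is a commonly used value, not necessarily the one tuned here.

```python
import math


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate loss (to be minimized) over a batch.

    Hypothetical helper for illustration only. clip_eps=0.2 is an assumed
    default, not a value taken from this repository.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    # Negate because optimizers minimize, while PPO maximizes the surrogate.
    return -total / len(advantages)


# Example: the new policy makes a positively rewarded action more likely,
# but the gain is capped once the ratio exceeds 1 + clip_eps.
loss = ppo_clip_loss([0.5], [0.0], [1.0])
```

In a PyTorch version the same arithmetic would typically be written with tensor operations so that gradients flow through `new_log_probs`.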

Key Features

  • Modern implementation using Gymnasium (successor to OpenAI Gym)
  • Performance monitoring with TensorBoard
  • Experiment tracking via Weights & Biases
  • Optimized hyperparameters for CartPole-v1

Demo

rl-video-step-400.mp4

Some findings

As a fellow neanderthal, I needed to make notes for myself:

Here is a Notion page with some of my findings: Notion Link

Setup

Environment dependencies are managed through Conda. To get started:

conda env create -f environment.yml