Vab-jain / rl_llm Public


Playground repo for using Large Language Models for RL


rl_llm

A collection of tutorials and demos for reinforcement learning (RL) with large language models (LLMs).

Contents

Example scripts for training and playing with RL agents and LLMs
Tic-Tac-Toe PPO demos
Llama model learning scripts

Libraries & Technologies Used

transformers (Hugging Face) for LLMs (Llama, TinyLlama)
PyTorch for deep learning
stable-baselines3 and sb3_contrib for RL algorithms (PPO, MaskablePPO)
PettingZoo for multi-agent RL environments (Tic-Tac-Toe, Connect Four)
Gymnasium for RL environment interface
NumPy for array operations
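
As a rough illustration of the environment interface these libraries share, here is a minimal pure-Python sketch that follows the Gymnasium convention (reset returns an observation and an info dict; step returns observation, reward, terminated, truncated, info). It is not taken from this repository: the class TicTacToeSketch, the random opponent, the reward values, and the helper play_random_episode are all hypothetical. The action_masks method name matches what sb3_contrib's MaskablePPO looks for on an environment, but the sketch deliberately imports neither library so that it runs on its own.

```python
import random

# The eight winning lines on a 3x3 board, cells indexed 0..8 row by row.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
]


class TicTacToeSketch:
    """Hypothetical env: the agent plays +1, a uniformly random opponent plays -1."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.board = [0] * 9

    def reset(self, seed=None, options=None):
        # Mirrors the Gymnasium reset signature; returns (observation, info).
        if seed is not None:
            self.rng = random.Random(seed)
        self.board = [0] * 9
        return list(self.board), {}

    def action_masks(self):
        # True marks a legal action (an empty cell).
        return [cell == 0 for cell in self.board]

    def _winner(self):
        for a, b, c in WIN_LINES:
            total = self.board[a] + self.board[b] + self.board[c]
            if total == 3:
                return 1
            if total == -3:
                return -1
        return 0

    def step(self, action):
        # Returns (observation, reward, terminated, truncated, info).
        if self.board[action] != 0:
            raise ValueError(f"cell {action} is already occupied")
        self.board[action] = 1
        if self._winner() == 1:
            return list(self.board), 1.0, True, False, {}
        empty = [i for i, cell in enumerate(self.board) if cell == 0]
        if not empty:  # board full with no winner: draw
            return list(self.board), 0.0, True, False, {}
        self.board[self.rng.choice(empty)] = -1
        if self._winner() == -1:
            return list(self.board), -1.0, True, False, {}
        return list(self.board), 0.0, False, False, {}


def play_random_episode(env, seed=0):
    """Play one episode with a random legal-move policy; return the final reward."""
    rng = random.Random(seed)
    env.reset()
    while True:
        legal = [i for i, ok in enumerate(env.action_masks()) if ok]
        _, reward, terminated, truncated, _ = env.step(rng.choice(legal))
        if terminated or truncated:
            return reward
```

A trained policy would replace the random choice inside play_random_episode; to use such an environment with stable-baselines3 it would additionally need to subclass gymnasium.Env and declare observation and action spaces.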

Algorithms

Proximal Policy Optimization (PPO)
Maskable PPO (PPO with invalid-action masking, for environments where some actions are illegal in a given state)
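
The two ideas behind these algorithms can be sketched in a few lines of plain Python. The first function computes PPO's clipped surrogate objective for a single sample; the second renormalizes a softmax over legal actions only, which is conceptually what invalid-action masking does. This is a simplified illustration under stated assumptions, not the code used in this repository or in stable-baselines3: the function names and the default clip_eps of 0.2 are chosen here for the example.

```python
import math


def clipped_surrogate(ratio, advantage, clip_eps=0.2):
    """PPO clipped objective for one sample.

    ratio is pi_new(a|s) / pi_old(a|s). Taking the minimum of the clipped and
    unclipped terms removes the incentive to move the ratio outside
    [1 - clip_eps, 1 + clip_eps].
    """
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped_ratio * advantage)


def masked_softmax(logits, mask):
    """Softmax over logits in which masked-out (illegal) actions get probability 0."""
    if not any(mask):
        raise ValueError("at least one action must be legal")
    # Subtract the largest legal logit for numerical stability.
    peak = max(l for l, ok in zip(logits, mask) if ok)
    exps = [math.exp(l - peak) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, with a positive advantage and a ratio of 1.5 the objective is capped at 1.2 times the advantage, while with a negative advantage the unclipped (more pessimistic) term is kept.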

Usage

See the tutorials/ directory for example scripts and usage.


Contributors 2

Languages

Python 100.0%