This project is an educational exploration and demonstration of Reinforcement Learning (RL) concepts through a simple but meaningful interactive simulation.
The main objective is to build a flexible Android app in Kotlin/Compose where a virtual agent learns how to throw a bouncing ball into a cup through trial and error, improving its policy over many iterations using different RL algorithms.
Beyond the bouncing ball, this project aims to:
- Provide a hands-on, visual playground for understanding core RL principles
- Compare multiple RL strategies side by side on the same problem
- Show real-time learning dynamics with detailed metrics and visualizations
- Serve as a solid foundation for building RL skills that can later be extended to more complex problems
The field of Reinforcement Learning is both fascinating and challenging, bridging the gap between theory and practical autonomous decision-making.
As an Android developer keen on expanding my skills in machine learning and AI, I wanted a project that:
- Is hands-on and visual, helping me truly grasp RL dynamics
- Covers multiple fundamental RL algorithms to understand their strengths and weaknesses
- Offers real-time feedback on learning progress through intuitive visualizations
- Can serve as a playground to experiment and grow, with potential extensions beyond simple environments
- Is fully implemented in Kotlin/Compose, showcasing modern Android tech with a deep learning twist
This project is my stepping stone towards Reinforcement Learning, combining software craftsmanship with AI exploration.
- Environment: A simple 2D physics simulation with gravity, bounces, obstacles, and a target cup
- Agent: Learns a throwing policy (angle, power) to maximize success
- Rewards: Designed to encourage landing the ball in the cup and penalize misses
- Learning: Agent updates its policy based on feedback from environment interactions
- Visualization: Real-time trajectories, heatmaps, and learning metrics to track progress
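At its core, each episode is one throw: the agent proposes an angle and power, the environment simulates the physics, and the resulting reward drives the next policy update. The sketch below illustrates that loop; the interface names are assumptions for illustration, not the project's final API.

```kotlin
// Minimal sketch of the training loop. Names below are illustrative assumptions.
data class ThrowAction(val angleDeg: Double, val power: Double)
data class Outcome(val reward: Double, val landedInCup: Boolean)

interface BallEnvironment {
    // Runs the physics simulation for one throw until the ball settles.
    fun simulateThrow(action: ThrowAction): Outcome
}

interface LearningAgent {
    // Picks the next throw, e.g. ε-greedy over a discretized (angle, power) grid.
    fun selectAction(): ThrowAction
    // Updates the policy from the observed reward.
    fun learn(action: ThrowAction, outcome: Outcome)
}

fun train(env: BallEnvironment, agent: LearningAgent, episodes: Int) {
    repeat(episodes) {
        val action = agent.selectAction()
        val outcome = env.simulateThrow(action)
        agent.learn(action, outcome)
    }
}
```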
Concept:
Q-Learning is an off-policy, value-based RL method that learns a table of Q-values Q(s, a), representing the expected cumulative reward of taking action a in state s.
The agent updates Q-values using the Bellman equation by observing rewards and next states, learning an optimal policy by greedily selecting actions with the highest Q-values.
How it works:
Q(s, a) = Q(s, a) + α [r + γ * max(Q(s', a')) - Q(s, a)]
where α is the learning rate, γ is the discount factor, r is the reward, and s' is the next state.
Limitations:
Requires discrete, manageable state and action spaces. For continuous domains, discretization or function approximation is needed, which can reduce precision or increase complexity.
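As a concrete illustration, here is a minimal Kotlin sketch of the tabular update above, assuming states and actions have already been discretized to integer indices (class and parameter names are illustrative, not the project's actual code):

```kotlin
import kotlin.random.Random

// Tabular Q-Learning with ε-greedy exploration (illustrative sketch).
class QLearningAgent(
    numStates: Int,
    private val numActions: Int,
    private val alpha: Double = 0.1,   // learning rate α
    private val gamma: Double = 0.99,  // discount factor γ
    private val epsilon: Double = 0.2  // exploration rate ε
) {
    private val q = Array(numStates) { DoubleArray(numActions) }

    fun selectAction(state: Int): Int =
        if (Random.nextDouble() < epsilon) Random.nextInt(numActions)
        else q[state].indices.maxByOrNull { q[state][it] }!!

    // Q(s, a) ← Q(s, a) + α [r + γ * max(Q(s', a')) - Q(s, a)]
    fun update(s: Int, a: Int, r: Double, sNext: Int?) {
        val nextMax = sNext?.let { q[it].maxOrNull() } ?: 0.0  // 0 at episode end
        q[s][a] += alpha * (r + gamma * nextMax - q[s][a])
    }
}
```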
Concept:
SARSA is an on-policy, value-based RL method similar to Q-Learning, but updates are based on the actual next action taken by the current policy, not the max action.
This makes it more conservative and often safer in some environments.
Update rule:
Q(s, a) = Q(s, a) + α [r + γ * Q(s', a') - Q(s, a)]
where a' is the action taken in state s'.
Limitations:
Same as Q-Learning: requires discrete states and actions, and convergence depends on the exploration policy.
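The SARSA target differs from Q-Learning only in using the action a' that the policy actually chose in s' rather than the greedy max. A minimal sketch with the same Q-table layout as above (names illustrative):

```kotlin
// SARSA update: the target uses Q(s', a') for the action actually taken,
// instead of the greedy max used by Q-Learning (illustrative sketch).
fun sarsaUpdate(
    q: Array<DoubleArray>,
    s: Int, a: Int, r: Double,
    sNext: Int?, aNext: Int?,            // both null when the episode ends
    alpha: Double = 0.1, gamma: Double = 0.99
) {
    val nextQ = if (sNext != null && aNext != null) q[sNext][aNext] else 0.0
    q[s][a] += alpha * (r + gamma * nextQ - q[s][a])
}
```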
| Parameter | Default Value | Description |
|---|---|---|
| Gravity (g) | 9.8 px/s² | Constant vertical acceleration |
| Bounce coefficient (e) | 0.7 | Fraction of velocity conserved after a bounce (0 < e < 1) |
| Friction (optional) | 0.02 | Horizontal speed decay over time |
| Delta time | ~16 ms (60 FPS) | Physics update interval |
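To make the parameters concrete, a single physics step could look roughly like this, using screen coordinates with y increasing downward (function and field names are illustrative, not the actual engine code):

```kotlin
// One simulation step using the defaults from the table above (illustrative sketch).
data class Ball(var x: Double, var y: Double, var vx: Double, var vy: Double)

fun stepPhysics(
    ball: Ball,
    floorY: Double,
    g: Double = 9.8,         // gravity in px/s²
    bounce: Double = 0.7,    // fraction of vertical speed kept after a bounce
    friction: Double = 0.02, // horizontal speed decay per step (optional)
    dt: Double = 0.016       // ~16 ms per frame at 60 FPS
) {
    ball.vy += g * dt                       // gravity accelerates the ball toward the floor
    ball.x += ball.vx * dt
    ball.y += ball.vy * dt
    ball.vx *= (1 - friction)               // slow horizontal motion over time
    if (ball.y >= floorY && ball.vy > 0) {  // floor collision: reflect and damp
        ball.y = floorY
        ball.vy = -ball.vy * bounce
    }
}
```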
- Ball: 2D position, velocity, and acceleration vectors
- Obstacles: Static rectangular shapes with collision detection
- Cup (target): Fixed zone that defines success if ball lands inside
- Field boundaries: Walls and floor with bounce logic or reset conditions
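Obstacles and the cup zone can both be modeled as axis-aligned rectangles; a sketch of the success and collision checks (type and function names are assumptions):

```kotlin
// Axis-aligned rectangle used for the cup zone, obstacles, and field boundaries.
data class Zone(val left: Double, val top: Double, val right: Double, val bottom: Double) {
    fun contains(x: Double, y: Double) = x in left..right && y in top..bottom
}

// Success condition: the ball comes to rest inside the cup's opening zone.
fun landedInCup(ballX: Double, ballY: Double, cup: Zone) = cup.contains(ballX, ballY)

// Simple collision test against a static rectangular obstacle.
fun hitsObstacle(ballX: Double, ballY: Double, obstacle: Zone) = obstacle.contains(ballX, ballY)
```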
- Live ball trajectory path with fading trail
- Throw history visualization with color-coded success/failure (ghost throws)
- Impact heatmap showing frequently landed zones
- Reward over time graph showing learning progress
- Iteration/episode count
- Running average and variance of recent rewards (see the metrics sketch after this list)
- Exploration rate (epsilon) for ε-greedy policies
- Policy visualization (action probability distributions) for policy gradient and actor-critic methods
- Q-value heatmaps or tables for discrete methods
- Controls: start, pause, reset simulation
- Strategy selector dropdown
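For example, the running average, variance, and ε decay listed above can be tracked with a small helper like this (hypothetical helper, not a specific charting API):

```kotlin
// Sliding-window reward statistics for the metrics panel (illustrative sketch).
class RewardStats(private val window: Int = 100) {
    private val recent = ArrayDeque<Double>()

    fun record(reward: Double) {
        recent.addLast(reward)
        if (recent.size > window) recent.removeFirst()
    }

    val average: Double
        get() = if (recent.isEmpty()) 0.0 else recent.average()

    val variance: Double
        get() {
            if (recent.isEmpty()) return 0.0
            val mean = average
            return recent.sumOf { (it - mean) * (it - mean) } / recent.size
        }
}

// Typical ε-greedy schedule: decay exploration toward a floor after each episode.
fun decayEpsilon(epsilon: Double, decay: Double = 0.995, floor: Double = 0.05): Double =
    maxOf(floor, epsilon * decay)
```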
- Environment module: physics simulation, collision, and reward calculation
- Agent module: RL algorithms implementing the decision and learning steps
- SimulationManager: orchestrates interaction loops and episode handling
- UI (Jetpack Compose): physics visualization, metrics, and controls
- DataVisualization: real-time charts, heatmaps, and logs
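A rough sketch of how these modules could fit together (interface and class names are assumptions, not the project's final API):

```kotlin
// Agent module: any RL strategy exposed behind one interface so the UI's
// strategy selector can swap implementations at runtime.
interface RlStrategy {
    fun selectAction(state: Int): Int
    fun learn(state: Int, action: Int, reward: Double, nextState: Int?)
}

// Environment module: returns reward, next state (null = episode over), and success flag.
interface BallCupEnvironment {
    fun reset(): Int
    fun step(action: Int): Triple<Double, Int?, Boolean>
}

// SimulationManager: owns the interaction loop and reports the episode return to the UI.
class SimulationManager(
    private val env: BallCupEnvironment,
    var strategy: RlStrategy
) {
    fun runEpisode(): Double {
        var state: Int? = env.reset()
        var episodeReturn = 0.0
        while (state != null) {
            val action = strategy.selectAction(state)
            val (reward, next, _) = env.step(action)
            strategy.learn(state, action, reward, next)
            episodeReturn += reward
            state = next
        }
        return episodeReturn
    }
}
```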
- Build the physics environment with gravity, bouncing ball, and cup target
- Implement random agent baseline for testing environment
- Implement Q-Learning and SARSA with discrete state/action representation
- Add throw visualization, heatmaps, and reward graphs
- Add modular RL strategy selector and UI enhancements
- Optimize performance and polish UI/UX
- Write documentation and prepare portfolio demo
- Implement environment physics and playground UI in Compose
- Code baseline Random and Q-Learning agents
- Add UI for monitoring RL metrics in real-time
- Explore extensions with obstacles or more complex tasks
*This project is a continuous journey into Reinforcement Learning, blending practical Android development with AI concepts.*