Use Case 04: Novel Puzzle Game

A production-ready web-based puzzle game showcasing creativity, game design expertise, and comprehensive testing practices.

The Challenge

Design and implement a completely novel puzzle game from scratch. The game must be easy to understand but hard to master, with procedurally generated levels for infinite replayability. Most importantly, the game concept must be original, not a clone or variation of an existing popular puzzle game.

The final deliverable should be production-ready: polished, thoroughly tested, cross-platform compatible, and deployable immediately.

See _base/prompt.md for the complete specification.

What This Tests

Creative Problem Solving & Originality

The requirement for a "novel" puzzle game tests:

Ability to ideate and innovate beyond pattern-matching existing solutions
Understanding of what makes games engaging and addictive
Capacity to design simple rules that create complex, emergent gameplay
Creative thinking unconstrained by example implementations

This is fundamentally different from implementing a spec: the agent must invent the spec itself.

Game Design Knowledge

Creating an addictive, well-balanced puzzle game requires:

Understanding of game design principles (learning curves, difficulty progression, reward loops)
Knowledge of what makes puzzle games compelling
Ability to balance simplicity with depth
Recognition that good games are easy to learn but hard to master
Design of fair, satisfying losing conditions
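
As one illustration of a difficulty progression, the sketch below maps a level number to generation parameters. Everything in it is an assumption made for the example: the `difficultyFor` name, the `boardSize` and `scrambleMoves` parameters, and the constants are hypothetical and not part of the specification. The square-root term makes early levels ramp up quickly and later levels flatten out, which is one possible way to shape a learning curve.

```typescript
// Hypothetical generation parameters for a grid-based puzzle.
interface Difficulty {
  boardSize: number;     // cells per side
  scrambleMoves: number; // how many random moves the generator applies
}

// Illustrative constants; a real game would tune these through playtesting.
const MIN_SIZE = 3;
const MAX_SIZE = 7;
const LEVELS_PER_SIZE_STEP = 5;

function difficultyFor(level: number): Difficulty {
  // Treat anything below 1 as level 1.
  const n = Math.max(1, Math.floor(level));
  // The board grows by one cell per side every few levels, up to a cap.
  const boardSize = Math.min(
    MAX_SIZE,
    MIN_SIZE + Math.floor((n - 1) / LEVELS_PER_SIZE_STEP),
  );
  // Scramble depth grows with the square root of the level number.
  const scrambleMoves = 2 + Math.floor(1.5 * Math.sqrt(n - 1));
  return { boardSize, scrambleMoves };
}
```

Keeping the curve in one pure function makes it easy to unit test and to retune without touching the generator.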

Web-Based Research Capabilities

The agent must actively search the web for:

Game design best practices and principles
Procedural generation techniques
What makes puzzle games addictive
Accessibility standards for web games
Current trends and gaps in the puzzle game space

This tests whether the agent can effectively use web tools to gather information and inform decisions.

Procedural Generation & Algorithms

Generating infinite, varied, playable levels tests:

Algorithmic thinking and implementation
Understanding of procedural generation techniques
Ability to ensure generated levels are always solvable/fair
Balancing randomness with structure
Performance optimization for generation algorithms
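
One way to guarantee solvability while balancing randomness with structure is to start from the solved state and apply seeded random moves, so every level comes with a known solution. The sketch below is a minimal illustration under stated assumptions: the toggle rule is a stand-in borrowed from Lights Out only because it is small (it is not a suggestion for the actual game, which must be original), and `generateLevel`, `press`, `isSolved`, and the `Level` shape are hypothetical names. The random source is a mulberry32-style generator, so the same seed reproduces the same level.

```typescript
// Small deterministic PRNG (mulberry32-style): same seed, same sequence.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface Level {
  size: number;       // cells per side
  cells: boolean[];   // row-major board, true = lit
  solution: number[]; // cell indices that clear the board when pressed
}

const OFFSETS: ReadonlyArray<readonly [number, number]> = [
  [0, 0], [1, 0], [-1, 0], [0, 1], [0, -1],
];

// Stand-in rule: pressing a cell toggles it and its orthogonal neighbours.
function press(cells: boolean[], size: number, index: number): boolean[] {
  const next = cells.slice();
  const row = Math.floor(index / size);
  const col = index % size;
  for (const [dr, dc] of OFFSETS) {
    const r = row + dr;
    const c = col + dc;
    if (r >= 0 && r < size && c >= 0 && c < size) {
      const i = r * size + c;
      next[i] = !next[i];
    }
  }
  return next;
}

function isSolved(cells: boolean[]): boolean {
  return cells.every((lit) => !lit);
}

// Scramble backwards from the solved board. Each press is its own inverse,
// so replaying the recorded presses always clears the board again.
function generateLevel(seed: number, size: number, moves: number): Level {
  const rand = mulberry32(seed);
  let cells: boolean[] = new Array<boolean>(size * size).fill(false);
  const solution: number[] = [];
  for (let k = 0; k < moves; k++) {
    const index = Math.floor(rand() * size * size);
    cells = press(cells, size, index);
    solution.push(index);
  }
  return { size, cells, solution };
}
```

Random presses can cancel each other out, so a fuller generator would probably reject boards that come out already solved or too easy and retry with the next seed.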

Cross-Platform Web Development

Building for all platforms (mobile, tablet, desktop) requires:

Responsive design implementation
Touch and mouse/keyboard input handling
Performance optimization for mobile browsers
Adaptive layouts and controls
Testing across different devices and screen sizes
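
For touch and mouse input, one common approach is to listen for Pointer Events, which report mouse, touch, and pen through the same `clientX`/`clientY` fields, and to convert those coordinates into board cells with a pure function. The sketch below shows only that conversion; `pointToCell`, `Rect`, and `Cell` are hypothetical names, and a square grid of `size` cells per side is an assumption. Because the mapping works on the board's current on-screen rectangle, it behaves the same at any screen size.

```typescript
// Subset of the fields returned by element.getBoundingClientRect().
interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

interface Cell {
  row: number;
  col: number;
}

// Map a viewport coordinate to a grid cell, or null if it misses the board.
function pointToCell(
  clientX: number,
  clientY: number,
  board: Rect,
  size: number,
): Cell | null {
  const x = clientX - board.left;
  const y = clientY - board.top;
  if (x < 0 || y < 0 || x >= board.width || y >= board.height) {
    return null;
  }
  return {
    row: Math.floor((y / board.height) * size),
    col: Math.floor((x / board.width) * size),
  };
}

// Possible browser wiring (not executed here; boardElement is hypothetical):
//   boardElement.addEventListener("pointerdown", (e) => {
//     const cell = pointToCell(
//       e.clientX, e.clientY, boardElement.getBoundingClientRect(), 5);
//     if (cell) { /* apply the move */ }
//   });
```

Keeping the conversion free of DOM access means it can be unit tested without a browser, leaving only the event wiring for browser automation.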

Comprehensive Testing & QA

The requirement for production-ready quality tests:

Test-driven or test-supported development
Unit testing of game logic
Integration testing of game flow
Browser automation with Playwright MCP
Manual testing methodology
Bug identification and resolution
Quality assurance mindset

UI/UX Polish & Professional Presentation

Creating a "professional, minimalistic, highly polished" game tests:

Visual design sensibility
Animation and transition implementation
User feedback mechanisms (visual, haptic)
Color theory and visual hierarchy
Attention to detail in presentation
Accessibility considerations

Production-Ready Code Quality

Delivering deployment-ready software requires:

Clean, maintainable code architecture
Proper error handling
Performance optimization
Cross-browser compatibility
Build and deployment setup
Documentation quality

Why This Use Case?

This task was chosen because it requires creativity and research rather than just implementation skills:

No specification provided: Agent must research and design the game
Novel requirement: Agent can't rely on existing examples
Multi-disciplinary: Combines game design, algorithms, UI/UX, and testing
Quality focus: Must be production-ready, not a prototype
Research-driven: Requires active web research for best practices

An agent that excels here demonstrates:

Meta-creativity: Ability to understand what makes something "good" and create accordingly
Research skills: Effective use of web tools to inform decisions
Domain expertise: Game design knowledge (or ability to acquire it)
Quality consciousness: Understanding of what "production-ready" means
Holistic thinking: Balancing gameplay, aesthetics, performance, and accessibility
Polish and craft: Attention to details that make experiences delightful

This is fundamentally different from other use cases:

vs. Research Scraper: Pure creativity instead of complex architecture
vs. Text Playground: Game design instead of education
vs. Image Generation: Originality instead of ML knowledge

Evaluation Focus

When reviewing implementations, pay attention to:

Game Design

Is the core mechanic truly novel and not derivative?
Is it immediately understandable?
Does it create a compelling "one more game" feeling?
Is there strategic depth despite simple rules?
Is the losing condition fair and well-designed?
Does difficulty scale appropriately?

Procedural Generation

Are generated levels varied and interesting?
Is the generation algorithm elegant?
Are all generated levels playable/fair?
Is generation performant?
Does the algorithm prevent repetitive patterns?
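
A reviewer can check the "playable/fair" question independently of the generator by running a breadth-first search over game states. The sketch below is a generic, assumption-laden version: `minMovesToSolve` and its callback parameters are hypothetical, and it only suits puzzles whose state space is small enough to enumerate within the `maxStates` budget. It returns the minimum number of moves, which can also serve as a rough difficulty measure.

```typescript
// Breadth-first search for the shortest solution of a puzzle.
// Returns the minimum number of moves, or null if no solution was found
// (either the level is unsolvable or the state budget ran out).
function minMovesToSolve<S>(
  start: S,
  isGoal: (state: S) => boolean,
  neighbors: (state: S) => S[], // states reachable in one move
  key: (state: S) => string,    // unique string per distinct state
  maxStates: number = 100000,
): number | null {
  const seen = new Set<string>([key(start)]);
  let frontier: S[] = [start];
  let depth = 0;
  while (frontier.length > 0) {
    const next: S[] = [];
    for (const state of frontier) {
      if (isGoal(state)) {
        return depth;
      }
      for (const candidate of neighbors(state)) {
        const k = key(candidate);
        if (!seen.has(k)) {
          if (seen.size >= maxStates) {
            return null; // budget exhausted, result unknown
          }
          seen.add(k);
          next.push(candidate);
        }
      }
    }
    frontier = next;
    depth++;
  }
  return null; // every reachable state explored without reaching the goal
}
```

Running a check like this over a batch of seeds in an automated test gives direct evidence that generated levels are solvable, rather than relying on the generator's own bookkeeping.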

Research & Justification

Did the agent research game design principles?
Are design decisions explained and justified?
Does the game reflect best practices from research?
Is there evidence of studying what makes games engaging?

Technical Implementation

Is the code clean and well-structured?
Does it work across all major browsers?
Is it responsive on mobile devices?
Are controls intuitive on both touch and mouse/keyboard?
Is performance smooth on all devices?

Testing

Is there comprehensive test coverage?
Do tests actually catch issues?
Is Playwright MCP used effectively for browser testing?
Are edge cases handled?
Is the testing approach systematic?

Polish & Presentation

Does it look professional and polished?
Are animations smooth and satisfying?
Is visual feedback clear and helpful?
Is the UI minimalistic yet engaging?
Are there delightful micro-interactions?
Is accessibility considered?

Production Readiness

Could this be deployed immediately?
Is error handling robust?
Is the build process clean?
Is documentation complete?
Would users trust this as a real product?

Technical Constraints

Modern web technologies (agent's choice of framework/stack)
Cross-browser compatibility (Chrome, Firefox, Safari, Edge)
Responsive design (mobile, tablet, desktop)
Playwright MCP tools for browser automation testing
Client-side gameplay (no server required for core game)
Sound-optional (must be playable without audio)

Starting Point

The _base folder contains:

Complete task specification in prompt.md
No boilerplate code (agent has full creative freedom)

This is intentionally minimal to give the agent maximum flexibility to research, design, and implement a truly novel puzzle game.

Copy _base to your test folder and let your agent work from the prompt.md specification.
