Enhance experiment runner with deterministic controls #148

Closed
buzypi wants to merge 6 commits into karpathy:master from buzypi:master

buzypi commented Mar 10, 2026

Summary

This PR adds a fork-specific execution workflow for autonomous experiments, replacing session-by-session program.md interpretation with a deterministic runner plus agent runbook.

What’s included

  • Add workflows/run_experiment.py as the single experiment orchestrator:
    • start, resume, status commands
    • top-level stage controls: setup, baseline, loop
    • loop sub-stage controls: propose, apply, commit, train, triage, record, decide
    • resumable checkpointing under workflows/runs/<run_id>/
    • run-id policy: <branch-slug>-rNNN
  • Add an AGENTS.md runbook with an explicit natural-language-to-command mapping for agent sessions.
  • Setup robustness:
    • auto-run uv run prepare.py when cache/tokenizer is missing (default on, opt-out via --no-auto-prepare)
    • explicit setup precondition checks before baseline/loop
  • Background training support:
    • training stages start in background by default (--background-train)
    • resume polls/continues in-flight baseline/train jobs
  • Update README:
    • OpenCode quickstart instructions
    • fork-specific explanation of why this workflow layer (run_experiment.py + AGENTS.md) is better than a program.md-only execution style
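
To make the run-id policy and the resumable loop stages above concrete, here is a minimal sketch. It is not the code in workflows/run_experiment.py: the helper names (`slugify`, `next_run_id`, `next_stage`), the slug rules, and the idea of tracking completed stages as a list are all assumptions for illustration. Only the `<branch-slug>-rNNN` pattern and the stage names come from the description.

```python
import re
from pathlib import Path
from typing import List, Optional

# Loop sub-stages in the order listed in this PR description.
LOOP_STAGES = ["propose", "apply", "commit", "train", "triage", "record", "decide"]


def slugify(branch: str) -> str:
    """Hypothetical slug rule: lowercase, non-alphanumerics collapsed to '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    return slug or "run"


def next_run_id(runs_dir: Path, branch: str) -> str:
    """Return '<branch-slug>-rNNN', one past the highest existing run number."""
    slug = slugify(branch)
    pattern = re.compile(rf"^{re.escape(slug)}-r(\d{{3,}})$")
    numbers = []
    if runs_dir.is_dir():
        for entry in runs_dir.iterdir():
            match = pattern.match(entry.name)
            if match:
                numbers.append(int(match.group(1)))
    return f"{slug}-r{max(numbers, default=0) + 1:03d}"


def next_stage(completed: List[str]) -> Optional[str]:
    """Return the first loop stage not yet completed, or None when all are done."""
    for stage in LOOP_STAGES:
        if stage not in completed:
            return stage
    return None
```

Under these assumptions, a resume command would read the list of completed stages from the checkpoint under workflows/runs/<run_id>/ and continue from `next_stage(...)` rather than restarting the iteration.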

Why

In long-running autonomous sessions, prose-only execution is fragile and inconsistent.
This PR makes runs repeatable, resumable, and inspectable while preserving program.md as the policy/objective layer.

svlandeg (Collaborator) left a comment

Did you mean to submit this to your fork instead? (The README, for example, says "In this fork (buzypi/autoresearch)".)

svlandeg closed this Mar 11, 2026

buzypi (Author) commented Mar 11, 2026

No, I wanted you to review it; if it looks feasible, I will open a separate pull request containing only the workflow runner.

I have created a separate PR with only the code to merge: #193
