
feat(autoresearch): add checkpoint, config, and queue management system #153

Open
BenBreaksIn wants to merge 2 commits into karpathy:master from BenBreaksIn:2.0


@BenBreaksIn

  • Add checkpoint.py for model save/load with resume capability
  • Add config.py for centralized configuration with environment variable overrides
  • Add gpu_queue.py for file-based GPU job queue and time-sharing across agents
  • Add knowledge.py for agent knowledge base and context management
  • Add launch.py for experiment orchestration and agent coordination
  • Add run_experiment.py for end-to-end experiment execution
  • Add programs/ directory with agent role definitions (analyst, director, explorer, optimizer, reviewer)
  • Update .gitignore to exclude checkpoints/ and *.log files
  • Update analysis.ipynb to support expanded results.tsv schema with additional metrics
  • Update prepare.py and train.py for integration with new checkpoint and config systems
  • Update program.md with comprehensive system documentation
  • Enable multi-agent autoresearch with GPU sharing, checkpointing, and structured experimentation
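
For reviewers who want a concrete picture of "centralized configuration with environment variable overrides", here is a minimal sketch of the general pattern. It is not taken from config.py in this PR: the `Config` fields, the `load_config` helper, and the `AUTORESEARCH_` prefix are all hypothetical names chosen for illustration.

```python
import os
from dataclasses import dataclass, fields


@dataclass
class Config:
    # Hypothetical fields; the real config.py may define different ones.
    time_budget_s: int = 300
    learning_rate: float = 3e-4
    checkpoint_dir: str = "checkpoints"
    resume: bool = False


def _coerce(raw, target_type):
    """Convert an environment string to the type of the field's default."""
    if target_type is bool:
        return raw.strip().lower() in {"1", "true", "yes", "on"}
    return target_type(raw)


def load_config(env=None, prefix="AUTORESEARCH_"):
    """Start from the defaults, then apply any PREFIX + FIELD_NAME overrides.

    The prefix is an assumption made for this sketch.
    """
    env = os.environ if env is None else env
    cfg = Config()
    for f in fields(cfg):
        raw = env.get(prefix + f.name.upper())
        if raw is not None:
            default_type = type(getattr(cfg, f.name))
            setattr(cfg, f.name, _coerce(raw, default_type))
    return cfg


if __name__ == "__main__":
    # Example: AUTORESEARCH_LEARNING_RATE=0.001 python config_sketch.py
    print(load_config())
```

Passing a plain dict as `env` keeps the helper easy to test without touching the real process environment.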

- Document variable time scales (probe, quick, standard, long, deep) for experiment budgets
- Describe research memory system with experiments.jsonl, lessons.jsonl, and journal.md
- Outline model checkpoint save/load functionality for scaling ladder
- Define five specialized agent roles (Explorer, Optimizer, Analyst, Reviewer, Director)
- Explain GPU queue priority system for multi-agent GPU sharing
- Provide file structure overview for new infrastructure and agent programs
- Include quick start examples for solo, duo, full org, and custom agent configurations
- Document standalone tools for briefing, manual experiments, lessons, and knowledge base sync
- List requirements (GPU, Python 3.10+, uv, Claude Code CLI)
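
To make the file-based GPU queue and its priority system easier to discuss, the following is a rough sketch of how such a queue could work using only the filesystem. It does not reproduce gpu_queue.py: the `submit` and `claim_next` functions, the `pending/` and `running/` directories, the file naming scheme, and the convention that lower numbers mean higher priority are all assumptions made for this example.

```python
import json
import os
import time
import uuid
from pathlib import Path


def submit(queue_dir, agent, command, priority=5):
    """Publish a job file. Assumption: lower priority numbers are served first."""
    pending = Path(queue_dir) / "pending"
    pending.mkdir(parents=True, exist_ok=True)
    # Zero-padded priority, then a timestamp, so that a plain filename sort
    # orders jobs by priority and then by submission time.
    name = f"{priority:02d}-{time.time_ns()}-{uuid.uuid4().hex[:8]}.json"
    tmp = pending / (name + ".tmp")
    tmp.write_text(json.dumps({"agent": agent, "command": command, "priority": priority}))
    os.replace(tmp, pending / name)  # atomic publish: readers never see a partial file
    return name


def claim_next(queue_dir):
    """Move the best pending job into running/ and return it, or None if empty."""
    pending = Path(queue_dir) / "pending"
    running = Path(queue_dir) / "running"
    running.mkdir(parents=True, exist_ok=True)
    if not pending.is_dir():
        return None
    for path in sorted(pending.glob("*.json")):
        target = running / path.name
        try:
            os.rename(path, target)  # only one worker can win this rename
        except FileNotFoundError:
            continue  # another worker claimed the job first; try the next one
        return json.loads(target.read_text())
    return None
```

The rename step is what lets several agents poll the same directory without a lock server: whichever process renames the file first owns the job, and the others move on to the next candidate.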