Add optional pre-verification to skip doomed experiments #90
Open
gtsbahamas wants to merge 1 commit into karpathy:master
Add an optional Assay verification step before training. Takes ~60-90s to check for shape mismatches, missing imports, and API violations. If the code looks broken, skip training instead of wasting 5 min GPU time.
Tested with 12 LLM-generated architectural modifications on A10G:
- 7/7 runs where verification executed: correctly predicted all crashes
- Modifications tested: SiLU swap, dropout, MoE, sliding window attention, RMSNorm, doubled MLP width, ALiBi, architecture reshape
- Average verification time: ~75 seconds
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
What
Adds an optional pre-verification step to program.md. Before spending 5 minutes on a training run, the agent can run a ~75-second check that extracts implicit claims from the code (shape expectations, import requirements, API contracts) and flags changes likely to crash.
Why
Autonomous overnight runs produce ~100 experiments. Some crash immediately (OOM, shape mismatch, missing import). Each crash still burns 5 minutes of GPU time, plus agent context window spent diagnosing the stack trace.
Pre-verification catches these before training starts.
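As a rough sanity check on the trade-off, the figures above (~75 s per verification, 5 min per run, ~100 runs) can be turned into a break-even calculation. This is only a sketch under stated assumptions: verification runs on every experiment, it catches every crash, a crash costs the full 5 minutes, and verification seconds are counted at the same rate as GPU seconds. The function names are illustrative and not part of this PR, and the context-window savings are not modelled.

```python
def breakeven_crash_rate(verify_s: float, train_s: float) -> float:
    """Crash fraction above which verifying every experiment saves time."""
    return verify_s / train_s


def net_minutes_saved(n_runs: int, crash_rate: float,
                      verify_s: float = 75.0, train_s: float = 300.0) -> float:
    """Minutes saved (negative means lost) by verifying all runs.

    Assumes perfect crash detection and that a crash burns the full run time.
    """
    saved = n_runs * crash_rate * train_s  # training time not spent on doomed runs
    spent = n_runs * verify_s              # verification cost paid on every run
    return (saved - spent) / 60.0


print(breakeven_crash_rate(75.0, 300.0))  # 0.25
print(net_minutes_saved(100, 0.5))        # 125.0
```

Under these assumptions the step pays for itself in wall-clock terms once more than a quarter of experiments would crash; below that rate the benefit would have to come from the saved agent context rather than saved time.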
Evidence
Ran 12 LLM-generated architectural modifications on an A10G GPU. For each, Claude generated a complete modified train.py, verification ran, then training ran regardless to compare predictions vs reality.
In 7/7 runs where verification executed, it correctly predicted the crash. The other 5 runs errored on code generation (Claude API failures), not verification.
Limitations (being honest)
ANTHROPIC_API_KEY, which won't be available in all environments. The step is marked optional for this reason.
The change
One paragraph added to program.md under "Crashes", marked as optional. No new files, no changes to train.py or prepare.py.
Full experiment data: results TSV
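To illustrate the "check first, train only if it passes" flow, here is a minimal stand-in written with the Python standard library. It is not the Assay tool and does not use its interface: it only catches syntax errors and top-level imports that cannot be resolved, and says nothing about shape mismatches, API violations, or OOM. The names `precheck` and `should_train` are hypothetical.

```python
import ast
import importlib.util


def precheck(source: str) -> list[str]:
    """Return problems found without executing the code (empty list = looks fine)."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg} (line {exc.lineno})"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]  # absolute "from x import y" only
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]  # check the top-level package only
            try:
                found = importlib.util.find_spec(top_level) is not None
            except (ImportError, ValueError):
                found = False
            if not found:
                problems.append(f"missing import: {name}")
    return problems


def should_train(source: str) -> bool:
    """Gate: start the training run only when the pre-check reports nothing."""
    return not precheck(source)
```

An agent loop could read the candidate train.py as text, call `should_train` on it, and log the `precheck` messages instead of launching the run when it returns `False`.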