Introduce Component-System for component optimization, add a dashboard to monitor activities #142
Open
LaurenceLong wants to merge 20 commits into karpathy:master
svlandeg (Collaborator) reviewed on Mar 10, 2026 and left a comment:
The dashboard looks awesome, but this PR, with its 43 files changed, is probably not very compatible with this repo's readme statement:
"The repo is deliberately kept small and only really has three files that matter"
😉
353d276 to 5a9c76f
Author
I agree. It doesn’t really fit the repo, so I won’t merge it, but I’d like to keep it around for anyone who wants to use it.
…d to monitor activities
Improve NaN loss handling and add beginner's guide resource
- Add max_grad_norm parameter (default 1.0) to TrainingSettings
- Import torch.nn.utils for clip_grad_norm_
- Apply gradient clipping before optimizer.step()
- This should improve training stability and potentially achieve better convergence

Target component: trainer
Expected benefit: Better training stability leading to improved val_bpb
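The clipping step described in this commit can be illustrated without any framework. The sketch below is not the PR's code: it is a plain-Python approximation of what global-norm clipping (the operation `clip_grad_norm_` performs) does arithmetically, and `clip_by_global_norm`, `grads`, and `eps` are hypothetical names chosen for the example.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Scale every gradient in place so the combined L2 norm is at most max_norm.

    `grads` is a list of per-parameter gradient lists (a stand-in for tensors).
    Returns the norm measured before clipping.
    """
    total_norm = math.sqrt(sum(g * g for group in grads for g in group))
    scale = max_norm / (total_norm + eps)
    if scale < 1.0:  # only shrink; small gradients are left untouched
        for group in grads:
            for i in range(len(group)):
                group[i] *= scale
    return total_norm


# Two hypothetical parameter groups whose combined norm is 5.
grads = [[3.0, 0.0], [0.0, 4.0]]
pre_clip_norm = clip_by_global_norm(grads, max_norm=1.0)
post_clip_norm = math.sqrt(sum(g * g for group in grads for g in group))
```

All gradients share one scale factor, so the update direction is preserved and only its length shrinks. In a training loop this would run after the backward pass and before `optimizer.step()`, as the commit message describes.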
Training:
- Make gradient clipping optional (max_grad_norm: float | None = None)
- Disabled by default to test training stability without clipping
- Can be enabled by setting max_grad_norm to a positive value

Workflow:
- Add _ralph_try_restore_worktree helper for consistent worktree reset logic
- Fix baseline waiting: allow planning on non-baseline branches without baseline
- Restore worktree on run failures and before P queueing when Ralph loop enabled
- Minor code formatting improvements in workflow.py
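One way to read the "optional, disabled by default" behaviour is sketched below. Only the names `TrainingSettings` and `max_grad_norm` come from the commit message; the single-field dataclass, the `maybe_clip` helper, and the flat gradient list are assumptions made for illustration, and the real settings class presumably has more fields. `Optional[float]` is used here as the equivalent of `float | None`.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrainingSettings:
    # Hypothetical reduction of the real class to the one field discussed here.
    max_grad_norm: Optional[float] = None  # None disables clipping


def maybe_clip(grads, settings):
    """Return gradients clipped to max_grad_norm, or unchanged when clipping is off."""
    limit = settings.max_grad_norm
    if limit is None or limit <= 0:
        return list(grads)
    norm = math.sqrt(sum(g * g for g in grads))
    scale = min(1.0, limit / (norm + 1e-6))
    return [g * scale for g in grads]


unclipped = maybe_clip([3.0, 4.0], TrainingSettings())
clipped = maybe_clip([3.0, 4.0], TrainingSettings(max_grad_norm=1.0))
clipped_norm = math.sqrt(sum(g * g for g in clipped))
```

Using `None` as the default keeps the baseline run unchanged, so any difference in val_bpb after enabling clipping can be attributed to the clipping itself.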
SwiGLU (SiLU Gated Linear Unit) is a proven improvement over ReLU² from the 'GLU Variants Improve Transformer' paper. It uses a gated mechanism with SiLU activation for better model capacity.

Changes:
- Replace c_fc/c_proj with gate_proj/up_proj/down_proj
- Use F.silu(up) * gate instead of ReLU²
- Adjust hidden_dim to maintain a comparable parameter count
- Update weight initialization for new layers
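The hidden_dim adjustment can be worked through as arithmetic. The sketch below assumes the original MLP used a 4x expansion (c_fc: n_embd to 4·n_embd, c_proj back), which the commit message does not state; under that assumption, three matrices of width 8/3·n_embd hold the same number of weights as the two original ones. The function names, the `multiple_of` rounding, and `n_embd = 768` are hypothetical.

```python
import math


def silu(x):
    """SiLU (also called Swish): x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def swiglu_unit(up, gate):
    """Element-wise gating as written in the commit: silu(up) * gate."""
    return silu(up) * gate


def swiglu_hidden_dim(n_embd, expansion=4, multiple_of=64):
    """Hidden width that gives a three-matrix gated MLP roughly the weight
    count of a two-matrix MLP with the given expansion factor (assumed 4x)."""
    target = 2 * n_embd * expansion * n_embd  # weights in c_fc + c_proj
    hidden = target / (3 * n_embd)            # gate_proj + up_proj + down_proj
    return multiple_of * round(hidden / multiple_of)


n_embd = 768  # illustrative width, not taken from the PR
hidden = swiglu_hidden_dim(n_embd)
baseline_params = 2 * n_embd * 4 * n_embd
swiglu_params = 3 * n_embd * hidden
```

Rounding to a multiple of 64 is a common convenience for hardware-friendly shapes; with widths that are not divisible by 3 the two parameter counts will match only approximately.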
…st sync, track former_val_bpb, improve workflow merge order
…and concurrency handling
- Update PDCA documentation for clarity and consistency
- Add seed lifecycle and concurrency review documentation
- Track former_val_bpb for better signal evaluation
- Implement best_val_bpb tracking with history
- Add metrics recovery DCA for missing metrics
- Improve timeout handling (900s for DCA runs)
- Update web dashboard UI and templates
- Fix sync resolution and merge conflict handling
- Update protocol.md with simplified instructions
These checkpoint files are auto-generated by Jupyter and are already listed in .gitignore. Removing them to keep the repository clean.
- Add prompt_audit directory to .gitignore
- Enhance worktree detection in run.py with separate workflows for worktree vs root modes
- Add merge resolution logic and temp branch handling in workflow.py
- Improve DCA stage with better baseline promotion tracking
Introduce Component-System for component optimization: