Add PCCL-based DDP communication hook by keshprad · Pull Request #4 · axonn-ai/nanoGPT

keshprad · 2026-02-19T02:23:18Z

No description provided.

The warmup learning rate calculation has been modified to use (it + 1)/(warmup_iters + 1) instead of it/warmup_iters. This ensures a non-zero learning rate at iteration 0 while maintaining the same linear warmup behavior. Fixes karpathy#443
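As a minimal sketch of the change described above, the two warmup formulas can be compared side by side. The function names `old_warmup_lr` and `new_warmup_lr` are hypothetical and chosen for illustration; only the expressions `it/warmup_iters` and `(it + 1)/(warmup_iters + 1)` come from the commit description, and the numeric values in the usage lines are assumed.

```python
def old_warmup_lr(it: int, learning_rate: float, warmup_iters: int) -> float:
    """Previous linear warmup: scales by it / warmup_iters, so it is 0 at it == 0."""
    return learning_rate * it / warmup_iters


def new_warmup_lr(it: int, learning_rate: float, warmup_iters: int) -> float:
    """Modified linear warmup: scales by (it + 1) / (warmup_iters + 1).

    Still linear in `it`, but strictly positive at it == 0.
    """
    return learning_rate * (it + 1) / (warmup_iters + 1)


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the PR).
    lr, warmup = 1.0, 4
    for it in range(warmup + 1):
        print(it, old_warmup_lr(it, lr, warmup), new_warmup_lr(it, lr, warmup))
```

With the old formula the first optimizer step uses a learning rate of exactly zero; with the new one the first step already moves the weights, and the scale factor reaches 1 when `it == warmup_iters`.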


devin-ai-integration Bot and others added 8 commits December 9, 2024 07:35

Merge pull request karpathy#578 from devin-open-source/devin/17337283…

93a43d9

…37-fix-warmup-lr fix: ensure non-zero learning rate during warmup at iteration 0

init scripts for patch ddp with pccl

75b9cf9

fix slurm script

a11dc2e

updated scripts to use newer environment

fb61120

updated experiment using 1.3B model and node-local NVMe storage for frontier scaling support

4b457f3

use bucket_cap_mb as parameter

694c1fe

Merge branch 'pccl-ddp' into pccl-ddp

d56b6f4

keshprad wants to merge 8 commits into axonn-ai:pccl-ddp from keshprad:pccl-ddp
