
feat: make DEVICE_BATCH_SIZE configurable via env var#138

Open
0xmichalis wants to merge 1 commit into karpathy:master from 0xmichalis:feat/configurable-batch-size

Conversation

@0xmichalis

Allow overriding DEVICE_BATCH_SIZE with an environment variable so users with smaller GPUs can reduce it without editing source code, e.g.:

DEVICE_BATCH_SIZE=16 uv run train.py

grad_accum_steps auto-adjusts to preserve TOTAL_BATCH_SIZE.

Copilot AI review requested due to automatic review settings March 10, 2026 14:24

Copilot AI left a comment


Pull request overview

This PR makes DEVICE_BATCH_SIZE configurable via an environment variable of the same name, allowing users with smaller GPUs to reduce the per-device batch size without editing source code. Gradient accumulation steps adjust automatically to preserve TOTAL_BATCH_SIZE, since they are computed from DEVICE_BATCH_SIZE at line 496.

Changes:

  • DEVICE_BATCH_SIZE is now read from the environment using os.environ.get, falling back to 128 if unset.
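The pattern described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual train.py code: the TOTAL_BATCH_SIZE value and the exact grad_accum_steps formula are assumptions; only the env-var name, the 128 fallback, and the auto-adjust behavior come from the PR.

```python
import os

# Read the per-device batch size from the environment, defaulting to 128
# when the variable is unset (the fallback stated in the review).
DEVICE_BATCH_SIZE = int(os.environ.get("DEVICE_BATCH_SIZE", 128))

# Placeholder total batch size; the real value is defined in train.py.
TOTAL_BATCH_SIZE = 524288

# grad_accum_steps is derived from DEVICE_BATCH_SIZE, so halving the
# per-device batch size doubles the accumulation steps and the effective
# batch size stays fixed. (Simplified: ignores world size / sequence length.)
assert TOTAL_BATCH_SIZE % DEVICE_BATCH_SIZE == 0
grad_accum_steps = TOTAL_BATCH_SIZE // DEVICE_BATCH_SIZE
```

For example, running with `DEVICE_BATCH_SIZE=16` quadruples grad_accum_steps relative to the default of 128 under this sketch.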


Allow overriding DEVICE_BATCH_SIZE with an environment variable so users
with smaller GPUs can reduce it without editing source code, e.g.:

  DEVICE_BATCH_SIZE=16 uv run train.py

grad_accum_steps auto-adjusts to preserve TOTAL_BATCH_SIZE.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@0xmichalis 0xmichalis force-pushed the feat/configurable-batch-size branch from cf4fd31 to 5f1a1d6 Compare March 10, 2026 14:32

2 participants