
Conversation

@HAOCHENYE (Collaborator) commented on Sep 25, 2025:

  • Rename intra_layer_micro_batch parameter to packed_samples_per_forward across all files
  • Add domino_forward boolean flag for controlling forward mode
  • Update test scripts to use new parameter names
  • Add packing support for sequence contexts and loss contexts
  • Update CELossContext to CELossConfig in test scripts
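For context, here is a minimal sketch of how the renamed field and the new flag might sit in a training config. The dataclass wrapper, the domino_forward default, and the comments are illustrative assumptions, not the actual repository code; only the field names come from the bullets above and the diff below.

    from dataclasses import dataclass

    @dataclass
    class TrainConfig:  # hypothetical container; the real config class may differ
        profile_time: bool = True
        profile_memory: bool = False
        # Renamed from intra_layer_micro_batch: how many packed samples go
        # through a single forward pass.
        packed_samples_per_forward: int = 1
        # New flag controlling whether the domino forward mode is used
        # (default value assumed here).
        domino_forward: bool = False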

@HAOCHENYE force-pushed the yehc/rename-intralayer branch from 1690940 to 1409594 on September 25, 2025 at 19:21.
@HAOCHENYE force-pushed the yehc/rename-intralayer branch 2 times, most recently from bda4536 to ad6b6e0 on September 26, 2025 at 08:20.
  profile_time: bool = True
  profile_memory: bool = False
- intra_layer_micro_batch: int = 1
+ packed_samples_per_forward: int = 1
A collaborator commented on the new packed_samples_per_forward field:
This could be named micro_batch_size, in line with the common industry convention, i.e. global_batch_size = micro_batch_size x micro_batch_num (grad_accumulate_num) x dp_size.
The existing micro_batch_size in the current code could then be renamed to batch_size_per_gpu (or batch_size_per_rank), so that batch_size_per_gpu = micro_batch_size x micro_batch_num.
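As a minimal sketch of the relationship the reviewer describes, with purely illustrative values (the variable names follow the comment above, not the repository code):

    # Illustrative values only; names follow the reviewer's proposed convention.
    micro_batch_size = 4   # samples processed in one forward pass on one rank
    micro_batch_num = 8    # gradient-accumulation steps (grad_accumulate_num)
    dp_size = 16           # data-parallel world size

    # Samples handled by one rank per optimizer step.
    batch_size_per_gpu = micro_batch_size * micro_batch_num            # 32
    # Effective batch across all data-parallel ranks per optimizer step.
    global_batch_size = micro_batch_size * micro_batch_num * dp_size   # 512

    assert global_batch_size == batch_size_per_gpu * dp_size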
