
feat: add DualPipe bidirectional pipeline schedule #1157

Open
lishuangyuly wants to merge 1 commit into flagos-ai:main from lishuangyuly:dualpipe

Conversation

@lishuangyuly

FlagScale lacked support for the DualPipe bidirectional pipeline schedule introduced in DeepSeek-V3, which overlaps forward/backward computation across both pipeline directions to reduce the bubble ratio compared with standard 1F1B.

New file: dualpipe_schedule.py

Implements the 8-step DualPipe algorithm as a drop-in replacement for Megatron's get_forward_backward_func() output:

  • WeightGradStore – defers weight-gradient computation to dedicated "W" steps, enabling zero-bubble scheduling
  • _split_data_iterator – pre-buffers and splits a single data iterator into two halves (one per pipeline direction) without upstream data-pipeline changes
  • forward_backward_dualpipe() – the schedule itself; same keyword interface as Megatron's built-in schedules

Model building (training.py)

When --use-dualpipe is active, each rank is allocated two model chunks:

  • chunk[0] at pipeline position pp_rank (forward direction, rank 0 → N-1)
  • chunk[1] at pipeline position N-1-pp_rank (mirror direction, rank N-1 → 0)
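The two chunk positions follow directly from the bullets above; a hypothetical helper (not part of the PR) that computes them:

```python
def dualpipe_chunk_positions(pp_rank, pp_size):
    """Return (forward, mirror) pipeline positions of a rank's two chunks.

    chunk[0] sits at pp_rank (forward direction, rank 0 -> N-1);
    chunk[1] sits at N-1-pp_rank (mirror direction, rank N-1 -> 0).
    """
    return pp_rank, pp_size - 1 - pp_rank
```

With pp_size = 8, rank 0 holds positions (0, 7) and rank 3 holds (3, 4), so every rank owns one stage from each end of the pipeline.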

A new _fs_get_forward_backward_func() helper selects the DualPipe schedule or falls back to Megatron's standard selector — no change in behaviour when the flag is absent.

Configuration flag (arguments_fs.py)

--use-dualpipe with validation:

  • pipeline_model_parallel_size must be even and > 1
  • Incompatible with VPP (--num-layers-per-virtual-pipeline-stage) and --use-dualpipev
  • Requires --untie-embeddings-and-output-weights
  • num_microbatches must be even and ≥ pipeline_model_parallel_size × 2

Usage

training:
  pipeline_model_parallel_size: 8   # must be even
  use_dualpipe: true
  untie_embeddings_and_output_weights: true
  global_batch_size: 512            # num_microbatches must be even and >= pp_size*2

