dsv4-fp4-b300-sglang: update image to nightly by yhyang201 · Pull Request #1506 · SemiAnalysisAI/InferenceX

yhyang201 · 2026-05-18T18:17:28Z

Summary

Update image from deepseek-v4-b300@sha256:2fec8d... to nightly-dev-cu13-20260518-c67b2870
Refactor benchmark script to dispatch by CONC instead of nested DP_ATTENTION/CONC/EP_SIZE
Switch high-concurrency profiles (CONC 2048/4096/8192) from --moe-a2a-backend deepep to megamoe
Remove env vars deleted from sglang main or redundant with defaults
Remove --deepep-config (not needed by megamoe)
Fix CONC=512 yaml ep: 4 → ep: 1 (flashinfer_mxfp4 doesn't set ep=tp)

Note

Low Risk
Benchmark and CI config only; no production serving or auth paths are touched. The risk is mis-tuned launch flags affecting perf numbers.

Overview
Updates dsv4-fp4-b300-sglang to a newer SGLang nightly image and aligns the B300 DeepSeek-V4-Pro benchmark with megamoe for high concurrency.

The nvidia-master config drops the B200-recipe note, documents CONC-based recipes (TP-only / DP+flashinfer / DP+megamoe), and fixes the CONC=512 search point from ep: 4 to ep: 1 so result names match flashinfer_mxfp4 (no implicit ep=tp).

dsv4_fp4_b300_sglang.sh is refactored to choose launch settings by CONC only (no nested DP_ATTENTION / EP_SIZE branches). 2048 / 4096 / 8192 now use --moe-a2a-backend megamoe instead of deepep; 512 stays DP-attn + flashinfer_mxfp4. Several SGLANG_OPT_* env vars and --deepep-config are removed as obsolete or megamoe-handled. pip install --upgrade transformers is added before serving benchmarks. perf-changelog.yaml records the change.
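
The CONC-only dispatch described above can be sketched as a single `case` statement. This is a minimal illustration, not the contents of dsv4_fp4_b300_sglang.sh: the function name `select_recipe`, the recipe labels it prints, and the TP-only fallback for every other concurrency are assumptions. Only the 512 and 2048 / 4096 / 8192 mappings come from the PR description.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of dispatching launch settings by CONC alone.
# select_recipe and its output labels are illustrative names, not taken
# from the real benchmark script.
select_recipe() {
  case "$1" in
    ''|*[!0-9]*)
      # Reject empty or non-numeric concurrency values.
      echo "invalid CONC: '$1'" >&2
      return 1
      ;;
    2048|4096|8192)
      # High concurrency: DP attention with the megamoe a2a backend.
      echo "dp+megamoe"
      ;;
    512)
      # CONC=512: DP attention with flashinfer_mxfp4.
      echo "dp+flashinfer_mxfp4"
      ;;
    *)
      # Assumption: every other concurrency uses the TP-only recipe.
      echo "tp-only"
      ;;
  esac
}
```

A caller might run `recipe="$(select_recipe "$CONC")"` and then branch on the label to assemble the launch flags. A flat `case` on CONC maps each concurrency to exactly one recipe, which is the property that lets the nested DP_ATTENTION / EP_SIZE branches go away.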

Reviewed by Cursor Bugbot for commit 0ba92fd. Bugbot is set up for automated code reviews on this repo.

github-actions · 2026-05-18T18:17:47Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR there first; only then can we merge your single-node PR into the master branch. Let's make sure the documentation is first-class so that the entire ML community can benefit from your hard work. Thank you!

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get approval from their respective company's CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

github-actions · 2026-05-19T16:07:44Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26109529858
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26109529858

github-actions · 2026-05-19T16:54:56Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26109534591
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26109534591

github-actions · 2026-05-21T12:32:07Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26221509538
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26221509538

github-actions · 2026-05-22T04:28:23Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26221509538
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26221509538

github-actions · 2026-05-29T19:46:14Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26658560606
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26658560606

github-actions · 2026-05-29T19:50:20Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26658745339
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26658745339

github-actions · 2026-05-30T03:02:53Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26658745339
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26658745339

yhyang201 requested a review from a team May 18, 2026 18:17

yhyang201 requested review from jgangani and kedarpotdar-nv as code owners May 18, 2026 18:17

github-project-automation Bot added this to InferenceMAX Board May 18, 2026

yhyang201 changed the title ~~dsv4-fp4-b300-sglang: update image to nightly, switch to megamoe~~ dsv4-fp4-b300-sglang: update image to nightly May 18, 2026

yhyang201 force-pushed the yyh/update-dsv4-b300-sglang-image branch from f25519e to cf36b0c Compare May 19, 2026 15:32

yhyang201 added the full-sweep-enabled label May 19, 2026

yhyang201 force-pushed the yyh/update-dsv4-b300-sglang-image branch from d8ca8a8 to 09875d7 Compare May 21, 2026 10:52

yhyang201 added a commit that referenced this pull request May 29, 2026

Append perf-changelog entry for PR #1506

cfa7211

yhyang201 force-pushed the yyh/update-dsv4-b300-sglang-image branch from 09875d7 to cfa7211 Compare May 29, 2026 19:45

yhyang201 added 2 commits May 30, 2026 03:49

dsv4-fp4-b300-sglang: update image to nightly-dev-cu13-20260529-a8cfa…

f593147

…e0b, refactor script, switch to megamoe

Append perf-changelog entry for PR #1506

0ba92fd

yhyang201 force-pushed the yyh/update-dsv4-b300-sglang-image branch from cfa7211 to 0ba92fd Compare May 29, 2026 19:49

yhyang201 wants to merge 2 commits into main from yyh/update-dsv4-b300-sglang-image
