[AMD][MI35X] 0526 DSV4 by 1am9trash · Pull Request #1568 · SemiAnalysisAI/InferenceX

1am9trash · 2026-05-27T02:14:34Z

Bump docker image to rocm/sgl-dev:rocm720-mi35x-f96ac98-20260526-DSv4

Note

Medium Risk
Changes MoRI combine dtypes and DP+EP serving limits that affect benchmark accuracy and high-concurrency behavior, though scope is limited to AMD disagg benchmark configs and launch scripts.

Overview
Updates DeepSeek R1 FP4 disaggregated SGLang benchmarks on MI355X to a newer May 19 lmsysorg/sglang-rocm image and retunes MoRI/serving so FP8 blockwise combine and high-concurrency DP+EP sweeps behave correctly.

CI config (amd-master.yaml): Both dsr1-fp4-mi355x-sglang-disagg and -mtp move from v0.5.10.post1 to v0.5.12 (20260519). Non-MTP 8192 ISL adds a 1×DEP8 + 1×DEP8 point at conc 128/256/512. MTP 8192 replaces the old 1P2D TP4 layout with 1×DEP8 + 1×DEP8 (conc 128/512 and 64/256, MTP size 1), and tightens one existing MTP DEP8 sweep.

Runtime (env.sh / server.sh): Replaces SGLANG_MORI_FP8_COMB=true with separate prefill/decode combine dtypes (fp8_direct_cast / fp8), passes them as SGLANG_MORI_COMBINE_DTYPE on launch, and turns overlap plan stream off. When prefill or decode has both DP and EP, server.sh now drives max-running-requests and MoRI token limits from the peak x-delimited bench concurrency (including 512-class sweeps). Stale eval comments about broken FP8 combine are removed.

perf-changelog.yaml: Documents the image bump, eval fix, and new conc 512 sweep for those two config keys.

^{Reviewed by Cursor Bugbot for commit efe0c03. Bugbot is set up for automated code reviews on this repo. Configure here.}

github-actions · 2026-05-27T02:14:44Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

claude

LGTM, straightforward image bump + concurrency widening for the DSv4 FP4 MI355X SGLang benchmark recipe.

Extended reasoning...

Overview

This PR bumps the docker image for the dsv4-fp4-mi355x-sglang benchmark from rocm/sgl-dev:rocm720-mi35x-8c3b5aa-20260521-DSv4 to ...-f96ac98-20260526-DSv4, widens the DP-attention concurrency sweep upper bounds (1024→4096 for 1024/1024, 512→2048 for 8192/1024), and adds --mem-fraction-static 0.90 and --swa-full-tokens-ratio 0.1 to the SGLang launch invocation. Only two files are touched, both benchmark/CI scaffolding.

Security risks

None. This PR touches no production app code, no auth/crypto/permissions, and no data paths. The changes affect only benchmark sweeps executed in CI runners.

Level of scrutiny

Low — this is exactly the kind of mechanical recipe bump that does not require human review. The image tag change is opaque (image content is upstream), the new flags are well-known SGLang server tuning knobs to match the newer image's memory/SWA behavior, and the only operational risk (longer sweeps, more GPU time, possible OOM at the higher conc) is contained to benchmark jobs, not production.

Other factors

No bugs were flagged by the bug-hunting system. The PR description (and Cursor Bugbot's auto-summary) explains the motivation. No outstanding reviewer comments to address — only the standard recipe-reminder bot comments.

Update config

fbddaa3

1am9trash requested a review from a team May 27, 2026 02:14

1am9trash requested review from billishyahao, chunfangamd and seungrokj as code owners May 27, 2026 02:14

github-project-automation Bot added this to InferenceMAX Board May 27, 2026

1am9trash requested a review from yctseng0211 as a code owner May 27, 2026 02:14

This comment was marked as duplicate.

Sign in to view

claude Bot reviewed May 27, 2026

View reviewed changes

1am9trash added AMD full-sweep-enabled labels May 27, 2026

1am9trash and others added 2 commits May 27, 2026 10:24

Update change log

f3870db

Merge branch 'main' into dsv4-mi355-sgl-0526

a2c9b94

This comment was marked as outdated.

Sign in to view

1am9trash added 3 commits May 27, 2026 10:29

Update change log

11daefb

Update config

44f19c0

Update config

a618b9e

This comment was marked as outdated.

Sign in to view

Merge remote-tracking branch 'origin/main' into dsv4-mi355-sgl-0526

efe0c03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AMD][MI35X] 0526 DSV4#1568

[AMD][MI35X] 0526 DSV4#1568
1am9trash wants to merge 7 commits into
mainfrom
dsv4-mi355-sgl-0526

1am9trash commented May 27, 2026 •

edited by cursor Bot

Loading

Uh oh!

This comment was marked as duplicate.

github-actions Bot commented May 27, 2026

Uh oh!

claude Bot left a comment

Uh oh!

This comment was marked as outdated.

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

1am9trash commented May 27, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

This comment was marked as duplicate.

github-actions Bot commented May 27, 2026

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Overview

Security risks

Level of scrutiny

Other factors

Uh oh!

This comment was marked as outdated.

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

1am9trash commented May 27, 2026 •

edited by cursor Bot

Loading