8 changes: 4 additions & 4 deletions .github/configs/nvidia-master.yaml
@@ -2208,7 +2208,7 @@ qwen3.5-fp4-b200-sglang-mtp:
- { tp: 2, ep: 1, conc-start: 4, conc-end: 64, spec-decoding: mtp }

glm5-fp8-b200-sglang:
- image: lmsysorg/sglang:v0.5.12-cu130
+ image: lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762
model: zai-org/GLM-5-FP8
model-prefix: glm5
runner: b200
@@ -2227,7 +2227,7 @@ glm5-fp8-b200-sglang:
- { tp: 8, ep: 1, conc-start: 4, conc-end: 256 }

glm5-fp8-b200-sglang-mtp:
- image: lmsysorg/sglang:v0.5.12-cu130
+ image: lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762
model: zai-org/GLM-5-FP8
model-prefix: glm5
runner: b200
@@ -2307,7 +2307,7 @@ glm5-fp8-b300-sglang-mtp:
- { tp: 8, ep: 1, conc-start: 4, conc-end: 256, spec-decoding: mtp }

glm5-fp4-b200-sglang:
- image: lmsysorg/sglang:v0.5.12-cu130
+ image: lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762
model: nvidia/GLM-5-NVFP4
model-prefix: glm5
runner: b200
@@ -2328,7 +2328,7 @@ glm5-fp4-b200-sglang:
- { tp: 4, ep: 1, conc-start: 4, conc-end: 256 }

glm5-fp4-b200-sglang-mtp:
- image: lmsysorg/sglang:v0.5.12-cu130
+ image: lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762
model: nvidia/GLM-5-NVFP4
model-prefix: glm5
runner: b200
24 changes: 24 additions & 0 deletions perf-changelog.yaml
@@ -3129,3 +3129,27 @@
description:
- "Add --use-chat-template to run_benchmark_serving so prompts are formatted with the Qwen chat template (matching the other Qwen MTP recipes)"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1555

+ - config-keys:
+ - glm5-fp4-b200-sglang
+ description:
+ - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
+ pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561
+
+ - config-keys:
+ - glm5-fp4-b200-sglang-mtp
+ description:
+ - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
+ pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561
+
+ - config-keys:
+ - glm5-fp8-b200-sglang
+ description:
+ - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
+ pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561
+
+ - config-keys:
+ - glm5-fp8-b200-sglang-mtp
+ description:
+ - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
+ pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561

Check warning on line 3155 in perf-changelog.yaml

Claude / Claude Code Review

Stale perf-changelog entries: wrong baseline version and pr-link

🟡 The four new perf-changelog entries (lines 3133-3155) carry stale info from the abandoned precursor PR #1561: (1) the description says 'from v0.5.11-cu130' but the diff at nvidia-master.yaml lines 2210/2229/2309/2330 shows the prior image was v0.5.12-cu130 (off by one patch version), and (2) the pr-link points to #1561 (the precursor) rather than this PR (#1567), which actually lands the change. Both are documentation-only nits, but they should be corrected before merge: update the baseline to v0.5.12-cu130 and the pr-link to #1567 so future readers can trace the actual delta and merge commit.

Extended reasoning...

What the bug is

This PR adds four new entries to perf-changelog.yaml (lines 3133-3155), one for each glm5 b200 sglang recipe whose image is being bumped. Each entry contains two pieces of stale information copied from the abandoned precursor PR #1561:

  1. Wrong baseline version in description. All four entries read "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762". But the actual previous image — visible in the - lines of the diff at .github/configs/nvidia-master.yaml lines 2210, 2229, 2309, 2330 — was lmsysorg/sglang:v0.5.12-cu130, not v0.5.11. The PR description itself acknowledges this: it says the bump is from lmsysorg/sglang:v0.5.12-cu130, and the Cursor-bot summary embedded in the PR body explicitly calls out that "changelog text references v0.5.11-cu130 as the prior baseline".

  2. Wrong pr-link. All four entries set pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561. But per the PR description, this PR "Mirrors #1561 (xinli-sw:glm-update) re-based on current main", so #1561 ("Update glm-5 container to use SGLang latest") is the abandoned precursor and #1567 ("Update glm-5 b200 sglang image to nightly-dev-cu13-20260523-c112f762") is the actual PR that will land. The convention elsewhere in perf-changelog.yaml is that pr-link points to the PR that actually introduces the change (e.g. the immediately preceding entries at lines 3107, 3115, 3125, 3131 link to #1555 ("[Klaud Cold] qwen3.5-fp8-mi355x-atom-mtp: enable --use-chat-template"), #1516 ("[NV] update Minimax2.5 fp8 h100 vllm"), etc., matching the merge commits in the recent git log).

Why these are both stale-from-rebase artifacts

Git log confirms commit 8e0f658 (PR #1447) already bumped these four recipes from v0.5.11 to v0.5.12 prior to this PR. So the changelog text "from v0.5.11" was accurate at the time #1561 was first authored, but became stale once #1561 was rebased onto current main (where v0.5.12 was already in place) and resubmitted as #1567. The pr-link similarly carries the original PR number, not the rebase-mirror PR number.

Step-by-step proof

  1. Open the PR diff for .github/configs/nvidia-master.yaml. At line 2210 the removed line is - image: lmsysorg/sglang:v0.5.12-cu130 (same at 2229, 2309, 2330). So the actual prior image is v0.5.12-cu130.
  2. Open the PR diff for perf-changelog.yaml. Lines 3136, 3142, 3148, 3154 all say "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762". Compare to step 1: v0.5.11 ≠ v0.5.12.
  3. Lines 3137, 3143, 3149, 3155 all set pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561. The PR being reviewed is #1567 ("Update glm-5 b200 sglang image to nightly-dev-cu13-20260523-c112f762"), which per its own description "Mirrors #1561 ... re-based on current main".
  4. Scanning the immediately preceding changelog entries (lines 3107, 3115, 3125, 3131), pr-links are 1555, 1516, etc., each matching the PR that actually introduced the change (verifiable via git log against commits d4948f9 and 298d8f9).
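
The cross-check in steps 1 to 3 could be automated roughly as follows. This is a minimal sketch, not part of the repository: the function names, the sample excerpts, and the YAML indentation are all assumptions made for illustration, and it uses plain regular expressions rather than a YAML parser.

```python
import re

# Hypothetical excerpts; indentation is assumed, since the page dump lost it.
CONFIG_DIFF = """\
-  image: lmsysorg/sglang:v0.5.12-cu130
+  image: lmsysorg/sglang:nightly-dev-cu13-20260523-c112f762
"""

CHANGELOG_ENTRY = """\
- config-keys:
    - glm5-fp4-b200-sglang
  description:
    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561
"""


def removed_image_tags(diff_text):
    """Collect the image tags that appear on removed ('-') diff lines."""
    return set(re.findall(r"^-\s*image:\s*\S+?:(\S+)$", diff_text, flags=re.M))


def find_stale_fields(changelog_text, diff_text, pr_number):
    """Report baselines that the diff never removed and pr-links to other PRs."""
    problems = []
    removed = removed_image_tags(diff_text)
    for baseline in re.findall(r"image from (\S+) to ", changelog_text):
        if baseline not in removed:
            problems.append(
                f"baseline {baseline} is not among removed tags {sorted(removed)}"
            )
    for linked in re.findall(r"pr-link:\s*\S+/pull/(\d+)", changelog_text):
        if int(linked) != pr_number:
            problems.append(f"pr-link points to #{linked}, expected #{pr_number}")
    return problems


problems = find_stale_fields(CHANGELOG_ENTRY, CONFIG_DIFF, pr_number=1567)
for problem in problems:
    print(problem)
```

On the sample entry this flags both issues described above: the v0.5.11-cu130 baseline and the #1561 link.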

Impact

Documentation-only. No runtime effect. The cost is purely traceability: anyone reading perf-changelog.yaml in the future to understand the v0.5.12→nightly delta will (a) see the wrong starting version, and (b) follow the pr-link into a closed, abandoned PR rather than the merged commit.

Fix

In the four new entries in perf-changelog.yaml, change:

  • description: v0.5.11-cu130 → v0.5.12-cu130
  • pr-link: /pull/1561 → /pull/1567
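
As an illustration of the two substitutions, here is a minimal sketch applied to one entry. The helper name `fix_entry` and the entry's indentation are assumptions, not taken from the repository. The replacement is keyed on the full nightly-bump description so that an older entry that legitimately mentions v0.5.11-cu130 would be left alone.

```python
# Hypothetical excerpt of one of the four new entries; indentation is assumed.
STALE_ENTRY = """\
- config-keys:
    - glm5-fp8-b200-sglang
  description:
    - "Update SGLang image from v0.5.11-cu130 to nightly-dev-cu13-20260523-c112f762"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1561
"""

NEW_TAG = "nightly-dev-cu13-20260523-c112f762"
STALE_DESC = f"Update SGLang image from v0.5.11-cu130 to {NEW_TAG}"
FIXED_DESC = f"Update SGLang image from v0.5.12-cu130 to {NEW_TAG}"
STALE_LINK = "https://github.com/SemiAnalysisAI/InferenceX/pull/1561"
FIXED_LINK = "https://github.com/SemiAnalysisAI/InferenceX/pull/1567"


def fix_entry(entry_text: str) -> str:
    """Apply both corrections, but only to an entry describing this nightly bump."""
    if STALE_DESC not in entry_text:
        return entry_text  # unrelated entry: leave it untouched
    return entry_text.replace(STALE_DESC, FIXED_DESC).replace(STALE_LINK, FIXED_LINK)


fixed_entry = fix_entry(STALE_ENTRY)
print(fixed_entry)
```

The printed entry shows the corrected baseline and pr-link; the same call would apply to the other three entries.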
