feat(dashboard): render per-prefill / per-decode / total measured power by arygupt · Pull Request #411 · SemiAnalysisAI/InferenceX-app

arygupt · 2026-06-01T03:09:20Z

What

Adds three new selectable Y-axis metrics to the inference chart, surfacing the per-stage measured-power telemetry the runner already emits but the dashboard could not render:

New metric	Source field (already on `AggDataEntry`)	Mirrors
Measured Prefill Power per GPU (W)	`prefill_avg_power_w`	`measuredAvgPower`
Measured Decode Power per GPU (W)	`decode_avg_power_w`	`measuredAvgPower`
Measured J per Input Token (J/tok)	`joules_per_input_token`	`measuredJPerOutputToken`

Net result in the gated "Measured Energy" dropdown: prefill / decode / total power plus J per input / output / total token, i.e. the prefill / decode / total split for disaggregated runs.

Why

The disagg per-stage power data flows all the way through the pipeline (DB metrics JSONB + unofficial-run artifact path → rowToAggDataEntry → chart point) but only three measured metrics were ever wired as selectable Y-axis options (total power, J/output, J/total). The per-stage fields existed on the row but were never wrapped into metrics. This closes that gap.

How

Purely additive / mechanical — copies the existing measuredAvgPower trio pattern across every site:

inference-chart-config.json (both interactivity + e2e charts; J/input roofline lower_right/lower_left to match J/output)
Y_AXIS_METRICS, YAxisMetricKey, ChartDefinition, InferenceData (types)
createChartDataPoint + roofline machinery (both type unions, roof-reset, markRooflinePoints)
useInterpolatedTrendData lightweight trend-point builder
ChartControls gated "Measured Energy" group (stays gated: true)
chart-utils.test.ts — 7 new cases (emit / omit-legacy / zero-preservation / per-stage independence / full-disagg row)

No runner, ETL, or DB change — source fields already exist on AggDataEntry; packages/constants already lists the keys.

Still gated

The metrics remain behind the existing ↑ ↑ ↓ ↓ feature gate, so merging does not expose measured power publicly. Ungating is a separate, deliberate product decision (follow-up).

How to preview (no DB needed)

On the Vercel preview this PR posts:

Open …/inference?unofficialrun=26607091549 (case-insensitive; fetches the GB300 run's GitHub artifacts directly — needs GITHUB_TOKEN in the preview env, which already powers the PR unofficial-run visualizer).
Press ↑ ↑ ↓ ↓ to unlock "Measured Energy".
In the Y-axis dropdown pick Measured Prefill Power per GPU, then Measured Decode Power per GPU, then Measured Average Power per GPU; filter to the dsv4/gb300 points and screenshot each. Expect per-GPU prefill watts to exceed decode watts (prefill is compute-bound, decode is memory-bound).

Verification

tsc --noEmit — clean (the Y_AXIS_METRICS/YAxisMetricKey/ChartDefinition/InferenceData sites all agree).
Full app unit suite — 1996 passed (incl. 7 new cases).
oxlint on changed files — clean.
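As an illustration of why a clean `tsc --noEmit` is meaningful evidence here, the sketch below shows one way a metric list and its key type can be forced to agree at compile time. It is not the repo's actual layout (there, `Y_AXIS_METRICS` and `YAxisMetricKey` live in separate files); the `*Sketch` names, the power key strings, and the label map are assumptions made for the example.

```typescript
// Hypothetical, shortened metric list; the real Y_AXIS_METRICS is longer.
const Y_AXIS_METRICS_SKETCH = [
  'y_measuredAvgPower', // assumed key name
  'y_measuredPrefillPower', // assumed key name
  'y_measuredDecodePower', // assumed key name
  'y_measuredJPerInputToken',
] as const;

// Deriving the union from the array means a key cannot exist in one and not the other.
type YAxisMetricKeySketch = (typeof Y_AXIS_METRICS_SKETCH)[number];

// Record<Union, ...> must define every key, so forgetting a label is a compile error.
const METRIC_LABELS_SKETCH: Record<YAxisMetricKeySketch, string> = {
  y_measuredAvgPower: 'Measured Average Power per GPU (W)',
  y_measuredPrefillPower: 'Measured Prefill Power per GPU (W)',
  y_measuredDecodePower: 'Measured Decode Power per GPU (W)',
  y_measuredJPerInputToken: 'Measured J per Input Token (J/tok)',
};

// Runtime type guard for untrusted strings (e.g. a URL parameter).
function isYAxisMetricKeySketch(value: string): value is YAxisMetricKeySketch {
  return (Y_AXIS_METRICS_SKETCH as readonly string[]).includes(value);
}
```

With this shape, adding a fourth key to the array without a matching label fails type-checking, which is the kind of disagreement the `tsc` run above rules out.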

Out of scope (follow-ups)

Combined prefill+decode+total overlay as one multi-series chart.
Per-worker workers[] visualization (data is carried to the point, not yet rendered).
Ungating measured power for the public site.

🤖 Generated with Claude Code

Note

Low Risk
Additive frontend chart and type wiring with optional-field guards; no API, auth, or data-pipeline changes.

Overview
Adds three selectable measured-energy Y-axis metrics on the inference dashboard—prefill power, decode power, and J per input token—so disaggregated runs can show per-stage telemetry that already exists on benchmark rows but was not chartable.

The change mirrors the existing total measured-power pattern: map prefill_avg_power_w, decode_avg_power_w, and joules_per_input_token into chart points in createChartDataPoint and trend interpolation, register them in chart config and Y_AXIS_METRICS, extend roofline marking, and list them under the gated Measured Energy dropdown (prefill/decode/total power plus input/output/total J). Legacy rows without per-stage fields stay omitted (not zero-filled). Unit tests cover emit, omit, zero preservation, and full disagg rows.

No runner, DB, or ETL changes; the feature gate for measured energy is unchanged.

Reviewed by Cursor Bugbot for commit 0e85ae7. Bugbot is set up for automated code reviews on this repo.

Wire the per-stage measured-power telemetry the runner already emits (prefill_avg_power_w, decode_avg_power_w, joules_per_input_token) as three new selectable Y-axis metrics, mirroring the existing measuredAvgPower trio:

- Measured Prefill Power per GPU (W)
- Measured Decode Power per GPU (W)
- Measured J per Input Token (J/tok)

Added across both chart configs (interactivity + e2e), Y_AXIS_METRICS, YAxisMetricKey, ChartDefinition, InferenceData, createChartDataPoint, the roofline machinery (type unions / roof-reset / markRooflinePoints), the lightweight trend-point builder, and the gated "Measured Energy" dropdown group (stays behind the existing feature gate).

Purely additive: source fields already exist on AggDataEntry; no runner or DB change. Closes the gap where disagg per-stage power was ingested but not renderable.

Validated against GB300 disagg data (run 26607091549).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

vercel · 2026-06-01T03:09:26Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
inferencemax-app	Ready	Preview, Comment	Jun 1, 2026 3:09am

claude · 2026-06-01T03:09:38Z

Claude finished @arygupt's task in 2m 40s.

🟢 LGTM — no blocking issues found

This is a clean, mechanical extension of the existing measured-power trio. I verified every sync point ends up consistent:

Sync points checked (all 11 in agreement):

Y_AXIS_METRICS + YAxisMetric type (chart-utils.ts:151-160)
YAxisMetricKey (types.ts:265-270)
ChartDefinition field set (types.ts:379-394)
InferenceData wrapper fields (types.ts:233-238)
createChartDataPoint source-field guards (chart-utils.ts:407-422)
calculateRoofline / computeAllRooflines union types (chart-utils.ts:581-663)
markRooflinePoints reset block + if/else chain (chart-utils.ts:706-795)
useInterpolatedTrendData lightweight builder (useInterpolatedTrendData.ts:83-91)
inference-chart-config.json (both interactivity + e2e blocks; J/input roofline = lower_right / lower_left to mirror J/output)
ChartControls.tsx Measured Energy group (gated: true preserved)
Tests: 7 new cases in chart-utils.test.ts:1297-1369 cover emit / omit-legacy / zero-preservation / per-stage independence / full-disagg row

Unofficial-run path: Source fields (prefill_avg_power_w, decode_avg_power_w, joules_per_input_token) are already mapped by benchmark-transform.ts:61-63 (covered by benchmark-transform.test.ts), and the unofficial-run provider uses the same transformBenchmarkRows. So overlay rooflines/points pick up the new metrics for free — overlay support is structurally present.

typeof === 'number' guards preserve 0 W values (explicitly asserted by the new "preserves a zero per-stage power value" test).
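A minimal sketch of the guard pattern described above, showing why a `typeof === 'number'` check keeps a genuine 0 W reading while still omitting legacy rows that lack the field. The interfaces, the `{ y }` wrapper shape, the output field names for prefill and decode power, and the function name are assumptions for illustration; only the three source field names come from this PR.

```typescript
// Hypothetical, trimmed-down row shape; the real AggDataEntry has many more fields.
interface AggDataEntrySketch {
  prefill_avg_power_w?: number | null;
  decode_avg_power_w?: number | null;
  joules_per_input_token?: number | null;
}

// Assumed wrapper shape for a chartable metric value.
interface MetricValueSketch {
  y: number;
}

interface MeasuredStageMetricsSketch {
  measuredPrefillPower?: MetricValueSketch; // assumed name
  measuredDecodePower?: MetricValueSketch; // assumed name
  measuredJPerInputToken?: MetricValueSketch;
}

function buildMeasuredStageMetrics(row: AggDataEntrySketch): MeasuredStageMetricsSketch {
  const out: MeasuredStageMetricsSketch = {};
  // typeof checks (rather than truthiness) keep 0 and skip undefined/null.
  if (typeof row.prefill_avg_power_w === 'number') {
    out.measuredPrefillPower = { y: row.prefill_avg_power_w };
  }
  if (typeof row.decode_avg_power_w === 'number') {
    out.measuredDecodePower = { y: row.decode_avg_power_w };
  }
  if (typeof row.joules_per_input_token === 'number') {
    out.measuredJPerInputToken = { y: row.joules_per_input_token };
  }
  return out;
}
```

A truthiness check such as `if (row.prefill_avg_power_w)` would drop a 0 W value, which is the failure mode the zero-preservation test guards against. Each field is checked independently, so a row carrying only one stage's data yields only that metric.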

Feature gate stays in place — merging does not expose measured power publicly.

Minor pre-existing observation (not introduced by this PR, not blocking): chart-display-helpers.tsx:117 uses startsWith('y_j') to gate the Joules disagg caveat, which doesn't match any y_measured* metric — including the existing y_measuredJPerOutputToken/y_measuredJPerTotalToken. Since the new y_measuredJPerInputToken follows the same pre-existing pattern, there's no new regression here. If/when the Measured Energy group is ungated, that's the place where a y_measured caveat branch would need to be added.
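To make the prefix mismatch concrete, here is a small sketch of the check as described, plus one possible broader variant. The function names and the `y_measuredJ` prefix are assumptions for illustration, not the code in chart-display-helpers.tsx.

```typescript
// Mirrors the described check; the function name is hypothetical.
function matchesJoulesCaveatPrefix(metricKey: string): boolean {
  return metricKey.startsWith('y_j');
}

// Metric keys named in this review.
const measuredJouleKeys = [
  'y_measuredJPerInputToken',
  'y_measuredJPerOutputToken',
  'y_measuredJPerTotalToken',
];

// None of the measured keys begin with 'y_j', so this list is empty.
const keysHitByCurrentCheck = measuredJouleKeys.filter(matchesJoulesCaveatPrefix);

// One possible broader check (an assumption, not the repo's fix):
// also accept measured Joules keys, but not measured power keys.
function matchesJoulesCaveatBroad(metricKey: string): boolean {
  return metricKey.startsWith('y_j') || metricKey.startsWith('y_measuredJ');
}

const keysHitByBroadCheck = measuredJouleKeys.filter(matchesJoulesCaveatBroad);
```

Whether the caveat should also apply to measured power metrics is a product question the review leaves open; the broad variant above deliberately covers only the Joules keys.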

Read changed files
Verify mechanical-copy claim across 11 sync points
Check unofficial-run path coverage
Check tests (7 new cases)
Post review
• Branch: feat/dashboard-prefill-decode-power

arygupt requested a review from adibarra as a code owner June 1, 2026 03:09

vercel Bot deployed to Preview June 1, 2026 03:09 View deployment

arygupt closed this Jun 1, 2026

arygupt deleted the feat/dashboard-prefill-decode-power branch June 1, 2026 03:21

arygupt mentioned this pull request Jun 1, 2026

feat(dashboard): surface prefill/decode power-per-GPU + J/Input (measured energy) #414

Open

3 tasks

arygupt wants to merge 1 commit into master from feat/dashboard-prefill-decode-power
