
feat(dashboard): surface prefill/decode power-per-GPU + J/Input (measured energy) #414

Merged
arygupt merged 2 commits into master from feat/measured-power-metrics-display on Jun 1, 2026


Conversation

@arygupt
Collaborator

@arygupt arygupt commented Jun 1, 2026

What

Adds Prefill Power/GPU, Decode Power/GPU, and J / Input token as selectable metrics in the Measured-Energy group (behind the existing konami gate), alongside the avg-power / J-out / J-total metrics already on master.

Re-land of #411 (closed unmerged) — cherry-picked cleanly onto a fresh branch off master:

  • d63103c feat(dashboard): add prefill/decode/J-input as measured-energy metrics
  • 3f2399c style(chart-utils): satisfy oxfmt format check

Why

The per-stage measured power these metrics read is now live in prod. The multinode disagg rows — gb300 DeepSeek-V4-Pro and mi355x MiniMax-M2.5 — carry prefill_avg_power_w / decode_avg_power_w / joules_per_input_token as of today's ingest (run 26607091549). This PR is the display layer that surfaces them; without it the data is queryable but not selectable in the chart/table.

Scope / safety

Test plan

  • CI green (lint / typecheck / unit / build)
  • Metrics appear in the Measured-Energy dropdown; selecting Prefill/Decode Power renders the table column
  • Local: tsc clean, unit tests green

🤖 Generated with Claude Code


Note

Low Risk
Pure frontend chart/config and type wiring with optional fields and existing konami gate; no API, ingest, or auth changes.

Overview
Surfaces per-stage measured energy in the inference dashboard: prefill/decode average power per GPU and measured J per input token, alongside the existing cluster-average power, J/out, and J/total metrics.

Chart points now map runner fields prefill_avg_power_w, decode_avg_power_w, and joules_per_input_token into optional measuredPrefillAvgPower, measuredDecodeAvgPower, and measuredJPerInputToken (same typeof === 'number' gating as other telemetry so legacy rows stay absent, not zero). Wiring is mirrored in createChartDataPoint, trend interpolation (useInterpolatedTrendData), inference-chart-config.json (interactivity + e2e), roofline unions, and the Measured Energy Y-axis group in ChartControls (still feature-gated).

Unit tests cover independent prefill/decode emission, legacy omission, and zero-W preservation.
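
For readers skimming the overview, the gating described above can be sketched roughly as follows. This is an illustration, not the code in this PR: `RunnerRow`, `MeasuredStageFields`, and `mapMeasuredStageFields` are hypothetical names, and the real `AggDataEntry` / `InferenceData` types carry many more fields. Only the three runner field names and the three chart-point field names come from the description.

```typescript
// Hypothetical, trimmed-down input shape; only the three field names are from the PR.
interface RunnerRow {
  prefill_avg_power_w?: number | null;
  decode_avg_power_w?: number | null;
  joules_per_input_token?: number | null;
}

// Hypothetical output shape holding just the three new optional chart-point fields.
interface MeasuredStageFields {
  measuredPrefillAvgPower?: number;
  measuredDecodeAvgPower?: number;
  measuredJPerInputToken?: number;
}

// Copy each field only when it is a real number. A legacy row therefore yields
// no key at all, and a genuine 0 W reading is kept instead of being dropped as falsy.
function mapMeasuredStageFields(row: RunnerRow): MeasuredStageFields {
  const out: MeasuredStageFields = {};
  if (typeof row.prefill_avg_power_w === 'number') {
    out.measuredPrefillAvgPower = row.prefill_avg_power_w;
  }
  if (typeof row.decode_avg_power_w === 'number') {
    out.measuredDecodeAvgPower = row.decode_avg_power_w;
  }
  if (typeof row.joules_per_input_token === 'number') {
    out.measuredJPerInputToken = row.joules_per_input_token;
  }
  return out;
}

// The three cases the unit tests are described as covering.
const legacy = mapMeasuredStageFields({}); // legacy row: no keys emitted
const idle = mapMeasuredStageFields({ prefill_avg_power_w: 0 }); // zero is preserved
const decodeOnly = mapMeasuredStageFields({
  decode_avg_power_w: 812.5,
  joules_per_input_token: null, // null is not a number, so the key stays absent
});
```

A truthiness check (`if (row.prefill_avg_power_w)`) would silently drop the 0 W case, which is presumably why the `typeof` form is used for telemetry. One caveat of this style is that `typeof NaN` is also `'number'`, so a NaN reading would pass through.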

Reviewed by Cursor Bugbot for commit 3f2399c.

arygupt and others added 2 commits June 1, 2026 13:03
Wire the per-stage measured-power telemetry the runner already emits
(prefill_avg_power_w, decode_avg_power_w, joules_per_input_token) as three
new selectable Y-axis metrics, mirroring the existing measuredAvgPower trio:

- Measured Prefill Power per GPU (W)
- Measured Decode Power per GPU (W)
- Measured J per Input Token (J/tok)

Added across both chart configs (interactivity + e2e), Y_AXIS_METRICS,
YAxisMetricKey, ChartDefinition, InferenceData, createChartDataPoint, the
roofline machinery (type unions / roof-reset / markRooflinePoints), the
lightweight trend-point builder, and the gated "Measured Energy" dropdown
group (stays behind the existing feature gate).

Purely additive — source fields already exist on AggDataEntry; no runner or
DB change. Closes the gap where disagg per-stage power was ingested but not
renderable. Validated against GB300 disagg data (run 26607091549).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Collapse two else-if conditions that fit on one line; oxfmt --check
flagged chart-utils.ts in CI (oxc job). No logic change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@arygupt arygupt requested a review from adibarra as a code owner June 1, 2026 20:23
@vercel

vercel Bot commented Jun 1, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| inferencemax-app | Ready | Preview, Comment | Jun 1, 2026 8:24pm |


@claude

claude Bot commented Jun 1, 2026

Claude finished @arygupt's task in 2m 18s


Review complete

🟢 LGTM — no blocking issues found

What I checked

  • chart-utils.ts — all 4 metric registration sites updated (Y_AXIS_METRICS, createChartDataPoint, calculateRoofline/computeAllRooflines type unions, markRooflinePoints reset + assign)
  • types.ts: InferenceData, YAxisMetricKey, and ChartDefinition all extended symmetrically
  • inference-chart-config.json — entries added to both e2e and interactivity chart definitions, with roofline directions for JPerInputToken matching the JPerOutputToken pattern (lower_right / lower_left)
  • ChartControls.tsx — new metrics added to the Measured-Energy group (behind the existing gated: true konami toggle)
  • useInterpolatedTrendData.ts: rowToLightweightPoint mirrors the new createChartDataPoint branches
  • Tests — 7 new unit tests cover each field independently, the legacy-row omission case, zero-value preservation (vs. falsy coercion), and a combined full-row check
  • Overlay path covered: buildChartData → transformBenchmarkRows → rowToAggDataEntry (already carries prefill_avg_power_w / decode_avg_power_w / joules_per_input_token) → createChartDataPoint. No overlay-specific code branch is needed for a metric-registration-only change like this one.

Notes

  • The two new per-GPU power metrics intentionally have no _roofline config key, matching measuredAvgPower — power-per-GPU has no meaningful frontier. computeAllRooflines guards on actualDataYKey && rooflineDirection, so this skips cleanly.
  • benchmark-transform.test.ts already has coverage for the BenchmarkRow → AggDataEntry side of these three fields; combined with the new chart-utils tests, both transform stages are tested.
  • No new interactive elements introduced — the existing metric-dropdown change tracking already fires when these options are selected, so no new track() call is needed.
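
To illustrate the first note, here is a minimal sketch of how a guard on "data key and direction both present" lets the per-GPU power metrics skip roofline computation without any special casing. The names `MetricConfig`, `yKey`, and `metricsWithRoofline` are hypothetical and the config shape is an assumption; only the `actualDataYKey && rooflineDirection` condition and the `lower_right` direction value are taken from the review above.

```typescript
// Hypothetical per-metric config entry: a metric opts into a roofline
// only by declaring a direction alongside its data key.
interface MetricConfig {
  yKey?: string;
  rooflineDirection?: string; // e.g. 'lower_right'; absent for power-per-GPU
}

// Return the names of metrics that would get a roofline computed.
function metricsWithRoofline(configs: Record<string, MetricConfig>): string[] {
  const selected: string[] = [];
  for (const [name, cfg] of Object.entries(configs)) {
    // Mirrors the described guard: both the data key and a direction must be set.
    if (cfg.yKey && cfg.rooflineDirection) {
      selected.push(name);
    }
  }
  return selected;
}

// Example config (assumed shape): only J per input token declares a direction.
const exampleConfigs: Record<string, MetricConfig> = {
  measuredPrefillAvgPower: { yKey: 'measuredPrefillAvgPower' },
  measuredDecodeAvgPower: { yKey: 'measuredDecodeAvgPower' },
  measuredJPerInputToken: {
    yKey: 'measuredJPerInputToken',
    rooflineDirection: 'lower_right',
  },
};

const withRoofline = metricsWithRoofline(exampleConfigs);
```

With this shape, leaving the direction off a config entry is enough to exclude a metric, so the two power metrics need no explicit skip list.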

@arygupt arygupt merged commit 9abb3d3 into master Jun 1, 2026
18 checks passed
@arygupt arygupt deleted the feat/measured-power-metrics-display branch June 1, 2026 22:51