feat(dashboard): render per-prefill / per-decode / total measured power #411
arygupt wants to merge 1 commit into
Wire the per-stage measured-power telemetry the runner already emits (prefill_avg_power_w, decode_avg_power_w, joules_per_input_token) as three new selectable Y-axis metrics, mirroring the existing measuredAvgPower trio:
- Measured Prefill Power per GPU (W)
- Measured Decode Power per GPU (W)
- Measured J per Input Token (J/tok)
Added across both chart configs (interactivity + e2e), Y_AXIS_METRICS, YAxisMetricKey, ChartDefinition, InferenceData, createChartDataPoint, the roofline machinery (type unions / roof-reset / markRooflinePoints), the lightweight trend-point builder, and the gated "Measured Energy" dropdown group (stays behind the existing feature gate).
Purely additive: source fields already exist on AggDataEntry; no runner or DB change. Closes the gap where disagg per-stage power was ingested but not renderable.
Validated against GB300 disagg data (run 26607091549).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude finished @arygupt's task in 2m 40s
🟢 LGTM: no blocking issues found
This is a clean, mechanical extension of the existing measured-power trio. I verified every sync point ends up consistent:
Sync points checked (all 11 in agreement):
Unofficial-run path: Source fields (
Feature gate stays in place: merging does not expose measured power publicly.
Minor pre-existing observation (not introduced by this PR, not blocking):
What
Adds three new selectable Y-axis metrics to the inference chart, surfacing the per-stage measured-power telemetry the runner already emits but the dashboard could not render:
| Source field (on AggDataEntry) | Mirrors existing metric |
| --- | --- |
| prefill_avg_power_w | measuredAvgPower |
| decode_avg_power_w | measuredAvgPower |
| joules_per_input_token | measuredJPerOutputToken |
Net result in the gated "Measured Energy" dropdown: prefill / decode / total power + J input / output / total, i.e. the per-prefill / per-decode / total split for disaggregated runs.
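As a rough illustration of how the three new entries might sit in a metric registry, here is a minimal TypeScript sketch. Only the labels and source field names come from this PR; the key names (measuredPrefillPower, measuredDecodePower, measuredJPerInputToken), the entry shape, and the visibleMetricKeys helper are assumptions made for illustration, not the repository's actual definitions.

```typescript
// Hypothetical entry shape; not the repository's real ChartDefinition type.
interface MeasuredMetricDef {
  label: string;       // dropdown label (taken from the PR's commit message)
  sourceField: string; // field read from the benchmark row (named in the PR)
  gated: boolean;      // true while the metric stays behind the feature gate
}

// Key names below are assumed for illustration only.
const NEW_MEASURED_METRICS = {
  measuredPrefillPower: {
    label: "Measured Prefill Power per GPU (W)",
    sourceField: "prefill_avg_power_w",
    gated: true,
  },
  measuredDecodePower: {
    label: "Measured Decode Power per GPU (W)",
    sourceField: "decode_avg_power_w",
    gated: true,
  },
  measuredJPerInputToken: {
    label: "Measured J per Input Token (J/tok)",
    sourceField: "joules_per_input_token",
    gated: true,
  },
} as const;

// Deriving the key union from the registry keeps the two in sync at compile time.
type NewMeasuredMetricKey = keyof typeof NEW_MEASURED_METRICS;

// Returns the metric keys a dropdown could offer for a given gate state.
function visibleMetricKeys(gateUnlocked: boolean): NewMeasuredMetricKey[] {
  const keys = Object.keys(NEW_MEASURED_METRICS) as NewMeasuredMetricKey[];
  return keys.filter((key) => {
    const def: MeasuredMetricDef = NEW_MEASURED_METRICS[key];
    return gateUnlocked || !def.gated;
  });
}
```

With every entry marked gated, visibleMetricKeys(false) yields an empty list, which matches the PR's statement that merging does not expose the metrics publicly.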
Why
The disagg per-stage power data flows all the way through the pipeline (DB metrics JSONB + unofficial-run artifact path → rowToAggDataEntry → chart point), but only three measured metrics were ever wired as selectable Y-axis options (total power, J/output, J/total). The per-stage fields existed on the row but were never wrapped into metrics. This closes that gap.
How
Purely additive / mechanical: copies the existing measuredAvgPower trio pattern across every site:
- inference-chart-config.json (both interactivity + e2e charts; J/input roofline lower_right / lower_left to match J/output)
- Y_AXIS_METRICS, YAxisMetricKey, ChartDefinition, InferenceData (types)
- createChartDataPoint + roofline machinery (both type unions, roof-reset, markRooflinePoints)
- useInterpolatedTrendData lightweight trend-point builder
- ChartControls gated "Measured Energy" group (stays gated: true)
- chart-utils.test.ts: 7 new cases (emit / omit-legacy / zero-preservation / per-stage independence / full-disagg row)
No runner, ETL, or DB change: source fields already exist on AggDataEntry; packages/constants already lists the keys.
Still gated
The metrics remain behind the existing ↑ ↑ ↓ ↓ feature gate, so merging does not expose measured power publicly. Ungating is a separate, deliberate product decision (follow-up).
How to preview (no DB needed)
On the Vercel preview this PR posts:
- …/inference?unofficialrun=26607091549 (case-insensitive; fetches the GB300 run's GitHub artifacts directly; needs GITHUB_TOKEN in the preview env, which already powers the PR unofficial-run visualizer).
- dsv4/gb300 points → screenshot each. Expect prefill > decode per-GPU watts (compute- vs memory-bound).
Verification
- tsc --noEmit: clean (the Y_AXIS_METRICS / YAxisMetricKey / ChartDefinition / InferenceData sites all agree).
- oxlint on changed files: clean.
Out of scope (follow-ups)
- workers[] visualization (data is carried to the point, not yet rendered).
🤖 Generated with Claude Code
Note
Low Risk
Additive frontend chart and type wiring with optional-field guards; no API, auth, or data-pipeline changes.
Overview
Adds three selectable measured-energy Y-axis metrics on the inference dashboard—prefill power, decode power, and J per input token—so disaggregated runs can show per-stage telemetry that already exists on benchmark rows but was not chartable.
The change mirrors the existing total measured-power pattern: map prefill_avg_power_w, decode_avg_power_w, and joules_per_input_token into chart points in createChartDataPoint and trend interpolation, register them in chart config and Y_AXIS_METRICS, extend roofline marking, and list them under the gated Measured Energy dropdown (prefill/decode/total power plus input/output/total J). Legacy rows without per-stage fields stay omitted (not zero-filled). Unit tests cover emit, omit, zero preservation, and full disagg rows.
No runner, DB, or ETL changes; the feature gate for measured energy is unchanged.
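The omit-versus-zero behaviour described above can be sketched as follows. This is a minimal illustration, not the actual createChartDataPoint code: the AggRowSketch and PerStagePointSketch types, the chart-point key names, and the isMeasured helper are all hypothetical, and rejecting NaN is an extra assumption of this sketch. Only the three source field names come from the PR.

```typescript
// Subset of the row fields named in the PR; optional because legacy rows lack them.
interface AggRowSketch {
  prefill_avg_power_w?: number | null;
  decode_avg_power_w?: number | null;
  joules_per_input_token?: number | null;
}

// Hypothetical chart-point keys, assumed for illustration.
interface PerStagePointSketch {
  measuredPrefillPower?: number;
  measuredDecodePower?: number;
  measuredJPerInputToken?: number;
}

// Accepts finite numbers, including 0; rejects undefined, null, NaN, and Infinity.
function isMeasured(value: number | null | undefined): value is number {
  return typeof value === "number" && isFinite(value);
}

// Copies each per-stage field independently. A missing field leaves the key
// absent (not zero-filled), while a genuine 0 reading is preserved.
function perStageMetrics(row: AggRowSketch): PerStagePointSketch {
  const point: PerStagePointSketch = {};
  if (isMeasured(row.prefill_avg_power_w)) {
    point.measuredPrefillPower = row.prefill_avg_power_w;
  }
  if (isMeasured(row.decode_avg_power_w)) {
    point.measuredDecodePower = row.decode_avg_power_w;
  }
  if (isMeasured(row.joules_per_input_token)) {
    point.measuredJPerInputToken = row.joules_per_input_token;
  }
  return point;
}
```

A plain truthiness check (`if (row.prefill_avg_power_w)`) would drop a legitimate 0 reading, which is why the sketch tests the type explicitly instead.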
Reviewed by Cursor Bugbot for commit 0e85ae7.