feat(console): traces page on iii-browser-sdk + portability/quality pass#1649
Draft
andersonleal wants to merge 9 commits into
Draft
feat(console): traces page on iii-browser-sdk + portability/quality pass#1649andersonleal wants to merge 9 commits into
andersonleal wants to merge 9 commits into
Conversation
…-by, engine-routing, and JSON log controls
Move the console traces page off the HTTP transport and onto the
iii-browser-sdk WebSocket bridge (engine bridge_port, surfaced through
console-rust runtime config). This lands the migration end to end and
re-restores three UX pieces that were dropped on the floor along the
way.
Transport
- console-rust: surface `bridge_port` on `ServerConfig` and emit it as
`bridgePort` in the frontend runtime config payload.
- console-frontend: add `iii-browser-sdk` dep, `EngineSdkProvider`
context (StrictMode-safe via `useState` lazy initializer + effect
cleanup), `getEngineBridgeWs()` config helper, mount provider under
`ConfigProvider`.
- Rewrite `api/observability/traces.ts` on top of `ISdk.trigger` with
per-function timeouts, generic `stripUndefined<T>`, and a
memory-exporter-not-enabled fallback for list/tree (clear still
rethrows by design).
- Add vitest + 15 tests covering fetch/tree/clear/group_by success,
error, and symmetry paths.
- Update all 5 call sites (`useTraceData`, `api/queries`, route,
`SessionDetailPanel`) to thread the SDK through.
Group-by view (recovered from stash@{0})
- New `lib/groupTraces.ts`, `hooks/useTraceGroups.ts`,
`components/traces/TraceGroupsView.tsx`, and
`components/traces/SessionDetailPanel.tsx`.
- `useTraceFilters` gains `groupBy?: GroupByOption` (default `'none'`).
- `TraceFilters` exposes a Group By popover next to Attributes
(yellow accent when active).
- Traces route renders the groups view inside the existing list scroll
area and the session detail inside the existing right-side panel,
mirroring the SpanPanel pattern.
Waterfall engine-routing controls (recovered from stash@{0})
- New `lib/spanLabel.ts` (`formatSpanLabel`, `getSpanKindIndicator`,
`isEngineRoutingPair`, `isEngineRoutingSpan`).
- WaterfallChart toolbar: "Collapse routing pairs" and "Hide engine
routing" checkboxes, both persisted to localStorage. Row rendering
dims engine spans, swaps verbose kind badges for compact indicators,
and shows a `+1` chip for merged pairs.
Span logs (recovered from stash@{0})
- `SpanLogsTab` extracts `EventAttributeRow`; values that look like a
JSON object/array (`iii.payload.json` and similar) are parsed and
pretty-printed in a `<pre>` block with `whitespace-pre overflow-x-auto`.
Docs
- Spec: `docs/superpowers/specs/2026-05-15-traces-iii-browser-migration-design.md`
- Plan: `docs/superpowers/plans/2026-05-15-traces-iii-browser-migration.md`
Verified
- `pnpm exec tsc --noEmit` clean
- `pnpm test` 15/15 green
…aces page Acts on a code-reviewer (opus) audit + follow-up eng-review of the traces feature with a portability lens. Score before: 5.5/10 (forkable but needs surgery on color tokens and provider wiring). After: theme-aware components, an injectable SDK provider, a tighter error path, named feature types, and a documented engine contract. Hex sweep — 68 literals → semantic tokens - WaterfallChart: 27 → 1 (only `#F97316` critical-path orange remains; iii-specific signal, not a theme color, comment in place). - TraceFilters: 33 → 0. - SpanLogsTab: 6 → 0. - TraceGroupsView: 3 → 0. - Inline-style hex in `getBarStyle()`, the mini-map and the `statusColors` object now read from `var(--success)`, `var(--error)`, `var(--muted)` so the waterfall re-themes under `[data-theme="light"]`. - Accent-yellow box-shadows converted to `color-mix(in srgb, var(--accent) NN%, transparent)` so they re-theme alongside the rest of the palette. EngineSdkProvider injectable - Accepts `sdk?: ISdk` (consumer owns lifecycle, no implicit shutdown) and `bridgeUrl?: string` (skip the ConfigProvider chain). - Default behavior unchanged for the current console; unlocks Storybook stories, tests with mock SDKs, and embedding in different host apps. Engine contract made explicit - Export `TRACES_RPC_FUNCTIONS` const from `api/observability/traces.ts` with versioning JSDoc — single discoverable surface for the four engine function IDs a port has to wire up. Type collision fixed - Rename the local `TraceGroup` (per-trace view-model) in `hooks/useTraceData.ts` to `TraceListItem`. The server-side per-attribute aggregate keeps the name `TraceGroup`. Drops the `ApiTraceGroup` alias at the import site in routes/traces. Documentation - ASCII data-flow diagrams at the top of `WaterfallChart.tsx` and `api/observability/traces.ts` describing pipeline + error policy. Error-path hygiene - `isGroupByUnavailable` regex tightened to `/function[_ ]not[_ ]found/i`. The previous `/group_by|404/` alternatives would silently swallow unrelated errors that happened to contain those substrings. - Routes/traces.tsx: dropped two redundant `console.warn` calls (the UI set `spansError` immediately after). Kept the `console.error` for the catch path, but gated it behind `import.meta.env.DEV` so production bundles don't leak diagnostics. - `useTraceData` fingerprint hashes every trace_id instead of first/last/count — was a latent bug if sort order ever flipped. Tests - Added `lib/groupTraces.test.ts` (14 tests): `groupByLabel`, `summarizeGroup` (incl. m+s long-duration format), `groupHeading` (incl. wildcard sentinel rewrite), `isGroupByUnavailable` (incl. the regression case the tightened regex was designed to catch — silent swallow of `failed to group_by: timeout` and 404-in-message errors). Misc - Dropped `'use client'` from `TraceFilters`, `useTraceFilters`, and `AttributesFilter`. No-ops in this Vite project; misleading copy from a Next.js reference. Verified - `pnpm exec tsc --noEmit` clean - `pnpm test` 29/29 green
… helper + STATUS_CONFIG tokens
Three follow-up portability improvements after the prior portability
commit landed.
SessionDetailPanel: lazy fetch per card (A2)
- Previously: `useQueries` at panel level fired one `fetchTraceTree`
per trace_id immediately on mount. A 50-trace session opened 50
parallel WebSocket calls; a 100-trace session opened 100.
- Now: each `TraceCard` owns its own `useQuery`, gated by `enabled: open`.
The first card auto-opens (`defaultOpen={idx === 0}`) so the user
sees content immediately; subsequent cards stay idle until clicked.
TanStack Query's 30s staleTime keeps re-opens instant.
- Net: a session with N traces fires 1 network call on open + at most
K calls where K = traces the user actually clicks. The "open
everything at once" footgun is gone.
formatPossibleJson extracted to lib/
- The JSON pretty-printer used by SpanLogsTab moved from a private
helper in the component to `lib/formatPossibleJson.ts`. Reusable
across any other surface that wants the same "looks like JSON?
parse + pretty-print, else fall through" behavior.
- 10 unit tests cover: nested objects, arrays, whitespace handling,
non-string inputs, empty strings, non-JSON-prefix strings, malformed
JSON, and strict-parse on trailing commas. Pinned behavior.
STATUS_CONFIG → CSS vars
- `lib/traceUtils.ts` STATUS_CONFIG (used by `SpanPanel` for status
pills) had hex literals + rgba shades baked in. Replaced with
`var(--success/error/muted)` and `color-mix(in srgb, var(--token) 8%,
transparent)` for the subtle bg/border tints. The status pills now
re-theme correctly under `[data-theme="light"]` along with the rest
of the palette.
spanLabel tests added
- 21 tests across the four pure helpers in `lib/spanLabel.ts`:
`getSpanKindIndicator` (5 OTel kinds + case-insensitivity + null
cases), `formatSpanLabel` (verb-prefix strip, service-prefix strip,
ordering, defensive cases), `isEngineRoutingSpan`, and
`isEngineRoutingPair` (parent/child match, mismatches, non-engine
service guard).
- A port targeting a different trace producer can reuse the test shape
and swap the routing config.
Verified
- `pnpm exec tsc --noEmit` clean
- `pnpm test` 60/60 green (was 29/29 before this commit)
…sign tokens Continues the portability pass into the pre-existing traces components the initial sweep didn't touch. Total this commit: ~150 Tailwind-class hex literals converted to semantic tokens across 15 files. Files swept - AttributesFilter, FlowView, ServiceBreakdown, SpanBaggageTab, SpanErrorsTab, SpanInfoTab, SpanLinksTab, SpanOtelLogsTab, SpanPanel, SpanTagsTab, TraceHeader, TraceMap, ViewSwitcher, WorkflowChain, FlameGraph. Mappings applied - bg-/border-/text-/hover:/divide- with the same hex map used in the first portability commit (see commit 4ae493e). - `#3B82F6` (info blue) -> `bg-info` / `text-info` / `border-info` (exact match to `--info`). - Inline-style hex in `SEVERITY_STYLES.accent` (SpanOtelLogsTab) converted to `var(--error)`, `var(--info)`, `var(--muted)`. Debug row's `text-gray-400` / `bg-gray-500/10` swapped to `text-muted` / `bg-muted/10` for consistency. - `SpanInfoTab` accent gradient: `linear-gradient(90deg, #F3F724, #D4D520)` -> `var(--accent), var(--accent-hover)`. - `FlowView` service-color fallback (`?? '#666666'`) -> `var(--muted)`. - `FlowView` accent ring glow: `shadow-[...rgba(243,247,36,...)]` -> `color-mix(in srgb, var(--accent) 20%, transparent)` so it re-themes. Intentional residual hex (documented in code) - `WaterfallChart.tsx`: `#F97316` critical-path orange (iii-specific signal, not a theme color; comment in place). - `SpanOtelLogsTab.tsx`: `#F59E0B` iii-amber for OTel warn severity. Global `--warning` token is yellow (= accent), so warn uses its own amber to differentiate. Comment in `SEVERITY_STYLES` explains the rationale and suggests `--severity-warn` as a future token. Canvas theming gap (documented, not fixed) - `FlameGraph.tsx` (19 hex) and `TraceMap.tsx` (8 hex) draw to <canvas> via `ctx.fillStyle` / `ctx.strokeStyle`. Canvas does not pick up Tailwind classes or CSS custom properties. Each file now has a THEMING NOTE in its header describing the recommended refactor pattern: read CSS custom properties via getComputedStyle on render (or theme-change MutationObserver) and pass resolved hex into draw calls. Until applied, these two views render in fixed dark-mode colors regardless of host theme — flagged for a porter, not a blocker for the current console. Verified - `pnpm exec tsc --noEmit` clean - `pnpm test` 60/60 green
`useTraceFilters` had the camelCase->snake_case translation and
range-swap validation buried inside a `useMemo` — pure logic with
no React dep, but untestable without a render. Extracted to
`lib/traceFilters.ts` so a porter can reuse or replace the logic
without depending on the hook surface.
What moved
- `buildFilterParams(filters): { params, warnings }` — pure
translation + range validation. Swaps inverted duration min/max
and inverted startTime/endTime, flagging each in the returned
warnings.
- `countActiveFilters(filters): number` — pure non-default-field
counter used by the chip-count badge.
The hook now delegates and stays focused on debouncing, persistent
state, and reset semantics. Behavior unchanged.
Tests added (24)
- Happy-path camelCase -> snake_case translation for every field
- Duration-range: passthrough, swap-and-flag, single-bounded,
equal-bounds edge case
- Time-range: passthrough, swap-and-flag, single-bounded,
independent swap of duration AND time-range in same call
- Sort: sortBy + sortOrder passthrough
- Active-filter count: zero defaults, each individual field, empty
array rejection, default-value rejection, sum of multiple
Verified
- `pnpm exec tsc --noEmit` clean
- `pnpm test` 84/84 green (was 60 before this commit)
`useTraceData` had the StoredSpan -> TraceListItem transformation inline. Extracted to `lib/traceListItem.ts` so the mapping is testable without a hook render and reusable in non-React contexts (server render, alternate transport, fixtures). What moved - `mapSpanToListItem(span): TraceListItem` — pure span->row mapping. Handles unix-nano time conversion, status normalization (case-insensitive "error" match), OTel semantic attribute lookup (faas.invoked_name with function_id fallback, messaging.destination.name), and service-name "unknown" fallback. - `normalizeSpanAttributes(attrs)` — flattens the dual attribute encodings the engine emits (array-of-tuples OR plain object) into a single object. Was buried inside the hook. - `fingerprintTraceList(traces): string` — identity fingerprint helper for fetch-result dedup. Joins all trace IDs in order so middle-only churn isn't missed (was the latent bug in the prior first/last/count fingerprint). The hook now delegates and stays focused on query orchestration, hover-defer behavior, and new-trace highlight tracking. Tests added (30) - normalizeSpanAttributes: array-of-tuples shape, plain-object shape, undefined/null inputs, malformed tuples (length < 2), non-string key coercion, no-mutation contract. - mapSpanToListItem basic shape: ns->ms time conversion using realistic 2024 timestamps that exercise the divide-by-1M branch of `toMs`, duration computation, trace_id/name/service_name forwarding, "unknown" fallback, spanCount always = 1. - Status normalization (case-insensitive "error"): table-driven test across "error"/"Error"/"ERROR"/"errOr" and the ok/OK/unset/UNSET/empty/other cases. - Semantic attributes: faas.invoked_name primary, function_id legacy fallback, faas wins when both present, messaging.destination.name -> topic, plain-object attrs shape also works. - fingerprintTraceList: stable for identical lists, differs when order changes (catches sort-direction flips), differs when ONLY the middle item changes (the prior fingerprint's blind spot), distinct value for empty lists. Verified - `pnpm exec tsc --noEmit` clean - `pnpm test` 114/114 green (was 84 before this commit)
…bject - `biome check --write` across the traces feature: import ordering and line-wrap formatting harmonized to project style. No behavior change. - `useTraceFilters` filter-only-params memo: switched dep array from 10 individual filter fields to `[filters]`. Biome flagged the field-list as more-specific-than-capture; the field-level optimization was protecting downstream queryKey identity, but TanStack Query compares queryKeys structurally so the savings were illusory. - `useTraceData`: `React.MutableRefObject<boolean>` -> `React.RefObject<boolean>` (deprecated in React 19; `RefObject` is the current type). Verified - `pnpm exec tsc --noEmit` clean - `pnpm test` 114/114 green - `biome check` clean (0 errors, 0 warnings) on the touched feature
30 more unit tests across the two pre-existing lib files that were the last untested pure surface in the traces feature. `traceUtils.test.ts` (20 tests) - formatDuration: 0μs/μs/ms/s tier transitions, 2-decimal precision - formatRelative: μs/ms/s tiers, negative-value recursive quirk (current "-+Xms" output is pinned, not corrected) - classifySpanType: enqueue (publish + destination.name), trigger (name prefix + iii.function.kind attr), function default, enqueue-wins-over-trigger ordering, missing-attributes fallback - getServiceName: explicit service_name wins, dot-segment fallback, flat-name passthrough, empty-string-is-falsy edge case `traceTransform.test.ts` (10 tests) - toMs: ms-magnitude passthrough, ns-magnitude divide-by-1e6, NANO_THRESHOLD boundary (> not >=), non-finite -> 0 - calculateDurationMs: ms-input diff, ns-input diff with toBeCloseTo for the float64 precision loss inherent to dividing 1.7e18-scale values, mixed-magnitude inputs (one ns one ms), end < start guard, NaN coercion quirk (NaN -> 0 via toMs, NOT short-circuit to 0 — pinned as a known caller-validates contract) Verified - `pnpm exec tsc --noEmit` clean - `pnpm test` 144/144 green (was 114 before this commit)
…20 tests `buildSpanTree` (parent linking + critical-path marking) and `flattenTree` (depth-offset hide + handle_invocation+call pair merge) were the trickiest pure functions in the entire feature and lived inline inside the 560-line `WaterfallChart` component. Extracted to `lib/spanTree.ts` so a porter can: - Reuse the same waterfall layout against a non-React renderer (canvas, SVG export, screen-reader text view). - Customize the routing-pair merge rules by replacing the lib (or via a future config-injection refactor). - Trust the depth-offset logic via tests instead of staring at a recursive traverse. What moved - `SpanNode`, `FlatSpanRow`, `FlattenOptions` interfaces - `buildSpanTree(spans)` — parent linking, critical-path greedy DFS - `flattenTree(nodes, opts)` — DFS pre-order flatten with hideEngineRouting depth-offset and collapseEngineRoutingPairs merge Tests added (20) - buildSpanTree linking: empty input, no-parent->root, parent linking, missing-parent fallback to root, input-order preservation - buildSpanTree critical path: linear chain, sibling slow-wins, cumulative-vs-immediate (deep slow descendant beats wider immediate sibling), recursive unmark of non-critical subtrees - flattenTree basics: DFS pre-order, collapsed-node hides subtree, default displayDepth equals node.depth - flattenTree hideEngineRouting: single-hidden-ancestor depth shift by 1, stacked depth shift through multiple hidden ancestors, Math.max(0,...) clamp, non-engine spans unchanged - flattenTree collapseEngineRoutingPairs: merge single handle_invocation/call pair, no merge with multiple children, no merge with mismatched function names, hide-precedes-merge Verified - `pnpm exec tsc --noEmit` clean - `pnpm test` 164/164 green (was 144 before this commit) - `biome check` clean on all three files
Contributor
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Contributor
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Migrates the console TRACES page off the HTTP transport and onto the
iii-browser-sdkWebSocket bridge (enginebridge_port), then takes the feature through a portability + code-quality pass. Result: 164 unit tests where there were 0, every Tailwind hex literal in the feature converted to design tokens, hooks broken down into pure-logic libs, and an injectableEngineSdkProviderso the feature can be embedded outside this console.Why
The traces page was the only console feature still on the HTTP transport. Migrating it brings the whole console onto one transport surface (the SDK) and removes a parallel codepath. Once on the SDK, the natural follow-up was "could a different React app reuse this feature?" — this PR answers yes and lays the test/seam groundwork.
What's in here
Transport
console-rust: surfacebridge_portonServerConfigand emit it asbridgePortin the runtime config payload.iii-browser-sdkdep added toconsole-frontend;EngineSdkProviderwraps the app.api/observability/traces.tsrewritten on top ofISdk.triggerwith per-function timeouts, genericstripUndefined<T>, and a memory-exporter-not-enabled fallback for list/tree (clear still rethrows by design).TRACES_RPC_FUNCTIONSwith versioning JSDoc.Recovered features (from stash)
lib/groupTraces.ts,hooks/useTraceGroups.ts,components/traces/TraceGroupsView.tsx,components/traces/SessionDetailPanel.tsx, group-by popover inTraceFilters.lib/spanLabel.tsengine-routing heuristics.SpanLogsTab: JSON pretty-printer foriii.payload.jsonattributes.Portability + theming
bg-sidebar,border-border-subtle,text-accent, ...) so the feature re-themes under[data-theme=\"light\"]and in any host that overrides the CSS custom properties.STATUS_CONFIG(lib/traceUtils.ts) usesvar(--success/error/muted)+color-mix()for subtle tints.WaterfallChart+FlowViewusecolor-mix(in srgb, var(--accent) NN%, transparent).EngineSdkProvideracceptssdk?: ISdk(consumer-owned lifecycle) andbridgeUrl?: string(override theConfigProviderchain). Default behavior unchanged.WaterfallChart,api/observability/traces.ts,SessionDetailPanel,FlameGraph,TraceMap.TraceGroup(per-trace view-model) renamed toTraceListItemto deconflict with the server-side aggregate.Pure-logic extractions (all from hooks/components → testable libs)
lib/groupTraces.ts— group-by labels, summarize, wildcard sentinel rewritelib/spanLabel.ts— engine routing detection, label formattinglib/formatPossibleJson.ts— JSON pretty-print detectionlib/traceFilters.ts— camelCase→snake_case translation + range-swap validationlib/traceListItem.ts— span→view-model mapper + identity fingerprintlib/spanTree.ts— parent linking, critical-path greedy DFS, depth-offset hide, routing-pair mergeBehavior fixes
SessionDetailPanelnow lazy-fetches per card (was firing N paralleluseQuerieson mount — a 100-trace session opened 100 simultaneous WebSocket calls).useTraceDatafingerprint hashes all trace IDs (was first/last/count — would have missed middle-only churn if sort order ever flipped).isGroupByUnavailableregex tightened from/function_not_found|group_by|404/to/function_not_found/i. The old version silently swallowed legitimate errors that happened to contain "group_by" or "404" in the message.console.warn/console.errorinroutes/traces.tsxcleaned up; survivingconsole.errorgated behindimport.meta.env.DEV.'use client'no-ops dropped (3 files — Vite project, copy-paste from a Next.js reference).MutableRefObject→RefObjectdeprecation cleared.Tests added
api/observability/traces.test.tslib/groupTraces.test.tslib/spanLabel.test.tslib/formatPossibleJson.test.tslib/traceFilters.test.tslib/traceListItem.test.tslib/traceUtils.test.tslib/traceTransform.test.tslib/spanTree.test.tsOut of scope (deferred)
@testing-library/react+ jsdom infra.FlameGraph+TraceMap— documented refactor pattern in both file headers (getComputedStyle+ theme-change observer).WaterfallChartvirtualization —@tanstack/react-virtualintegration.api/queries.tsper-feature split — affects every console feature, not just traces.Test plan
pnpm exec tsc --noEmitcleanpnpm test164/164 greenbiome checkclean on touched filesiii dev, navigate to /traces, verify list, click a trace, see waterfall, toggle the new "Hide engine routing" + "Collapse routing pairs" checkboxesiii.payload.json— verify the JSON renders as a pretty-printed<pre>blockCommit stack