
feat(metrics): column-store + ragged-series + accumulator protocol foundation #969

Open
ajcasagrande wants to merge 2 commits into main from ajc/metrics-foundation

Contributor

@ajcasagrande ajcasagrande commented May 20, 2026

Summary

Foundational layer for a column-store-backed metrics accumulator pipeline. This PR ships only the protocol surface and the two list backends — no accumulator orchestration, no derived-latency math, no exporter rewires. Those land in follow-up PRs.

What it does

  • Introduces MetricSeriesProtocol (formerly MetricAggregator) as the runtime-checkable contract that every per-metric series backend must satisfy. The protocol moves from aiperf.metrics.metric_dicts to aiperf.common.accumulator_protocols; the old name remains as a back-compat alias.
  • Adds ColumnStore — a per-record-indexed store that holds one backend instance per metric and dispatches uniformly to either backend via add_for_record(idx, values).
  • Adds two backends:
    • RaggedSeries (new) — list-of-lists storage retaining every per-record observation; supports per-record replay.
    • TDigestListMetricAggregator (already on main from feat(metrics): t-digest aggregator for InterChunkLatencyMetric #865) — slotted in as a column-store backend after gaining add_for_record and SUPPORTS_PER_RECORD_REPLAY = False to satisfy the protocol.
  • Backend selection via AIPERF_METRICS_LIST_BACKEND={ragged,tdigest} (Environment.METRICS.LIST_BACKEND, default ragged).
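The uniform dispatch described in the list above can be sketched with toy stand-ins. This is a minimal illustration, not the code in `column_store.py`: `ExactBackend`, `SketchBackend`, and `ToyStore` are hypothetical names, and only `add_for_record(idx, values)` and `SUPPORTS_PER_RECORD_REPLAY` come from the PR description.

```python
class ExactBackend:
    """Toy stand-in for a list-of-lists backend: keeps every observation."""

    SUPPORTS_PER_RECORD_REPLAY = True

    def __init__(self) -> None:
        self.by_record: dict[int, list[float]] = {}

    def add_for_record(self, idx: int, values: list[float]) -> None:
        self.by_record.setdefault(idx, []).extend(values)

    def __len__(self) -> int:
        return sum(len(v) for v in self.by_record.values())


class SketchBackend:
    """Toy stand-in for a sketch backend: keeps only a global count and sum."""

    SUPPORTS_PER_RECORD_REPLAY = False

    def __init__(self) -> None:
        self.count = 0
        self.total = 0.0

    def add_for_record(self, idx: int, values: list[float]) -> None:
        # idx is accepted for interface parity but ignored: the summary is global.
        self.count += len(values)
        self.total += sum(values)

    def __len__(self) -> int:
        return self.count


class ToyStore:
    """Holds one backend instance per metric tag and forwards writes to it."""

    def __init__(self, backend_cls: type) -> None:
        self._backend_cls = backend_cls
        self._columns: dict[str, object] = {}

    def add(self, tag: str, idx: int, values: list[float]) -> None:
        backend = self._columns.get(tag)
        if backend is None:
            backend = self._columns[tag] = self._backend_cls()
        backend.add_for_record(idx, values)

    def column(self, tag: str):
        return self._columns[tag]
```

The caller in this sketch never branches on the backend type; swapping `ExactBackend` for `SketchBackend` changes what is retained, not how records are written.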

What's deliberately deferred (later PRs)

  • MetricsAccumulator orchestration class
  • accumulator_models.py, accumulator_sweeps.py, derived_latency.py, base_usage_record_metric.py
  • base_aggregate_metric.py aggregation_kind, base_metric.py console_group + classmethod conversions
  • Records-manager wiring, exporter wiring, branch-stats plumbing
  • metric_result_from_array(), observation_duration(), window_start_ns, window_end_ns in metric_dicts.py — these have no main-side callers and only get used by deferred modules; included in PR-7

Preserved invariants (verified)

Files

| Status | File | Lines |
|---|---|---|
| new | `src/aiperf/common/accumulator_protocols.py` | 191 |
| new | `src/aiperf/metrics/ragged_series.py` | 107 |
| new | `src/aiperf/metrics/column_store.py` | 503 |
| new | `src/aiperf/metrics/_column_store_handlers.py` | 74 |
| modified | `src/aiperf/metrics/list_metric_aggregation.py` | +25 / -7 |
| modified | `src/aiperf/metrics/metric_dicts.py` | +20 / -12 |
| modified | `src/aiperf/metrics/__init__.py` | +2 |
| modified | `src/aiperf/common/environment.py` | +4 |
| modified | `docs/environment-variables.md` | +1 (auto-generated) |
| modified | `tools/ergonomics_baseline.json` | +5 (column_store.py 503-line carve-out) |
| new | `tests/unit/common/test_accumulator_protocols.py` | 252 |
| new | `tests/unit/metrics/test_column_store.py` | 440 |
| new | `tests/unit/metrics/test_ragged_series.py` | 125 |

Total: 13 files, +1747 / -20.

Notes

  • The 4 net-new `.py` files are byte-identical to the source feature branch (`ajc/inferencex-agentx-mvp`). The two rewires (`list_metric_aggregation.py`, `metric_dicts.py`) are minimal protocol-conformance shims — they drop references to PR-7-deferred symbols (`MetricsAccumulator`, `accumulator_sweeps`) from docstrings so the file imports cleanly without those modules present.
  • `column_store.py` lands at exactly 503 lines (3 over the ergonomics file-size cap). The baseline carve-out in `tools/ergonomics_baseline.json` is the standard mechanism per the repo's documented workflow.
  • `RaggedSeries` deliberately does not conform to `MetricSeriesProtocol` directly — it's a backend (`add_for_record` + `SUPPORTS_PER_RECORD_REPLAY` only), not a top-level series. The full protocol is satisfied at the column-store layer.

Test plan

  • `uv run pytest tests/unit/ -n auto` — 12741 passed, 79 skipped, 0 failures (~32s)
  • `pre-commit run --all-files` — all 30 hooks pass (check-ergonomics, check-ruff-baselined, ruff, ruff format, generate-env-vars-docs, validate-plugin-schemas, etc.)
  • Independent validation pass: protocol substitutability verified end-to-end (`isinstance(TDigestListMetricAggregator(), MetricSeriesProtocol)`, `add_for_record`, `len`, `SUPPORTS_PER_RECORD_REPLAY` all work at runtime).
  • Circular-import audit clean (both fresh-interpreter and `aiperf.common.enums`-first import orders succeed).
  • Trim safety: `git grep` on `origin/main` for `metric_result_from_array`, `observation_duration`, `window_start_ns`, `window_end_ns` returns zero hits; no main-side callers exist.
  • `Environment.METRICS.LIST_BACKEND` references confined to `column_store.py` (3 sites) + `environment.py` (1 site) + auto-generated docs.
  • `_resolve_list_backend_class()` returns the correct backend class for both `ragged` and `tdigest` settings.

What downstream PRs need from this one

Once this lands, the follow-up `MetricsAccumulator` PR can:

  1. Import `MetricSeriesProtocol` from `aiperf.common.accumulator_protocols`.
  2. Instantiate `ColumnStore` with the configured backend.
  3. Dispatch per-record observations via `add_for_record(idx, values)` without caring which backend is in use.
  4. Replay or stream records by querying `SUPPORTS_PER_RECORD_REPLAY` and choosing the appropriate code path.
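Step 4 above amounts to branching on a capability flag. The sketch below shows one way a caller might do that; `ReplayableSeries`, `SummaryOnlySeries`, `values_for`, and `per_record_totals` are hypothetical names introduced for illustration, and only the `SUPPORTS_PER_RECORD_REPLAY` flag and `add_for_record` are taken from the PR text.

```python
from typing import Optional


class ReplayableSeries:
    SUPPORTS_PER_RECORD_REPLAY = True

    def __init__(self) -> None:
        self._rows: dict[int, list[float]] = {}

    def add_for_record(self, idx: int, values: list[float]) -> None:
        self._rows.setdefault(idx, []).extend(values)

    def values_for(self, idx: int) -> list[float]:  # hypothetical accessor
        return list(self._rows.get(idx, []))


class SummaryOnlySeries:
    SUPPORTS_PER_RECORD_REPLAY = False

    def __init__(self) -> None:
        self.total = 0.0

    def add_for_record(self, idx: int, values: list[float]) -> None:
        self.total += sum(values)  # per-record identity is not retained


def per_record_totals(series, record_ids: list[int]) -> Optional[dict[int, float]]:
    """Return per-record sums when the backend can replay, else None."""
    if not getattr(series, "SUPPORTS_PER_RECORD_REPLAY", False):
        return None  # caller falls back to a streaming / aggregate-only path
    return {idx: sum(series.values_for(idx)) for idx in record_ids}
```

Checking the flag rather than the concrete class keeps the caller independent of which backend was configured.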

Summary by CodeRabbit

  • New Features

    • Added configurable metrics storage backend selection for precise versus approximate percentile calculations.
  • Documentation

    • Documented new environment variable for metrics backend configuration.


This is PR-6 of a multi-PR breakdown of the metrics-accumulator
foundation work from ajc/inferencex-agentx-mvp. The goal is to land
the storage / protocol layer that the upcoming MetricsAccumulator
(PR-7) plugs into, without dragging the accumulator itself in yet.

Net-new files (copied verbatim from source branch):
  - src/aiperf/common/accumulator_protocols.py   (191 lines)
    MetricSeriesProtocol, AccumulatorProtocol, AnalyzerProtocol,
    StreamExporterProtocol, AccumulatorResult, ExportContext,
    SummaryContext. Only runtime import is MetricValueTypeVarT from
    aiperf.common.enums.metric_enums; everything else is gated behind
    TYPE_CHECKING so plugin-load doesn't trip a circular import.
  - src/aiperf/metrics/ragged_series.py          (107 lines)
    CSR-style storage for list-valued per-record metrics
    (inter_chunk_latency today). Exposes add_for_record(idx, values),
    grouped_cumsum(), get_values_for_mask(); declares
    SUPPORTS_PER_RECORD_REPLAY = True so the future sweep helpers can
    gate ICL-aware curves on backend capability.
  - src/aiperf/metrics/column_store.py           (503 lines)
    Session-indexed NaN-sparse columnar storage. Holds numeric,
    string, list, and 4 flavors of metadata columns (numeric, string,
    bool with uint8 sentinel, categorical with int32-intern). Per-tag
    setter closures cache the type-dispatch first sighting and skip
    the isinstance ladder on subsequent records. Backend class is
    picked from Environment.METRICS.LIST_BACKEND ("ragged" default,
    "tdigest" for bounded memory).
  - src/aiperf/metrics/_column_store_handlers.py  (74 lines)
    Closure factories for ColumnStore.ingest dispatch.
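The CSR-style layout mentioned for `ragged_series.py` (one flat value buffer plus per-record offsets) can be illustrated in pure Python. This is a rough sketch under assumptions: `ToyRagged`, `row`, and `cumsum_per_record` are hypothetical, the real module reportedly uses NumPy-backed growable arrays, and the exact semantics of `grouped_cumsum()` are not stated in this PR.

```python
from itertools import accumulate


class ToyRagged:
    """Hypothetical ragged store: one flat value list plus per-record spans."""

    def __init__(self) -> None:
        self.values: list[float] = []
        self.spans: dict[int, tuple[int, int]] = {}  # idx -> (start, length)

    def add_for_record(self, idx: int, values: list[float]) -> None:
        if not values:
            return
        # Records may arrive out of order; each span points into the flat list.
        self.spans[idx] = (len(self.values), len(values))
        self.values.extend(values)

    def row(self, idx: int) -> list[float]:
        start, length = self.spans.get(idx, (0, 0))
        return self.values[start : start + length]

    def cumsum_per_record(self) -> list[float]:
        """Running sum that restarts at each record boundary, in storage order."""
        out: list[float] = []
        for start, length in sorted(self.spans.values()):
            out.extend(accumulate(self.values[start : start + length]))
        return out
```

Because every observation stays addressable by record, a layout like this can support per-record replay, which is the capability the `SUPPORTS_PER_RECORD_REPLAY = True` flag advertises.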

Rewires (minimal, protocol-conformance only):
  - src/aiperf/metrics/list_metric_aggregation.py
    Subsumes the existing TDigestListMetricAggregator (#865) under
    MetricSeriesProtocol: adds __len__, add_for_record(idx, values)
    record-keyed alias (idx ignored — t-digest is a global sketch),
    SUPPORTS_PER_RECORD_REPLAY = False flag, __all__, and updates
    docstring references from MetricAggregator -> MetricSeriesProtocol.
  - src/aiperf/metrics/metric_dicts.py
    Replaces inline Protocol definition with an import of
    MetricSeriesProtocol from accumulator_protocols and a back-compat
    alias (MetricAggregator = MetricSeriesProtocol). Adds __all__ for
    explicit re-export. Existing callsites (derived_sum_metric.py,
    test_list_metric_aggregation.py) keep working through the alias.
  - src/aiperf/metrics/__init__.py
    Re-exports MetricSeriesProtocol.
  - src/aiperf/common/environment.py
    Adds METRICS.LIST_BACKEND: Literal["ragged", "tdigest"] field
    (default "ragged"), consumed by ColumnStore._resolve_list_backend_class().
    docs/environment-variables.md auto-regenerated.

Test files ported alongside the new modules:
  - tests/unit/common/test_accumulator_protocols.py   (252 lines)
  - tests/unit/metrics/test_column_store.py           (440 lines)
  - tests/unit/metrics/test_ragged_series.py          (125 lines)
  All three exercise only the protocol / column-store / ragged-series
  surface — no MetricsAccumulator dependencies.

Deliberately deferred to downstream PRs:
  - metric_result_from_array() and observation_duration() / window_*
    additions on metric_dicts.py — the source branch adds these to
    feed the accumulator's vectorized result builder, but they have
    no on-main callers. Skipped to keep this PR scoped to the
    protocol + storage surface.
  - metrics/__init__.py re-export of metric_result_from_array (same
    reason).
  - accumulator.py / accumulator_models.py / accumulator_sweeps.py /
    derived_latency.py / base_aggregate_metric.py aggregation_kind /
    base_metric.py console_group / base_usage_record_metric.py — all
    pulled in by PR-7.
  - AggregationKind and EFFECTIVE/ACTIVE MetricConsoleGroup additions
    on metric_enums.py — PR-7 (only used by accumulator).

Preserves #881 schema 1.1: not touching metrics_json_exporter.py or
export_models.py in this PR.
Preserves #865 t-digest behavior: TDigestListMetricAggregator now
slots in as a ColumnStore list backend through add_for_record and
still passes its existing test suite unchanged.

Verification:
  - uv run pytest tests/unit/ -n auto -> 12741 passed, 79 skipped
  - pre-commit run on all staged files -> all hooks pass
  - tools/check_ergonomics.py baseline regenerated to admit
    column_store.py (503 lines, just over the 500 cap). The 503-line
    figure is a clean storage surface; splitting it would fragment
    cohesive column-type concerns.

No surprises: the warned-about circular import via
aiperf.common.enums never tripped — accumulator_protocols only
runtime-imports MetricValueTypeVarT from metric_enums (which is a
leaf module relative to plugin load), and ColumnStore's
Environment.METRICS.LIST_BACKEND lookup is already deferred inside a
factory function for monkey-patch friendliness.
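The "deferred inside a factory function" point can be sketched as follows. This is an assumption-laden stand-in: the PR says the real lookup goes through `Environment.METRICS.LIST_BACKEND`, whereas this example reads `os.environ` directly, and `RaggedStandIn`, `TDigestStandIn`, and `resolve_list_backend_class` are hypothetical names.

```python
import os

ENV_VAR = "AIPERF_METRICS_LIST_BACKEND"  # variable name from the PR description


class RaggedStandIn:
    """Placeholder for the exact, replay-capable backend."""


class TDigestStandIn:
    """Placeholder for the bounded-memory sketch backend."""


_BACKENDS = {"ragged": RaggedStandIn, "tdigest": TDigestStandIn}


def resolve_list_backend_class() -> type:
    """Read the setting on every call so tests can change it after import."""
    name = os.environ.get(ENV_VAR, "ragged")
    try:
        return _BACKENDS[name]
    except KeyError:
        raise ValueError(f"unknown list backend: {name!r}") from None
```

Had the setting been captured in a module-level constant at import time, a test that patches the environment afterwards would still see the old value; resolving inside the function avoids that.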

Signed-off-by: Anthony Casagrande <acasagrande@nvidia.com>

github-actions Bot commented May 20, 2026

Try out this PR

Quick install:

pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@ea01d70ddd2aafc48c3aaf37b677076d2b60fc70

Recommended with virtual environment (using uv):

uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@ea01d70ddd2aafc48c3aaf37b677076d2b60fc70

Last updated for commit: ea01d70

@github-actions github-actions Bot added the feat label May 20, 2026

github-actions Bot commented May 20, 2026


coderabbitai Bot commented May 20, 2026

Walkthrough

This PR introduces a columnar storage system for per-record metrics in aiperf. It defines protocol contracts for the metrics pipeline, implements pluggable list-value backends (ragged arrays and t-digest sketches), creates ColumnStore for structured metric ingestion with metadata tracking, and integrates the new MetricSeriesProtocol across the codebase.

Changes

Metrics Storage Pipeline

| Layer / File(s) | Summary |
|---|---|
| Pipeline protocol contracts and contexts: `src/aiperf/common/accumulator_protocols.py`, `tests/unit/common/test_accumulator_protocols.py` | Defines AccumulatorProtocol, AnalyzerProtocol, StreamExporterProtocol, and MetricSeriesProtocol as runtime-checkable protocols, plus SummaryContext and ExportContext dataclasses for carrying execution metadata and cancellation state through pipeline stages. |
| RaggedSeries list-value backend: `src/aiperf/metrics/ragged_series.py`, `tests/unit/metrics/test_ragged_series.py` | Implements RaggedSeries as a ragged array storage backend with flat values, per-record offsets, and support for masked selection and per-record cumulative summation; advertises SUPPORTS_PER_RECORD_REPLAY = True. |
| TDigestListMetricAggregator protocol conformance: `src/aiperf/metrics/list_metric_aggregation.py` | Updates TDigestListMetricAggregator to conform to MetricSeriesProtocol with a `__len__` method for sample count and an add_for_record(idx, values) ingest path; sets SUPPORTS_PER_RECORD_REPLAY = False for bounded-memory sketch semantics. |
| ColumnStore initialization and handler closures: `src/aiperf/metrics/_column_store_handlers.py`, `src/aiperf/metrics/column_store.py` (constructor) | Introduces handler closure factories (make_numeric_handler, make_string_handler, make_list_handler) for efficient per-tag type dispatch during ingest, and ColumnStore constructor with pluggable backend selection, numeric/string/list/metadata storage layout, and timestamp initialization. |
| ColumnStore read and query APIs: `src/aiperf/metrics/column_store.py` (read methods) | Implements column access methods for numeric, string, and ragged metrics, running sum/count accessors, categorical metadata indexing with reverse lookup, and query_time_range() for selecting records by timestamp overlap. |
| ColumnStore ingest and growth: `src/aiperf/metrics/column_store.py` (write and growth methods) | Implements ingest() for per-record metrics with lazy per-tag handler caching and unsupported-type skipping, ingest_metadata() for numeric/string/bool/categorical metadata with string interning, and _grow() for dynamic capacity doubling with array reallocation and handler cache invalidation. |
| ColumnStore test coverage: `tests/unit/metrics/test_column_store.py` | Comprehensive unit tests covering initialization, numeric/string/list ingestion, out-of-order semantics, capacity growth and handler invalidation, metadata storage and accessors (including categorical interning and masking), time-range queries, and mixed-type record ingestion. |
| MetricSeriesProtocol integration: `src/aiperf/metrics/metric_dicts.py`, `src/aiperf/metrics/__init__.py` | Imports MetricSeriesProtocol from aiperf.common.accumulator_protocols, defines MetricAggregator as a back-compat alias, and re-exports from the metrics package namespace. |
| List backend selection and documentation: `src/aiperf/common/environment.py`, `docs/environment-variables.md`, `tools/ergonomics_baseline.json` | Adds Environment.METRICS.LIST_BACKEND configuration (default "ragged") to select between exact and approximate percentile storage, documents behavioral differences and ICL-aware fallbacks, and updates baseline for file-size violations. |
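The "lazy per-tag handler caching" idea in the walkthrough can be sketched as below. It is an illustration only: `ToyIngest` and its attributes are hypothetical, plain dicts stand in for the NaN-sparse arrays, and the real factories (`make_numeric_handler` and friends) may be shaped differently.

```python
from typing import Any, Callable, Optional

Handler = Callable[[int, Any], None]


class ToyIngest:
    """Hypothetical store that resolves a value's type once per tag."""

    def __init__(self) -> None:
        self.numeric: dict[str, dict[int, float]] = {}
        self.strings: dict[str, dict[int, str]] = {}
        self._handlers: dict[str, Handler] = {}
        self.ladder_runs = 0  # how many times the isinstance ladder executed

    def _make_handler(self, tag: str, value: Any) -> Optional[Handler]:
        self.ladder_runs += 1
        if isinstance(value, (int, float)):
            col = self.numeric.setdefault(tag, {})

            def numeric_handler(idx: int, v: Any) -> None:
                col[idx] = float(v)

            return numeric_handler
        if isinstance(value, str):
            scol = self.strings.setdefault(tag, {})

            def string_handler(idx: int, v: Any) -> None:
                scol[idx] = v

            return string_handler
        return None  # unsupported type

    def ingest(self, idx: int, record: dict[str, Any]) -> None:
        for tag, value in record.items():
            handler = self._handlers.get(tag)
            if handler is None:
                handler = self._make_handler(tag, value)
                if handler is None:
                    continue  # skip values the store cannot hold
                self._handlers[tag] = handler
            handler(idx, value)
```

Each closure captures its column directly, which is why the walkthrough notes that growth with array reallocation has to invalidate the handler cache: a cached closure would otherwise keep writing into the old buffer. The dict-based toy here does not reallocate, so it has no such step.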

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 A rabbit's ode to metrics flow,
Where columns store and backends show—
Ragged arrays dance, sketches hum,
Per-record ingestion, sums all sum!
Timestamps query, metadata sing,
What storage joy this PR doth bring! 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 41.55% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: introducing protocol foundations and two list storage backends (column-store and ragged-series) for the metrics accumulator pipeline. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.




@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 6

🧹 Nitpick comments (1)
tests/unit/common/test_accumulator_protocols.py (1)

128-130: ⚡ Quick win

Align test names with the repository convention.

Several names (for example test_to_json, test_to_csv, test_time_range) do not follow the required test_<function>_<scenario>_<expected> format.

As per coding guidelines: tests/**/*.py: Test naming convention: test_<function>_<scenario>_<expected>.

Also applies to: 155-252

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/common/test_accumulator_protocols.py` around lines 128 - 130,
Rename test functions to follow the repository convention
test_<function>_<scenario>_<expected>; specifically update
test_protocol_isinstance_check and other tests in this file (examples:
test_to_json, test_to_csv, test_time_range and tests in the 155-252 range) to
descriptive names that include the function under test, the scenario, and the
expected outcome (e.g.,
test_protocol_isinstance_with_matching_protocol_returns_true); ensure any
parametrization or references (fixtures, marks) use the new names.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 92a02970-ded5-4d52-a593-cc196d241536

📥 Commits

Reviewing files that changed from the base of the PR and between bb6f421 and 202fc65.

📒 Files selected for processing (13)
  • docs/environment-variables.md
  • src/aiperf/common/accumulator_protocols.py
  • src/aiperf/common/environment.py
  • src/aiperf/metrics/__init__.py
  • src/aiperf/metrics/_column_store_handlers.py
  • src/aiperf/metrics/column_store.py
  • src/aiperf/metrics/list_metric_aggregation.py
  • src/aiperf/metrics/metric_dicts.py
  • src/aiperf/metrics/ragged_series.py
  • tests/unit/common/test_accumulator_protocols.py
  • tests/unit/metrics/test_column_store.py
  • tests/unit/metrics/test_ragged_series.py
  • tools/ergonomics_baseline.json

Comment on lines +44 to +47
def handler(idx: int, value: Any) -> None:
    col[idx] = value
    sums[tag] = sums[tag] + value
    counts[tag] = counts[tag] + 1

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard numeric ingest against non-finite values.

This path updates running numeric aggregates directly; accepting NaN/Inf will contaminate sums and derived stats.

Proposed fix
+from aiperf.common.finite import is_finite_value
...
     def handler(idx: int, value: Any) -> None:
+        if not is_finite_value(value):
+            return
         col[idx] = value
         sums[tag] = sums[tag] + value
         counts[tag] = counts[tag] + 1

As per coding guidelines: “Numeric metric values crossing a serialization boundary or feeding a numerical algorithm must be finite or explicitly None - use aiperf.common.finite utilities”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/aiperf/metrics/_column_store_handlers.py` around lines 44 - 47, The
numeric handler function (handler) currently assigns value into col and updates
sums[tag] and counts[tag] without validating numeric finiteness; guard against
NaN/Inf by checking the value with the aiperf.common.finite utilities (e.g.,
is_finite or similar) before mutating col/sums/counts, and if the value is not
finite treat it as None/skip updating sums and counts (but still set col[idx]
appropriately if required by your data model); update the logic around col[idx]
= value, sums[tag] = sums[tag] + value, and counts[tag] = counts[tag] + 1 to
perform the finite check and only modify sums/counts when the check passes.
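The effect of the proposed guard can be shown with the standard library. This sketch uses `math.isfinite` as a stand-in for whatever helper `aiperf.common.finite` actually provides (its API is not shown in this PR), and `make_guarded_handler` is a hypothetical, simplified factory; skipping the column write for non-finite values is an assumption, not a confirmed design.

```python
import math
from typing import Callable


def make_guarded_handler(
    col: dict[int, float], stats: dict[str, float]
) -> Callable[[int, float], None]:
    """Return a setter that ignores NaN/Inf instead of folding them into stats."""

    def handler(idx: int, value: float) -> None:
        if not math.isfinite(value):
            return  # assumption: non-finite observations are skipped entirely
        col[idx] = value
        stats["sum"] += value
        stats["count"] += 1

    return handler


col: dict[int, float] = {}
stats = {"sum": 0.0, "count": 0}
handler = make_guarded_handler(col, stats)
for idx, value in enumerate([1.5, float("nan"), float("inf"), 2.5]):
    handler(idx, value)
```

Without the guard, a single NaN makes the running sum NaN for the rest of the run, since any addition involving NaN yields NaN.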

Comment on lines +86 to +93
def __init__(
    self,
    initial_capacity: int = 1024,
    *,
    list_backend_cls: type[ListMetricBackendT] | None = None,
) -> None:
    self._capacity = initial_capacity
    self._count = 0

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Require initial_capacity > 0 to avoid non-terminating growth.

With initial_capacity == 0, _grow() never progresses (0 * 2 == 0), causing an infinite loop.

Proposed fix
 def __init__(
     self,
     initial_capacity: int = 1024,
     *,
     list_backend_cls: type[ListMetricBackendT] | None = None,
 ) -> None:
+    if initial_capacity <= 0:
+        raise ValueError("initial_capacity must be > 0")
     self._capacity = initial_capacity

Also applies to: 393-402

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/aiperf/metrics/column_store.py` around lines 86 - 93, The constructor (
__init__ ) currently allows initial_capacity == 0 which makes _grow() stuck
because 0 * 2 == 0; validate initial_capacity > 0 at start of __init__ (raise
ValueError or coerce to 1) and ensure _capacity is set to at least 1 so
subsequent _grow() doubles progress; apply the same validation/fix to the other
initialization block in this module that also assigns _capacity (the other
__init__/constructor around the similar capacity setup).
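The failure mode is easy to reproduce in isolation. The functions below are hypothetical (the body of `_grow()` is not shown in this PR); `naive_grow` assumes a plain doubling loop and adds a step cap purely so the demonstration terminates.

```python
from typing import Optional


def naive_grow(capacity: int, idx: int, max_steps: int = 64) -> Optional[int]:
    """Double until idx fits; return None if the loop makes no progress."""
    steps = 0
    while capacity <= idx:
        if steps == max_steps:
            return None  # a real loop without this cap would spin forever
        capacity *= 2
        steps += 1
    return capacity


def safe_grow(capacity: int, idx: int) -> int:
    """Same doubling strategy, but start from at least 1 so growth progresses."""
    capacity = max(1, capacity)
    while capacity <= idx:
        capacity *= 2
    return capacity
```

Validating in the constructor and flooring the capacity in the grow routine are complementary; either one alone removes the zero-capacity hang in this sketch.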

Comment on lines +282 to +307
def ingest(
    self,
    idx: int,
    *,
    record_metrics: dict[str, Any],
    start_ns: float,
    end_ns: float,
    generation_start_ns: float | None,
) -> None:
    """Write a record's data to slot `idx` (= session_num).

    Grows capacity if idx >= _capacity. Dispatches metric values via cached
    per-tag setter closures — the isinstance ladder and ``_ensure_*_column``
    lookups run only on the first record per tag. Profiling at 50k records
    shows this hoists ~30% of ingest wall time vs the per-record dispatch.
    """
    if idx >= self._capacity:
        self._grow(idx)

    if idx >= self._count:
        self._count = idx + 1

    self.start_ns[idx] = start_ns
    self.end_ns[idx] = end_ns
    if generation_start_ns is not None:
        self.generation_start_ns[idx] = generation_start_ns

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Reject negative record indices in write APIs.

idx < 0 currently writes via Python negative indexing, which can silently overwrite the last slots and corrupt stored records/metadata.

Proposed fix
 def ingest(
     self,
     idx: int,
@@
 ) -> None:
+    if idx < 0:
+        raise ValueError("idx must be >= 0")
     if idx >= self._capacity:
         self._grow(idx)
@@
 def ingest_metadata(
     self,
     idx: int,
@@
 ) -> None:
+    if idx < 0:
+        raise ValueError("idx must be >= 0")
     if idx >= self._capacity:
         self._grow(idx)

Also applies to: 339-363

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/aiperf/metrics/column_store.py` around lines 282 - 307, The ingest method
currently allows negative idx values which trigger Python negative indexing and
can corrupt data; add a defensive check at the top of ingest (before _grow) to
raise a ValueError if idx < 0, e.g. "if idx < 0: raise ValueError(...)", and
apply the same guard to the other write API in this file (the analogous method
around lines 339-363) so both methods validate idx, avoiding silent
negative-index writes; update/add tests to assert that negative idx raises.
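A small demonstration of why the guard matters: Python sequences (and NumPy arrays) treat a negative index as counting from the end, so an unchecked write lands in a valid slot instead of failing. `write_unchecked` and `write_checked` are hypothetical helpers operating on a plain list.

```python
def write_unchecked(slots: list[int], idx: int, value: int) -> None:
    slots[idx] = value  # idx == -1 silently targets the last slot


def write_checked(slots: list[int], idx: int, value: int) -> None:
    if idx < 0:
        raise ValueError("idx must be >= 0")
    slots[idx] = value
```

With four slots, `write_unchecked(slots, -1, 9)` overwrites slot 3 without any error, which is the silent corruption the review describes.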

Comment on lines +30 to +32
def __init__(
    self, initial_capacity: int = 1024, offsets_capacity: int = 256
) -> None:

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Validate capacities to prevent infinite growth loops.

initial_capacity/offsets_capacity can be 0, which makes _grow_offsets() loop forever (new_cap *= 2 stays 0).

Proposed fix
 def __init__(
     self, initial_capacity: int = 1024, offsets_capacity: int = 256
 ) -> None:
+    if initial_capacity <= 0:
+        raise ValueError("initial_capacity must be > 0")
+    if offsets_capacity <= 0:
+        raise ValueError("offsets_capacity must be > 0")
     self._values = GrowableArray(
         initial_capacity=initial_capacity, dtype=np.float64
     )

Also applies to: 99-104

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/aiperf/metrics/ragged_series.py` around lines 30 - 32, The constructor
currently allows initial_capacity or offsets_capacity to be 0 which causes
_grow_offsets() (and the similar grow routine around lines 99-104) to loop
forever because new_cap stays 0; fix by validating inputs in __init__ (e.g.,
raise ValueError if initial_capacity <= 0 or offsets_capacity <= 0) and harden
the grow logic in _grow_offsets() and the other grow method to ensure growth
always increases capacity (compute new_cap = max(1, old_cap * 2) or if old_cap
== 0 set new_cap = 1) so a zero capacity cannot produce an infinite loop.

Comment on lines +58 to +66
def extend(self, idx: int, values: list[float]) -> None:
    """Append values for session_num ``idx``."""
    n = len(values)
    if n == 0:
        return
    if idx >= self._offsets_capacity:
        self._grow_offsets(idx)
    self._offsets[idx] = len(self._values)
    val_arr = np.asarray(values, dtype=np.float64)

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Reject negative record indices.

idx < 0 currently writes through negative indexing (_offsets[-1]), which can silently corrupt data layout.

Proposed fix
 def extend(self, idx: int, values: list[float]) -> None:
     """Append values for session_num ``idx``."""
+    if idx < 0:
+        raise ValueError("idx must be >= 0")
     n = len(values)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/aiperf/metrics/ragged_series.py` around lines 58 - 66, The extend method
currently allows negative idx which causes negative-index writes into _offsets;
add an explicit check at the top of extend (in the extend(self, idx: int,
values: list[float]) method) to raise a ValueError if idx < 0 before calling
_grow_offsets or writing to self._offsets, so negative indices are rejected and
no silent corruption of _offsets/_values can occur.

Comment on lines +63 to +64
    required_accumulators: ClassVar[set[str]] = set()
    summary_dependencies: ClassVar[list[str]] = []

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use AccumulatorType-typed keys in protocol contract tests.

These tests currently exercise SummaryContext.accumulators and analyzer dependency fields with str values, which weakens coverage of the declared AccumulatorType contract and can miss contract regressions.

Also applies to: 201-203

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/common/test_accumulator_protocols.py` around lines 63 - 64,
Replace string-typed keys with AccumulatorType instances in the protocol
contract tests: update required_accumulators, summary_dependencies, and any
usages asserting SummaryContext.accumulators to use values typed/constructed as
AccumulatorType rather than plain str so tests validate the actual contract;
search for occurrences of required_accumulators and summary_dependencies in the
test file (including lines ~201-203) and change their elements to
AccumulatorType values consistent with the production enum/type, and adjust any
assertions that compare keys to expect AccumulatorType instances.

    def sum(self) -> MetricValueTypeVarT:
        """Return the accumulated sum of all observed values."""

    def __len__(self) -> int:

Adding __len__ to the protocol while MetricAggregator is kept as a back-compat alias breaks existing custom aggregators that only implement the old sum/to_result runtime contract. Fix: keep MetricAggregator on the old protocol shape or avoid the stricter protocol for legacy isinstance checks.
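The compatibility concern follows from how `typing.runtime_checkable` works: `isinstance` checks that every protocol member is present on the object. The sketch below uses hypothetical names (`LegacySeries`, `SizedSeries`, `OldAggregator`, `NewAggregator`) and simplified signatures; only the member names `sum`, `to_result`, and `__len__` come from the comment above.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class LegacySeries(Protocol):
    """Stand-in for the old contract: sum and to_result only."""

    def sum(self) -> float: ...
    def to_result(self) -> dict: ...


@runtime_checkable
class SizedSeries(LegacySeries, Protocol):
    """Stand-in for the stricter contract that also requires __len__."""

    def __len__(self) -> int: ...


class OldAggregator:
    """A custom aggregator written against the old contract."""

    def sum(self) -> float:
        return 0.0

    def to_result(self) -> dict:
        return {}


class NewAggregator(OldAggregator):
    def __len__(self) -> int:
        return 0


legacy_ok = isinstance(OldAggregator(), LegacySeries)
strict_ok = isinstance(OldAggregator(), SizedSeries)
```

Here `legacy_ok` is true and `strict_ok` is false, so an alias that points the old name at the stricter protocol would start rejecting `OldAggregator` in any existing `isinstance` check.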

        return np.zeros(0, dtype=np.bool_)
    rec_start = self.start_ns[: self._count]
    rec_end = self.end_ns[: self._count]
    return (rec_start <= end_ns) & (rec_end >= start_ns)

query_time_range violates the new protocol's [start_ns, end_ns) contract by using inclusive overlap semantics, which can double-count records across adjacent windows. Fix: align the mask with half-open membership semantics or update the protocol and all callers to explicitly require overlap semantics.
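The double-counting can be seen with one record and two adjacent windows. This is a sketch under an assumption: the half-open variant below assigns a record to the window containing its start time, which is only one possible reading of a `[start_ns, end_ns)` contract; the protocol's actual anchor point is not stated here. Both function names are hypothetical.

```python
Interval = tuple[int, int]


def overlaps_inclusive(rec: Interval, window: Interval) -> bool:
    """Inclusive overlap, mirroring (rec_start <= end) & (rec_end >= start)."""
    rec_start, rec_end = rec
    win_start, win_end = window
    return rec_start <= win_end and rec_end >= win_start


def starts_in_window(rec: Interval, window: Interval) -> bool:
    """Half-open membership by record start: win_start <= rec_start < win_end."""
    rec_start, _ = rec
    win_start, win_end = window
    return win_start <= rec_start < win_end


record = (5, 10)
windows = [(0, 10), (10, 20)]
inclusive_hits = sum(overlaps_inclusive(record, w) for w in windows)
half_open_hits = sum(starts_in_window(record, w) for w in windows)
```

The record ending exactly at 10 matches both windows under inclusive overlap (`inclusive_hits` is 2) but only the first under half-open membership (`half_open_hits` is 1).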


codecov Bot commented May 20, 2026

Codecov Report

❌ Patch coverage is 87.84810% with 48 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/aiperf/metrics/column_store.py | 81.60% | 26 Missing and 20 partials ⚠️ |
| src/aiperf/metrics/list_metric_aggregation.py | 66.66% | 2 Missing ⚠️ |


@ajcasagrande ajcasagrande added the AgentX (Feature for AgentX) label May 22, 2026

Labels

AgentX Feature for AgentX feat


2 participants