test: source-level query regressions to verify benchmark detection #740
Closed
BrianWhitneyAI wants to merge 2 commits into
BFF Query Benchmark Results
Wide schema results (p50 ms)
p95 timings (narrow schema)
Summary: 27 regressions (≥25% slower)
Benchmarks run in headless Chromium with DuckDB-WASM. Each query: 1 warm-up + 10 timed iterations. Flags: |
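The measurement protocol described above (1 warm-up, 10 timed iterations, p50/p95 reporting) could be sketched roughly as follows. This is a minimal illustration, not the harness from PR #739: `percentile` and `timeQuery` are hypothetical names, the nearest-rank percentile method is an assumption, and the callback is synchronous for brevity, whereas real DuckDB-WASM queries would be awaited.

```typescript
// Hypothetical helpers; the actual benchmark harness is not shown in this thread.

/** Nearest-rank percentile (assumed method) over timing samples, p in (0, 100]. */
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

/** Runs `fn` untimed `warmups` times, then records `iterations` timed runs. */
function timeQuery(
  fn: () => void,
  warmups: number = 1,
  iterations: number = 10,
  now: () => number = Date.now, // injectable clock, useful for testing
): number[] {
  for (let i = 0; i < warmups; i++) fn(); // warm-up runs are discarded
  const timings: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const start = now();
    fn();
    timings.push(now() - start);
  }
  return timings;
}
```

With 10 samples, the nearest-rank p50 is the 5th-smallest timing and p95 is the largest, so a single slow iteration shows up in p95 but not in p50.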
Three realistic changes a developer might plausibly commit, each targeting a different benchmark query:

1. SQLBuilder.regexMatchValueInList: wrap column in LOWER() for case-insensitive matching. Forces a per-row function call on every scanned row. Affects: text_search, multi_column_filter.
2. buildDistinctValuesSQL: add ORDER BY 1 so dropdown values come back sorted. Adds a sort pass after hash-distinct. Affects: distinct_values.
3. buildFetchAnnotationsSQL: add ORDER BY column_name for predictable schema column ordering. Adds a sort pass on the result set. Affects: fetch_annotations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add two more expensive query patterns to the slow-test branch:

- COUNT(DISTINCT hidden_bff_uid) instead of COUNT(*): forces full hash aggregation over all rows, significantly slower at large scales
- MD5(CAST(hidden_bff_uid AS VARCHAR)) as secondary sort key in buildGetFilesSQL: forces MD5 computation per row for every paginated query, impacting sort_and_paginate, filter_by_size, multi_column_filter, and text_search

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
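To make the query-shape changes in these commits concrete, here is a rough before/after sketch for three of them. The function names, signatures, and overall SQL shape are assumptions for illustration only; the real `build*SQL` functions in `packages/core/` are not shown in this thread and may differ. Only the `COUNT(DISTINCT hidden_bff_uid)`, `ORDER BY 1`, and `MD5(CAST(hidden_bff_uid AS VARCHAR))` fragments come from the commit messages.

```typescript
// Hypothetical illustration; not the actual packages/core builders.
interface QueryVariants {
  baseline: string; // assumed shape before the change
  slowed: string;   // assumed shape after the change
}

/** Double-quotes an identifier, escaping embedded quotes. */
function quoteIdent(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

/** COUNT(*) vs. COUNT(DISTINCT hidden_bff_uid). */
function countVariants(table: string): QueryVariants {
  const t = quoteIdent(table);
  return {
    baseline: `SELECT COUNT(*) FROM ${t}`,
    slowed: `SELECT COUNT(DISTINCT hidden_bff_uid) FROM ${t}`,
  };
}

/** DISTINCT without and with the added ORDER BY 1. */
function distinctValuesVariants(table: string, column: string): QueryVariants {
  const base = `SELECT DISTINCT ${quoteIdent(column)} FROM ${quoteIdent(table)}`;
  return { baseline: base, slowed: `${base} ORDER BY 1` };
}

/** Paginated sort without and with the MD5 secondary sort key. */
function filesVariants(table: string, sortColumn: string, limit: number): QueryVariants {
  const t = quoteIdent(table);
  const c = quoteIdent(sortColumn);
  return {
    baseline: `SELECT * FROM ${t} ORDER BY ${c} LIMIT ${limit}`,
    slowed: `SELECT * FROM ${t} ORDER BY ${c}, MD5(CAST(hidden_bff_uid AS VARCHAR)) LIMIT ${limit}`,
  };
}
```

In each pair the result set is logically equivalent or nearly so, which is what makes these plausible accidental regressions: the added work is in the sort or aggregation, not in what the query returns.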
Summary
Regression test PR for the benchmark system (PR #739). Introduces five realistic source-level changes, the kind a developer might plausibly commit, to verify the benchmark detects them automatically against parquet-backed views at realistic data scales.

Because `queries.ts` calls the same `build*SQL` functions as the app, no changes to the benchmark code were needed. All regressions are picked up purely from changes to `packages/core/`.

Changes (all in `packages/core/`)

1. `SQLBuilder.regexMatchValueInList`: wrap column in `LOWER()` for case-insensitive matching
   Forces a per-row `LOWER()` call on every scanned row. Affects: `text_search`, `multi_column_filter`.
2. `buildDistinctValuesSQL`: add `ORDER BY 1` for sorted dropdown values
   Adds a sort pass after hash-distinct. Affects: `distinct_values` (especially wide schema with high-cardinality columns).
3. `buildFetchAnnotationsSQL`: add `ORDER BY column_name` for predictable schema ordering
   Adds a sort pass on the schema introspection result. Affects: `fetch_annotations`.
4. `buildGetCountSQL`: use `COUNT(DISTINCT hidden_bff_uid)` instead of `COUNT(*)`
   Forces full hash aggregation over all rows instead of a simple counter. Affects: `count_all`. Regression scales dramatically with row count. At 10M rows: 6ms → 1001ms (+16,000%).
5. `buildGetFilesSQL`: add `MD5(CAST(hidden_bff_uid AS VARCHAR))` as secondary sort key
   Forces MD5 computation for every row before the top-N can be returned. Affects: `sort_and_paginate`. At 10M rows: 2662ms → 4451ms (+67%).

Observed results (vs PR #739, latest run)

[Results table: values not recoverable from this extract. Rows: `count_all` (3 rows), `sort_and_paginate` (3 rows), `distinct_values` (wide), `text_search`, `multi_column_filter`, `fetch_annotations`, `filter_by_size`.]

Cloud queries (100k and 1M rows over HTTP) show the same regression pattern, confirming the HTTP range-request code path is also covered.
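For reference, the percentage deltas and the ≥25% regression flag used in this thread can be reproduced with a small calculation like the one below. This is a sketch under assumptions: `percentChange` and `isRegression` are hypothetical names, and the benchmark's actual comparison logic (in PR #739) is not shown here. The inputs in the usage note are the rounded 10M-row timings quoted in this description.

```typescript
// Hypothetical helpers; the real comparison code lives in the benchmark PR.

/** Relative slowdown of candidate vs. baseline, in percent. */
function percentChange(baselineMs: number, candidateMs: number): number {
  if (baselineMs <= 0) throw new Error("baseline must be positive");
  return ((candidateMs - baselineMs) / baselineMs) * 100;
}

/** True when the candidate is at least `thresholdPct` percent slower. */
function isRegression(
  baselineMs: number,
  candidateMs: number,
  thresholdPct: number = 25, // the "≥25% slower" cutoff from the benchmark summary
): boolean {
  return percentChange(baselineMs, candidateMs) >= thresholdPct;
}
```

For example, `percentChange(2662, 4451)` comes to about 67, matching the `sort_and_paginate` figure, and `isRegression(2662, 4451)` is `true`. Applied to the rounded `count_all` timings, `percentChange(6, 1001)` gives a value on the order of 16,000 to 17,000; the exact percentage is sensitive to rounding of the small baseline.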
Test plan
- `count_all` shows massive regression from `COUNT(DISTINCT)` at all scales
- `sort_and_paginate` shows large regression from MD5 secondary sort at all scales
- `text_search` and `multi_column_filter` show consistent ~15-19% regression from `LOWER()` overhead
- `distinct_values` wide schema shows large regression from added sort pass on high-cardinality data
- `fetch_annotations` shows modest regression from added sort pass
- `filter_by_size` shows near-zero delta (correctly unaffected)

🤖 Generated with Claude Code