feat: add storage slot lifecycle model #235
Open
weiihann wants to merge 10 commits into ethpandaops:master
Add int_storage_slot_lifecycle incremental model that tracks per-slot, per-lifecycle metrics: birth/death blocks, touch count, effective bytes, and touch-to-touch interval statistics (count, sum, max). Uses an arrayFold state machine to process events from int_storage_slot_diff_by_address_slot (births/deaths) and int_storage_slot_next_touch (all touches). A helper table caches current lifecycle state per slot for efficient cross-batch lookups.

- Migration 076: int_storage_slot_lifecycle + helper tables
- Transformation: dual-INSERT SQL with arrayFold state machine
- Tests: 10 structural invariant assertions for pectra spec
- Proto: reference schema for API generation (run make proto)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

refactor: replace helper tables with boundary table in migration 076

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat: add lifecycle boundary detection model

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

refactor: replace arrayFold with window functions in lifecycle model

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat: add boundary table test assertions and proto schema

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

test: add death bytes and uniqueness assertions for boundary table

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

refactor: replace FINAL with argMax dedup in lifecycle models

FINAL forces sort-merge deduplication at query time on every batch. Replace with GROUP BY + argMax(col, updated_date_time) pattern already used elsewhere in the codebase (e.g. int_storage_slot_reactivation_6m).

- boundary: subquery argMax with post-dedup birth/death filter
- lifecycle batch_touches: GROUP BY on key columns only
- lifecycle batch_diffs: flat argMax for effective_bytes_to

Validated against 1M blocks (22M-23M): 15/16 assertions pass, 1 pre-existing boundary edge case at block 23M (15/120M rows).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

done

Update CLAUDE.md: require field-level COMMENTs on all table DDLs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add field-level COMMENTs to lifecycle migration tables

Migration 076 had table-level COMMENTs but was missing inline field COMMENTs, unlike migration 037 which has them on every column. Add COMMENTs to all fields in both lifecycle_boundary and lifecycle tables. Add Migration Conventions section to CLAUDE.md documenting the requirement.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

revert

remove
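The `refactor: replace FINAL with argMax dedup in lifecycle models` commit above describes swapping FINAL for GROUP BY + argMax(col, updated_date_time). As a rough illustration of what that pattern computes, here is a minimal Python sketch. It is not the PR's SQL: the row layout, the choice of key columns (`address`, `slot_key`, `block_number`), and the helper name `argmax_dedup` are assumptions made for the example.

```python
from typing import Any, Dict, Iterable, Sequence, Tuple


def argmax_dedup(
    rows: Iterable[Dict[str, Any]],
    key_cols: Sequence[str],
    value_col: str,
    version_col: str = "updated_date_time",
) -> Dict[Tuple[Any, ...], Any]:
    """Emulate `SELECT keys, argMax(value_col, version_col) ... GROUP BY keys`.

    For each key, keep value_col from the row with the greatest version_col.
    On ties this sketch keeps the first row it saw.
    """
    latest: Dict[Tuple[Any, ...], Dict[str, Any]] = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        kept = latest.get(key)
        if kept is None or row[version_col] > kept[version_col]:
            latest[key] = row
    return {key: row[value_col] for key, row in latest.items()}


# Hypothetical rows: the first two are two versions of the same logical row.
rows = [
    {"address": "0xabc", "slot_key": "0x01", "block_number": 100,
     "effective_bytes_to": 32, "updated_date_time": 1},
    {"address": "0xabc", "slot_key": "0x01", "block_number": 100,
     "effective_bytes_to": 0, "updated_date_time": 2},
    {"address": "0xabc", "slot_key": "0x02", "block_number": 100,
     "effective_bytes_to": 7, "updated_date_time": 1},
]

deduped = argmax_dedup(
    rows, ("address", "slot_key", "block_number"), "effective_bytes_to"
)
# deduped maps ("0xabc", "0x01", 100) to 0 and ("0xabc", "0x02", 100) to 7
```

In this example, filtering for `effective_bytes_to > 0` before deduplicating would let the stale version of the first row through, which is presumably why the commit applies the birth/death filter after the dedup step.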
* master: (34 commits)
  fix(detect_impacted_models.sh): change output from "all" to "none" for non-existing tests directory to clarify behavior
  fix(workflow): reduce concurrency from 15 to 10 in mainnet tests to enhance stability and performance
  fix(workflows): increase test concurrency from 5 to 15 for better performance during mainnet tests
  chore(models): add .gitkeep file to models directory for maintaining version control
  refactor: removed dupe sepolia tests
  refactor(database): enhance isCBTTemplateReady function to improve robustness by checking for database existence and migration status
  refactor(engine): simplify the buildTestOverrides function by auto-generating sensible defaults from model cache and applying YAML overrides
  fix(overrides): update the testing overrides file to reflect auto-generated defaults and remove unnecessary hardcoded entries
  test(models): modify assertions in test YAML files to check for duplicate values instead of null conditions
  feat(database): enhance ValidateExternalData function to check for optional tables during parquet loading and improve validation logic
  fix(tests): enhance SQL tests with better validation checks and improved naming for readability and acceptance criteria
  refactor(tests): standardize assertions and SQL queries across multiple models for consistency and maintainability
  refactor: ensure boundaries on tests are gooch
  feat(tests): add dynamic resolution for column references in typed checks to enhance flexibility and accuracy in assertions
  fix(tests): reduce CBTConcurrency to limit contention under concurrent load for better performance during testing
  chore(engine): improve default overrides for models not in overrides file to optimize testing environment and execution
  fix(engine): reset pending timers for retried models to avoid premature timeout during retries, enhancing reliability in transformation processing
  chore(tests): remove deprecated YAML test model fct_block_blob_first_seen_by_node to clean up test suite
  feat: optimize model skill (ethpandaops#234)
  refactor: drop spec from tests
  chore(sql): remove unnecessary test comments from int_block_receipt_size.sql and int_transaction_receipt_size.sql to clean up the code
  chore(detect_impacted_models.sh): exclude CI and documentation files from impacting model detection logic to improve accuracy of the impacted models detection
  feat(workflows): add test-fusaka-mainnet workflow for detecting and testing impacted models in mainnet environment
  chore(int_transaction_receipt_size.sql): add test comment for clarity and future reference
  chore(detect_impacted_models.sh): update shebang to use env for better portability
  fix(detect_impacted_models.sh): adjust REPO_ROOT path to correctly reference the repository root directory
  chore(int_block_receipt_size.sql): add comment for testing purposes to clarify the intent of the code
  refactor(workflows): restructure GitHub Actions workflows to separate model detection and testing steps for clarity and efficiency
  chore(ci): update GitHub Actions workflows to use self-hosted runners for improved performance and customization
  chore(ci): specify paths for golangci-lint workflow to limit triggered events only to relevant files
  ...
…ifecycle tables and their structure
test(tests): enhance lifecycle SQL tests to verify data integrity and correctness for storage slot lifecycle models
* master: refactor(tests): rename extractModelNames to extractCloneTableNames for clarity and include helper tables in extraction logic
…mprove reliability and resource management during execution.
…or managing lifecycle metrics
feat(proto): implement List and Get requests for int_storage_slot_lifecycle with pagination and filters
chore(proto): improve comments for clarity and update field descriptions in int_storage_slot_lifecycle.proto
docs(proto): provide more detailed comments for lifecycle transitions and metrics in proto definition files
feat(proto): add IntStorageSlotLifecycleBoundary messages to manage lifecycle boundaries with filtering options
docs(proto): improve documentation for proto messages with clearer descriptions on fields and request responses
* master: chore(detect_impacted_models.sh): update ignore patterns to include migrations, proto, dependency, and build tooling files for improved impact detection
…fecycle models

Move birth_block filter from HAVING to WHERE in boundaries CTE for better predicate pushdown, and add birth_block bounds filtering to prev_stats and prev_state self-queries to ensure lookups are properly bounded.
d5cf58e to 12133e5
* master: fix(detect_impacted_models.sh): update condition to check CHANGED_MODELS_COUNT for clarity and reliability in detecting impacted models
Add storage slot lifecycle models that track birth (0→non-zero effective bytes) and death (non-zero→0) transitions per slot, with reincarnation tracking via lifecycle_number.

- Migration 076: adds int_storage_slot_lifecycle_boundary (birth/death detection) and int_storage_slot_lifecycle (per-lifecycle touch statistics and interval metrics). Includes table-level and field-level COMMENTs matching migration 037 conventions.
- Boundary model: detects births and deaths from int_storage_slot_diff_by_address_slot, assigns lifecycle numbers via cumulative birth count using window functions, and self-references for cross-batch continuity.
- Lifecycle model: computes touch counts and touch-to-touch interval statistics (count/sum/max) per lifecycle.
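
To make the birth/death and lifecycle_number semantics described above concrete, here is a minimal Python sketch for a single slot. It is an illustration under stated assumptions, not the PR's SQL: the event tuple layout, the `Lifecycle` class, the function name `build_lifecycles`, and the `prev_lifecycle_number` parameter (standing in for the cross-batch self-reference) are all hypothetical. It also assumes the birth and death events count as touches, and it does not carry a lifecycle that is still open from a previous batch.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Lifecycle:
    lifecycle_number: int
    birth_block: int
    death_block: Optional[int] = None
    touch_count: int = 0
    interval_count: int = 0
    interval_sum: int = 0
    interval_max: int = 0
    last_touch_block: Optional[int] = None


def build_lifecycles(
    events: Iterable[Tuple[int, int, int]],
    prev_lifecycle_number: int = 0,
) -> List[Lifecycle]:
    """Build lifecycles for ONE slot.

    events: (block_number, bytes_from, bytes_to) tuples sorted by block.
    prev_lifecycle_number: births already counted in earlier batches
    (hypothetical stand-in for the cross-batch lookup).
    """
    lifecycles: List[Lifecycle] = []
    current: Optional[Lifecycle] = None
    births_seen = prev_lifecycle_number

    for block, bytes_from, bytes_to in events:
        # Birth: effective bytes go from 0 to non-zero.
        if bytes_from == 0 and bytes_to > 0:
            births_seen += 1  # cumulative birth count = lifecycle_number
            current = Lifecycle(births_seen, birth_block=block)
            lifecycles.append(current)

        if current is None:
            continue  # touch outside any lifecycle; ignored in this sketch

        # Touch-to-touch interval, measured within the current lifecycle.
        if current.last_touch_block is not None:
            gap = block - current.last_touch_block
            current.interval_count += 1
            current.interval_sum += gap
            current.interval_max = max(current.interval_max, gap)
        current.touch_count += 1
        current.last_touch_block = block

        # Death: effective bytes go from non-zero to 0.
        if bytes_from > 0 and bytes_to == 0:
            current.death_block = block
            current = None

    return lifecycles


# Hypothetical slot history: born at 10, touched at 12, cleared at 20,
# then reborn at 25 and touched again at 40.
events = [(10, 0, 32), (12, 32, 32), (20, 32, 0), (25, 0, 5), (40, 5, 7)]
first, second = build_lifecycles(events)
```

With this input, `first` has lifecycle_number 1, three touches, and intervals of 2 and 8 blocks (count 2, sum 10, max 8) before dying at block 20. `second` has lifecycle_number 2, is still alive, and has a single 15-block interval.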