[ENG-1296] set cap on expired orders processed per block #3243

anmolagrawal345 · 2025-11-05T19:36:30Z

Changelist

This change builds off the same pattern for setting caps on deleveraging and liquidations per block. We need to cap the expired orders removal per block to prevent the situation in which the end blocker is stuck removing a large number of orders (a potential attack vector).

Test Plan

[Describe how this PR was tested (if applicable)]

Author/Reviewer Checklist

If this PR has changes that result in a different app state given the same prior state and transaction list, manually add the state-breaking label.
If the PR has breaking postgres changes to the indexer add the indexer-postgres-breaking label.
If this PR isn't state-breaking but has changes that modify behavior in PrepareProposal or ProcessProposal, manually add the label proposal-breaking.
If this PR is one of many that implement a specific feature, manually label them all feature:[feature-name].
If you wish for mergify-bot to automatically create a PR to backport your change to a release branch, manually add the label backport/[branch-name].
Manually add any of the following labels: refactor, chore, bug.

Summary by CodeRabbit

New Features
- Added block limits configuration system to control the rate of expired stateful order removal per block.
- Added new gRPC query endpoint and CLI command to retrieve block limits configuration.
- Added governance message to update block limits configuration.
Improvements
- Enhanced telemetry to track the count of expired orders removed during block processing.

linear · 2025-11-05T19:36:33Z

ENG-1296 Cap the number of endBlocker order removals

coderabbitai · 2025-11-05T19:36:44Z

Walkthrough

This PR implements a per-block cap mechanism for removing expired stateful orders in the CLOB module. It introduces a BlockLimitsConfig message with a MaxStatefulOrderRemovalsPerBlock parameter, modifies the RemoveExpiredStatefulOrders keeper method to enforce this cap with batching across blocks, adds telemetry tracking, and exposes query and update endpoints for governance-mediated configuration changes.
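The batching behavior described above can be illustrated with a minimal sketch. It assumes an in-memory queue in place of the real state store, and `removeExpired` is a hypothetical helper, not the keeper method from this PR. It only shows how a cap of N spreads a backlog across several blocks, with 0 meaning "no cap".

```go
package main

import "fmt"

// removeExpired is a hypothetical, simplified stand-in for the keeper method.
// It takes up to maxRemovals IDs from the front of pending and returns the
// removed IDs together with the IDs left for later blocks.
// A maxRemovals of 0 means "no cap": everything pending is removed.
func removeExpired(pending []uint32, maxRemovals uint32) (removed, remaining []uint32) {
	n := len(pending)
	if maxRemovals != 0 && uint64(maxRemovals) < uint64(n) {
		n = int(maxRemovals)
	}
	return pending[:n], pending[n:]
}

func main() {
	// Five expired orders with a cap of two per block.
	pending := []uint32{1, 2, 3, 4, 5}
	for block := 1; len(pending) > 0; block++ {
		var removed []uint32
		removed, pending = removeExpired(pending, 2)
		fmt.Printf("block %d removed %v\n", block, removed)
	}
}
```

With five pending orders and a cap of two, the backlog drains over three blocks instead of one.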

Changes

Cohort / File(s)	Change Summary
Proto definitions `proto/dydxprotocol/clob/block_limits_config.proto`, `proto/dydxprotocol/clob/query.proto`, `proto/dydxprotocol/clob/tx.proto`	New `BlockLimitsConfig` message with `max_stateful_order_removals_per_block` field; new `QueryBlockLimitsConfigurationRequest/Response` messages and RPC; new `MsgUpdateBlockLimitsConfig/Response` messages and RPC for governance updates
TypeScript codegen – clob module `indexer/packages/v4-protos/src/codegen/dydxprotocol/clob/block_limits_config.ts`, `query.lcd.ts`, `query.rpc.Query.ts`, `query.ts`, `tx.rpc.msg.ts`, `tx.ts`	Generated TypeScript interfaces, encode/decode logic, and client methods for `BlockLimitsConfig`, `QueryBlockLimitsConfiguration`, and `MsgUpdateBlockLimitsConfig`
TypeScript codegen – bundle reorganization `indexer/packages/v4-protos/src/codegen/dydxprotocol/bundle.ts`, `gogoproto/bundle.ts`, `google/bundle.ts`	Re-mapped internal module alias spreads throughout dydxprotocol namespaces and nested modules; updated vest and ClientFactory exports
Go keeper – block limits config `protocol/x/clob/keeper/block_limits_config.go`, `grpc_query_block_limits_configuration.go`, `msg_server_update_block_limits_config.go`	New keeper methods `GetBlockLimitsConfig`, `UpdateBlockLimitsConfig`, and gRPC query handler; new message server handler for governance updates
Go keeper – stateful order removal with cap `protocol/x/clob/keeper/stateful_order_state.go`	Modified `RemoveExpiredStatefulOrders` to enforce per-block cap via `MaxStatefulOrderRemovalsPerBlock`; adds early-break logic for batching across blocks
Go types and interfaces `protocol/x/clob/types/block_limits_config_keeper.go`, `clob_keeper.go`, `keys.go`	New `BlockLimitsConfigKeeper` interface, `BlockLimitsConfig_Default` constant, `BlockLimitsConfigKey` state key; updated `ClobKeeper` to embed `BlockLimitsConfigKeeper`
CLI and query `protocol/x/clob/client/cli/query.go`, `query_block_limits_configuration.go`	New CLI command `CmdGetBlockLimitsConfiguration` integrated into query command set
EndBlocker telemetry `protocol/x/clob/abci.go`	Added telemetry gauge recording count of expired orders removed
Mocks and test support `protocol/mocks/ClobKeeper.go`, `QueryClient.go`	Added mock methods `GetBlockLimitsConfig`, `UpdateBlockLimitsConfig` on ClobKeeper; added `BlockLimitsConfiguration` mock on QueryClient; replaced `GetLeverage` mock; updated related liquidation/leverage mock methods
Tests `protocol/x/clob/keeper/stateful_order_state_test.go`, `grpc_query_block_limits_configuration_test.go`	Added comprehensive tests for cap-aware removal batching; added gRPC query handler test with success and nil-request error cases
App-level message registration `protocol/app/msgs/all_msgs.go`, `internal_msgs.go`, `internal_msgs_test.go`	Registered `MsgUpdateBlockLimitsConfig` and response in all-messages and internal-messages maps; updated test expectations
Ante handler `protocol/lib/ante/internal_msg.go`	Added `clob.MsgUpdateBlockLimitsConfig` to internal message type recognition
Module test `protocol/x/clob/module_test.go`	Updated expected interface registration count and query command set to reflect new block-limits-config command
GitHub Actions `.github/workflows/protocol-build-and-push.yml`	Added branch trigger for `anmol/eng-1296`

Sequence Diagram(s)

sequenceDiagram
    Actor ->> EndBlocker: Trigger
    EndBlocker ->> RemoveExpiredStatefulOrders: Remove expired orders
    Note over RemoveExpiredStatefulOrders: Check MaxStatefulOrderRemovalsPerBlock cap
    
    alt Cap enabled and reached
        RemoveExpiredStatefulOrders -->> RemoveExpiredStatefulOrders: Break early, return partial batch
        RemoveExpiredStatefulOrders ->> EndBlocker: Return expired IDs (up to cap)
    else Cap disabled or not reached
        RemoveExpiredStatefulOrders ->> RemoveExpiredStatefulOrders: Continue processing all
        RemoveExpiredStatefulOrders ->> EndBlocker: Return all expired IDs
    end
    
    EndBlocker ->> Telemetry: Record gauge (expired order count)
    EndBlocker ->> EndBlocker: Emit event

sequenceDiagram
    Client ->> CLI: query clob block-limits-config
    CLI ->> QueryClient: BlockLimitsConfiguration(request)
    QueryClient ->> gRPC: Call BlockLimitsConfiguration
    gRPC ->> ClobKeeper: BlockLimitsConfiguration(ctx, req)
    ClobKeeper ->> StateStore: GetBlockLimitsConfig
    StateStore -->> ClobKeeper: BlockLimitsConfig
    ClobKeeper -->> gRPC: QueryBlockLimitsConfigurationResponse
    gRPC -->> QueryClient: Response
    QueryClient -->> CLI: Response
    CLI -->> Client: Display config
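The query path in the diagram above can be sketched as follows. This is a toy version that assumes plain Go structs in place of the generated protobuf types and a plain `error` in place of a gRPC status. The names `queryRequest`, `queryResponse`, and `errInvalidRequest` are hypothetical. It shows the two cases the handler test is described as covering: a nil request and a successful lookup.

```go
package main

import (
	"errors"
	"fmt"
)

// errInvalidRequest is a hypothetical stand-in for the error returned on a nil request.
var errInvalidRequest = errors.New("invalid request")

// blockLimitsConfig mirrors the single field described in this PR.
type blockLimitsConfig struct {
	MaxStatefulOrderRemovalsPerBlock uint32
}

// queryRequest and queryResponse are simplified stand-ins for the generated messages.
type queryRequest struct{}

type queryResponse struct {
	BlockLimitsConfig blockLimitsConfig
}

// keeper is a toy keeper that holds the config in memory instead of a state store.
type keeper struct {
	config blockLimitsConfig
}

// blockLimitsConfiguration rejects a nil request and otherwise returns the stored config.
func (k keeper) blockLimitsConfiguration(req *queryRequest) (*queryResponse, error) {
	if req == nil {
		return nil, errInvalidRequest
	}
	return &queryResponse{BlockLimitsConfig: k.config}, nil
}

func main() {
	k := keeper{config: blockLimitsConfig{MaxStatefulOrderRemovalsPerBlock: 100}}

	if _, err := k.blockLimitsConfiguration(nil); err != nil {
		fmt.Println("nil request:", err)
	}
	if resp, err := k.blockLimitsConfiguration(&queryRequest{}); err == nil {
		fmt.Println("cap:", resp.BlockLimitsConfig.MaxStatefulOrderRemovalsPerBlock)
	}
}
```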

sequenceDiagram
    Governance ->> MsgServer: UpdateBlockLimitsConfig(authority, config)
    MsgServer ->> Authority: Verify HasAuthority
    
    alt Authority valid
        Authority -->> MsgServer: OK
        MsgServer ->> ClobKeeper: UpdateBlockLimitsConfig(ctx, config)
        ClobKeeper ->> StateStore: setBlockLimitsConfig
        StateStore -->> ClobKeeper: Stored
        ClobKeeper -->> MsgServer: Success
        MsgServer -->> Governance: MsgUpdateBlockLimitsConfigResponse
    else Authority invalid
        Authority -->> MsgServer: ErrInvalidSigner
        MsgServer -->> Governance: Error
    end
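The authority check in the update flow above might look roughly like the sketch below. It assumes an in-memory set of authorized addresses and a plain struct for the config. `errInvalidSigner`, `keeper`, and `updateBlockLimitsConfig` are hypothetical names for illustration and are not taken from the PR's code.

```go
package main

import (
	"errors"
	"fmt"
)

// errInvalidSigner is a hypothetical stand-in for the signer error in the diagram.
var errInvalidSigner = errors.New("invalid signer")

// blockLimitsConfig mirrors the single field described in this PR.
type blockLimitsConfig struct {
	MaxStatefulOrderRemovalsPerBlock uint32
}

// keeper is a toy in-memory keeper; a real keeper would write to the state store.
type keeper struct {
	authorities map[string]struct{}
	config      blockLimitsConfig
}

// updateBlockLimitsConfig stores cfg only when signer is an authorized address.
// No range check is applied to the value: it is unsigned, and 0 means "no cap".
func (k *keeper) updateBlockLimitsConfig(signer string, cfg blockLimitsConfig) error {
	if _, ok := k.authorities[signer]; !ok {
		return fmt.Errorf("%w: %s", errInvalidSigner, signer)
	}
	k.config = cfg
	return nil
}

func main() {
	k := &keeper{authorities: map[string]struct{}{"gov": {}}}
	newCfg := blockLimitsConfig{MaxStatefulOrderRemovalsPerBlock: 10}

	// An unauthorized signer is rejected and the stored config is unchanged.
	fmt.Println(k.updateBlockLimitsConfig("mallory", newCfg))
	// The authorized signer succeeds.
	fmt.Println(k.updateBlockLimitsConfig("gov", newCfg))
	fmt.Println(k.config.MaxStatefulOrderRemovalsPerBlock)
}
```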

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

Areas requiring extra attention:

Cap enforcement logic in stateful_order_state.go: Verify the early-break condition correctly halts processing at MaxStatefulOrderRemovalsPerBlock and that the logic properly resumes in subsequent blocks.
TypeScript codegen re-mapping: The extensive alias re-mapping in bundle.ts and related files is mechanical but error-prone; confirm the new spreads correctly reference all required modules and don't introduce unintended symbol collisions.
Test coverage for batching: Ensure TestRemoveExpiredStatefulOrders_WithCap thoroughly exercises edge cases—orders at exactly the cap limit, orders spanning multiple time slices, and behavior when the cap is disabled (default).
Mock consistency: Verify that mock methods in ClobKeeper.go and QueryClient.go (especially the removal of GetLeverage and reorganization of leverage/liquidation mocks) align with updated keeper and query signatures.

Possibly related PRs

[CT-1160] add relevant protos to dydxprotocol #2162: Modifies dydxprotocol proto bundle exports and codegen re-mapping (similar alias reorganization pattern in bundle.ts).
chore: update rust proto with the latest changes #3060: Regenerates Rust protobufs to incorporate the protocol-level BlockLimitsConfig and related message/client additions introduced here.
Check Leverage On Order Placement #3141: Updates ClobKeeper-related interfaces and mocks in similar fashion, affecting keeper expectations and query client surfaces.

Suggested reviewers

teddyding
vincentwschau

Poem

🐰 Per-block we cap, no rush, no race,
Expired orders vanish at steady pace,
Batch by batch, no overflow fright,
BlockLimits config shines so bright! 🌟

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 10.53% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.
Description check	❓ Inconclusive	The PR description includes the required Changelist section explaining the purpose of the cap, but the Test Plan section is incomplete with only a placeholder, and the checklist items are unchecked.	Complete the Test Plan section describing how the cap functionality was tested, and consider checking relevant checklist items based on whether this change affects app state or requires backporting.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title clearly and specifically describes the main change: implementing a per-block cap on expired order removals, which is the core objective of this changeset.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e1c3947 and c369ff1.

📒 Files selected for processing (1)

protocol/x/clob/keeper/block_limits_config.go (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

protocol/x/clob/keeper/block_limits_config.go

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (42)

GitHub Check: (Mainnet) Build and Push ECS Services / call-build-and-push-vulcan / (vulcan) Build and Push
GitHub Check: (Mainnet) Build and Push ECS Services / call-build-and-push-auxo-lambda / (auxo) Build and Push Lambda
GitHub Check: (Mainnet) Build and Push ECS Services / call-build-and-push-bazooka-lambda / (bazooka) Build and Push Lambda
GitHub Check: (Mainnet) Build and Push ECS Services / call-build-and-push-ecs-service-roundtable / (roundtable) Build and Push
GitHub Check: (Mainnet) Build and Push ECS Services / call-build-and-push-ecs-service-socks / (socks) Build and Push
GitHub Check: (Mainnet) Build and Push ECS Services / call-build-and-push-ecs-service-comlink / (comlink) Build and Push
GitHub Check: (Mainnet) Build and Push ECS Services / call-build-and-push-ecs-service-ender / (ender) Build and Push
GitHub Check: (Public Testnet) Build and Push ECS Services / call-build-and-push-bazooka-lambda / (bazooka) Build and Push Lambda
GitHub Check: (Public Testnet) Build and Push ECS Services / call-build-and-push-auxo-lambda / (auxo) Build and Push Lambda
GitHub Check: (Public Testnet) Build and Push ECS Services / call-build-and-push-vulcan / (vulcan) Build and Push
GitHub Check: (Public Testnet) Build and Push ECS Services / call-build-and-push-ecs-service-roundtable / (roundtable) Build and Push
GitHub Check: (Public Testnet) Build and Push ECS Services / call-build-and-push-ecs-service-socks / (socks) Build and Push
GitHub Check: (Public Testnet) Build and Push ECS Services / call-build-and-push-ecs-service-ender / (ender) Build and Push
GitHub Check: (Public Testnet) Build and Push ECS Services / call-build-and-push-ecs-service-comlink / (comlink) Build and Push
GitHub Check: call-build-ecs-service-roundtable / (roundtable) Check docker image build
GitHub Check: call-build-ecs-service-socks / (socks) Check docker image build
GitHub Check: call-build-ecs-service-vulcan / (vulcan) Check docker image build
GitHub Check: call-build-ecs-service-comlink / (comlink) Check docker image build
GitHub Check: check-build-auxo
GitHub Check: check-build-bazooka
GitHub Check: call-build-ecs-service-ender / (ender) Check docker image build
GitHub Check: test / run_command
GitHub Check: unit-end-to-end-and-integration
GitHub Check: test-coverage-upload
GitHub Check: liveness-test
GitHub Check: test-race
GitHub Check: build-and-push-testnet
GitHub Check: golangci-lint
GitHub Check: check-sample-pregenesis-up-to-date
GitHub Check: run_command
GitHub Check: lint
GitHub Check: build
GitHub Check: build-and-push-mainnet
GitHub Check: benchmark
GitHub Check: container-tests
GitHub Check: Analyze (go)
GitHub Check: Analyze (javascript-typescript)
GitHub Check: Summary
GitHub Check: build-and-push-dev
GitHub Check: build-and-push-dev2
GitHub Check: build-and-push-dev4
GitHub Check: build-and-push-staging

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c462a7a and 9d44760.

📒 Files selected for processing (6)

.github/workflows/protocol-build-and-push.yml (1 hunks)
protocol/x/clob/abci.go (1 hunks)
protocol/x/clob/flags/flags.go (6 hunks)
protocol/x/clob/flags/flags_test.go (5 hunks)
protocol/x/clob/keeper/stateful_order_state.go (2 hunks)
protocol/x/clob/keeper/stateful_order_state_test.go (1 hunks)

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: hwray
Repo: dydxprotocol/v4-chain PR: 2597
File: indexer/services/ender/src/scripts/handlers/dydx_update_perpetual_v1_handler.sql:16-20
Timestamp: 2024-11-22T18:12:04.606Z
Learning: Avoid suggesting changes to deprecated functions such as `dydx_update_perpetual_v1_handler` in `indexer/services/ender/src/scripts/handlers/dydx_update_perpetual_v1_handler.sql` if they are unchanged in the PR.

Learnt from: anmolagrawal345
Repo: dydxprotocol/v4-chain PR: 2780
File: protocol/x/clob/keeper/twap_order_state.go:137-138
Timestamp: 2025-04-15T16:57:26.546Z
Learning: TWAP order cancellations and expiries will be handled in a dedicated follow-up PR, separate from the initial implementation of TWAP order functionality in the end blocker.

📚 Learning: 2025-04-15T16:58:46.335Z

Learnt from: anmolagrawal345
Repo: dydxprotocol/v4-chain PR: 2780
File: protocol/x/clob/keeper/twap_order_state_test.go:191-209
Timestamp: 2025-04-15T16:58:46.335Z
Learning: For TWAP orders, message validation in message_place_order.go handles edge cases with comprehensive checks for parameters like durations and intervals, making it unnecessary to test invalid parameters in twap_order_state_test.go which only receives pre-validated inputs.

Applied to files:

protocol/x/clob/keeper/stateful_order_state_test.go
protocol/x/clob/flags/flags_test.go

🧬 Code graph analysis (4)

protocol/x/clob/abci.go (4)

protocol/lib/metrics/lib.go (1)

SetGauge (38-40)

protocol/x/clob/types/keys.go (1)

ModuleName (6-6)

protocol/lib/metrics/metric_keys.go (1)

EndBlocker (91-91)

protocol/lib/metrics/constants.go (1)

Count (9-9)

protocol/x/clob/keeper/stateful_order_state.go (2)

protocol/app/flags/flags.go (1)

Flags (13-36)

protocol/x/clob/flags/flags.go (1)

MaxStatefulOrderRemovalsPerBlock (30-30)

protocol/x/clob/keeper/stateful_order_state_test.go (4)

protocol/testutil/constants/stateful_orders.go (5)

ConditionalOrder_Alice_Num0_Id0_Clob0_Buy5_Price10_GTBT15_StopLoss20 (457-470)

ConditionalOrder_Alice_Num1_Id0_Clob0_Sell5_Price10_GTB15 (932-943)

LongTermOrder_Alice_Num0_Id0_Clob0_Buy5_Price10_GTBT20 (131-142)

ConditionalOrder_Alice_Num1_Id1_Clob0_Sell50_Price5_GTB30 (944-955)

LongTermOrder_Alice_Num1_Id1_Clob0_Sell25_Price30_GTBT10 (191-202)

protocol/x/clob/memclob/memclob.go (1)

NewMemClobPriceTimePriority (57-66)

protocol/testutil/keeper/clob.go (1)

NewClobKeepersTestContext (61-73)

protocol/x/clob/flags/flags.go (1)

MaxStatefulOrderRemovalsPerBlock (30-30)

protocol/x/clob/flags/flags_test.go (1)

protocol/x/clob/flags/flags.go (2)

MaxStatefulOrderRemovalsPerBlock (30-30)

DefaultMaxStatefulOrderRemovalsPerBlock (44-44)

protocol/x/clob/keeper/stateful_order_state.go

jusbar23 · 2025-11-10T19:55:47Z

.github/workflows/protocol-build-and-push.yml

      - main
      - 'release/protocol/v[0-9]+.[0-9]+.x'  # e.g. release/protocol/v0.1.x
      - 'release/protocol/v[0-9]+.x'  # e.g. release/protocol/v1.x
+      - 'anmol/eng-1296'


reminder to remove

jusbar23 · 2025-11-10T20:02:47Z

protocol/x/clob/types/errors.go

 		1022,
 		"Liquidation conflicts with ClobPair status",
 	)
+	ErrInvalidBlockLimitsConfig = errorsmod.Register(


Do we use this? Doesn't seem like we have any validation logic

deleted - the only validation logic lives in the server itself, which verifies that the message came from an authorized address. otherwise, I think we can just use the value. we don't need to check for negative values either since it's a uint

jusbar23 · 2025-11-10T20:03:50Z

protocol/x/clob/types/block_limits_config.go

+// Validate checks that the BlockLimitsConfig is valid.
+// Note: MaxStatefulOrderRemovalsPerBlock can be 0, which means no cap (process all expired orders).
+func (config *BlockLimitsConfig) Validate() error {
+	// No validation needed - 0 is a valid value meaning "no cap"


Do we want to set an upper bound for this?

don't think so since it's governance protected. if for some reason we need to bump it past any threshold, we'll have to do an upgrade to support it

shrenujb · 2025-11-10T22:29:39Z

proto/dydxprotocol/clob/tx.proto


+// MsgUpdateBlockLimitsConfig is a request type for updating the block limits
+// configuration.
+message MsgUpdateBlockLimitsConfig {


Would this be a gov msg? Are there other limits we already have which we use gov msgs to update?

Yep this follows the same pattern as MsgUpdateLiquidationConfig which also seems gov protected.

northstar456 · 2025-11-12T16:42:14Z

proto/dydxprotocol/clob/block_limits_config.proto

+  // The maximum number of expired stateful orders that can be removed from
+  // state in a single block. This prevents performance degradation when
+  // processing a large number of expired orders.
+  uint32 max_stateful_order_removals_per_block = 1;


Is there a strong reason not to put this under BlockRateLimitConfiguration, which is also stored in state?

Not too concerned about the differences between this field (end blocker) and existing block rate limits (checktx)

Doing that would save us from so many stubs/new messages added in this PR

northstar456 · 2025-11-12T16:44:26Z

proto/dydxprotocol/clob/block_limits_config.proto

+  // The maximum number of expired stateful orders that can be removed from
+  // state in a single block. This prevents performance degradation when
+  // processing a large number of expired orders.
+  uint32 max_stateful_order_removals_per_block = 1;


For my context, why did we decide to add this recently? Is this an AI from a security report?

northstar456 · 2025-11-12T16:46:42Z

protocol/x/clob/keeper/stateful_order_state.go

+	numRemoved := uint32(0)
+
+	// Process orders up to the cap (if set), otherwise process all
 	for ; it.Valid(); it.Next() {


A bit cleaner to put condition in loop guard

	maxRemovals := blockLimitsConfig.MaxStatefulOrderRemovalsPerBlock
	numRemoved := uint32(0)

	// Process orders up to the cap (if set), otherwise process all
	for ; it.Valid() && (maxRemovals == 0 || numRemoved < maxRemovals); it.Next() {
		var orderId types.OrderId
		k.cdc.MustUnmarshal(it.Value(), &orderId)
		expiredOrderIds = append(expiredOrderIds, orderId)
		store.Delete(it.Key())
		numRemoved++
	}

	return expiredOrderIds
}
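For anyone who wants to try the loop-guard form in isolation, here is a small runnable sketch. It assumes a toy slice-backed iterator with the same `Valid`/`Next` shape. `sliceIterator` and `collectUpTo` are hypothetical names, and the store deletion from the suggestion is left out.

```go
package main

import "fmt"

// sliceIterator is a toy iterator with the Valid/Next shape used in the suggestion.
type sliceIterator struct {
	items []string
	pos   int
}

func (it *sliceIterator) Valid() bool   { return it.pos < len(it.items) }
func (it *sliceIterator) Next()         { it.pos++ }
func (it *sliceIterator) Value() string { return it.items[it.pos] }

// collectUpTo visits items until the iterator is exhausted or maxRemovals
// items have been collected. A maxRemovals of 0 disables the cap.
func collectUpTo(it *sliceIterator, maxRemovals uint32) []string {
	var out []string
	numRemoved := uint32(0)
	// The cap lives in the loop guard, so no break is needed in the body.
	for ; it.Valid() && (maxRemovals == 0 || numRemoved < maxRemovals); it.Next() {
		out = append(out, it.Value())
		numRemoved++
	}
	return out
}

func main() {
	orders := []string{"a", "b", "c"}
	fmt.Println(collectUpTo(&sliceIterator{items: orders}, 2))
	fmt.Println(collectUpTo(&sliceIterator{items: orders}, 0))
}
```

One thing to note about this form: the post statement `it.Next()` still runs after the final capped iteration, so the iterator ends up positioned on the first item that was not processed.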

set cap on expired orders processed per block

9d44760

anmolagrawal345 requested a review from a team as a code owner November 5, 2025 19:36