@ezeike (Contributor) commented Jan 6, 2026

Summary

  • Add controller reference to RCManager for state access
  • Add buildIndexerSnapshot method to create state snapshots
  • Define IndexerSnapshot and DexPrice proto messages for efficient binary serialization over WebSocket
  • Extract SellOrder.IsLocked, SellOrder.Copy, OrderBook.Copy to swap_ext.go to prevent loss during proto regeneration

Changes

  • cmd/rpc/sock.go - WebSocket snapshot building logic
  • lib/.proto/block.proto - Proto message definitions
  • lib/swap_ext.go - Extracted swap helper methods

Test plan

  • Verify WebSocket connections receive IndexerSnapshot messages
  • Confirm proto regeneration doesn't overwrite swap_ext.go methods
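The extraction into swap_ext.go relies on a common Go pattern: hand-written methods live in a separate file from the generated type, so regenerating the proto output cannot overwrite them. The sketch below illustrates the idea in a single file. The struct fields and the locking rule are assumptions made for illustration, not the actual schema in block.proto.

```go
package main

import "fmt"

// Stand-in for generated code (for example a *.pb.go file).
// Field names are hypothetical; the real messages define their own schema.
type SellOrder struct {
	Id                  []byte
	AmountForSale       uint64
	BuyerReceiveAddress []byte
}

type OrderBook struct {
	ChainId uint64
	Orders  []*SellOrder
}

// Stand-in for the hand-written extension file (swap_ext.go in this PR).

// IsLocked reports whether a buyer address is set (assumed rule).
func (x *SellOrder) IsLocked() bool {
	return x != nil && len(x.BuyerReceiveAddress) != 0
}

// Copy returns a deep copy so the caller can mutate it without aliasing.
func (x *SellOrder) Copy() *SellOrder {
	if x == nil {
		return nil
	}
	return &SellOrder{
		Id:                  append([]byte(nil), x.Id...),
		AmountForSale:       x.AmountForSale,
		BuyerReceiveAddress: append([]byte(nil), x.BuyerReceiveAddress...),
	}
}

// Copy returns a deep copy of the order book and every order in it.
func (x *OrderBook) Copy() *OrderBook {
	if x == nil {
		return nil
	}
	out := &OrderBook{ChainId: x.ChainId, Orders: make([]*SellOrder, len(x.Orders))}
	for i, o := range x.Orders {
		out.Orders[i] = o.Copy()
	}
	return out
}

func main() {
	book := &OrderBook{ChainId: 1, Orders: []*SellOrder{{Id: []byte{1}, AmountForSale: 100}}}
	clone := book.Copy()
	clone.Orders[0].BuyerReceiveAddress = []byte{0xAB} // lock only the clone
	fmt.Println(book.Orders[0].IsLocked(), clone.Orders[0].IsLocked()) // prints "false true"
}
```

Because Go allows methods on a type to be declared in any file of the same package, the generator can rewrite its own output freely while the extension file stays untouched.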

ezeike and others added 22 commits December 23, 2025 18:34
Refactor EthBlockProvider architecture and add comprehensive test coverage

- Refactor EthBlockProvider to use centralized configuration struct
- Add calculateFetchRange function with proper height management and reorg protection
- Simplify processBlocks to use internal safeHeight state
- Add comprehensive table-driven tests for calculateFetchRange function
- Update TestEthBlockProvider_processBlocks to match simplified function signature
- Add transaction data size validation with 1024 byte limit
- Improve error handling and retry logic for transaction processing
- Add extensive documentation and flow diagrams
- Enhance oracle state management and validation
- Add new CLI query commands and RPC endpoints
- Add debug logs for RPC and WebSocket connection closing
- Add info log before connection retry with delay duration
- Add error log when websocket client is not initialized
- Add error log when subscription error is received
- Add info log when oracle is not enabled
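As a rough illustration of what a fetch-range calculation with reorg protection could look like, the sketch below stays a fixed number of confirmations behind the chain head and caps the batch size. The signature, parameter names, and rules are assumptions; the actual calculateFetchRange in this PR may differ.

```go
package main

import "fmt"

// calculateFetchRange is a hypothetical sketch, not the PR's implementation.
// It returns the inclusive range [from, to] of blocks to fetch next, staying
// `confirmations` blocks behind chainHead so shallow reorgs are never processed.
func calculateFetchRange(lastProcessed, chainHead, confirmations, maxBatch uint64) (from, to uint64, ok bool) {
	if chainHead < confirmations {
		return 0, 0, false // chain too short to have any safe block
	}
	safe := chainHead - confirmations
	if safe <= lastProcessed {
		return 0, 0, false // nothing new is safe yet
	}
	from, to = lastProcessed+1, safe
	if maxBatch > 0 && to-from+1 > maxBatch {
		to = from + maxBatch - 1 // cap the batch size
	}
	return from, to, true
}

func main() {
	// Table-driven cases in the spirit of the tests mentioned above.
	cases := []struct {
		name                            string
		last, head, confirmations, size uint64
	}{
		{"catch up to safe height", 100, 110, 5, 10},
		{"head too close", 100, 103, 5, 10},
		{"batch capped", 0, 1000, 5, 50},
	}
	for _, c := range cases {
		from, to, ok := calculateFetchRange(c.last, c.head, c.confirmations, c.size)
		fmt.Printf("%s: from=%d to=%d ok=%v\n", c.name, from, to, ok)
	}
}
```

Returning an explicit ok flag keeps the "nothing to fetch" case distinct from a valid range that happens to start at block 0.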
Add 18 new Prometheus metrics for oracle monitoring across 5 categories:

Block height metrics:
- canopy_oracle_last_processed_height
- canopy_oracle_confirmation_lag
- canopy_oracle_orders_awaiting_confirmation
- canopy_oracle_reorg_rollback_depth (histogram)

Order lifecycle metrics:
- canopy_oracle_orders_not_in_orderbook_total
- canopy_oracle_orders_duplicate_total
- canopy_oracle_orders_archived_total
- canopy_oracle_lock_orders_committed_total
- canopy_oracle_close_orders_committed_total

Validation failure metrics:
- canopy_oracle_validation_failures_total (labeled by reason)

Submission tracking metrics:
- canopy_oracle_orders_held_awaiting_safe_total
- canopy_oracle_orders_held_propose_delay_total
- canopy_oracle_orders_held_resubmit_delay_total
- canopy_oracle_lock_order_resubmissions_total
- canopy_oracle_close_order_resubmissions_total

Store operation metrics:
- canopy_oracle_store_write_errors_total
- canopy_oracle_store_read_errors_total
- canopy_oracle_store_remove_errors_total

Also fixes test to use nil metrics (methods are nil-safe).
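The nil-safe behavior mentioned above can be achieved in Go by checking the receiver at the top of each method. The following is a minimal sketch of that pattern using a plain atomic counter instead of the Prometheus client; the type and method names are hypothetical.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// OracleMetrics is a hypothetical stand-in for the real metrics struct.
type OracleMetrics struct {
	ordersArchived atomic.Uint64
}

// IncOrdersArchived is a no-op on a nil receiver, so callers (and tests)
// can pass nil metrics without guarding every call site.
func (m *OracleMetrics) IncOrdersArchived() {
	if m == nil {
		return
	}
	m.ordersArchived.Add(1)
}

// OrdersArchived returns the current count, or 0 for a nil receiver.
func (m *OracleMetrics) OrdersArchived() uint64 {
	if m == nil {
		return 0
	}
	return m.ordersArchived.Load()
}

func main() {
	var disabled *OracleMetrics // nil: metrics turned off
	disabled.IncOrdersArchived() // does not panic

	enabled := &OracleMetrics{}
	enabled.IncOrdersArchived()
	fmt.Println(disabled.OrdersArchived(), enabled.OrdersArchived()) // prints "0 1"
}
```

With this pattern a test can construct the component with nil metrics and exercise its logic without registering any collectors.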
Add 23 new Prometheus metrics for better observability of Ethereum
block provider operations:

High Priority - Connection & Sync:
- RPC/WS connection attempts and errors
- Connection state tracking (disconnected/connecting/rpc/fully_connected)
- Sync status (unsynced/syncing/synced)
- Block height lag

High Priority - Block Processing:
- Block fetch errors by type
- Processing timeouts and batch sizes
- Reorg detection

Medium Priority - Transaction Processing:
- Transactions total, parse errors, retry by attempt
- Exhausted retries, success status breakdown
- Receipt fetch errors

Medium Priority - Order Detection & Token Cache:
- ERC20 transfer, lock/close order detection
- Order validation errors by type
- Token info fetch errors and contract call timeouts

Add three new gauge metrics for tracking Ethereum block heights:
- canopy_eth_chain_head_height: latest block from chain head
- canopy_eth_last_processed_height: last block successfully processed
- canopy_eth_safe_height: blocks sent through channel (confirmed)

These complement the existing block_height_lag metric for better
visibility into sync progress and block processing state.

Document all Prometheus metrics for the oracle system including:
- Oracle metrics: block heights, order lifecycle, validation, submissions
- Eth metrics: connection, sync, block processing, transactions, orders

Includes example Prometheus queries, alerting recommendations,
and Grafana dashboard tips for effective monitoring.

Apply gofmt formatting fixes:
- Reorder imports alphabetically
- Align struct field assignments
- Normalize comment list indentation
- Add missing trailing newlines
- Remove extra blank lines

Change lock order safe height check from Debug to Info for better
visibility when orders are being held back due to confirmation
requirements.

- Update shouldSubmit() to log specific hold reasons at Info level
- Log propose delay with blocks needed until eligible
- Log resubmit delay with exact eligibility root height
- Log lock order cooldown with blocks remaining
- Log already submitted cases for both lock and close orders
- Remove redundant logs from WitnessedOrders that assumed wrong reason
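One way to make hold reasons loggable is to have the decision function return the reason alongside the verdict, so the caller logs exactly why an order was held. The sketch below assumes a simplified order state and two delay rules; the names, fields, and messages are hypothetical and do not reflect the actual shouldSubmit() signature.

```go
package main

import "fmt"

// orderState is a hypothetical view of a witnessed order.
type orderState struct {
	witnessedHeight  uint64 // root height at which the order was first witnessed
	lastSubmitHeight uint64 // root height of the last submission, 0 if never submitted
}

// shouldSubmit reports whether the order may be submitted at rootHeight and,
// if not, a hold reason suitable for an Info-level log line.
func shouldSubmit(o orderState, rootHeight, proposeDelay, resubmitDelay uint64) (bool, string) {
	if eligible := o.witnessedHeight + proposeDelay; rootHeight < eligible {
		return false, fmt.Sprintf("propose delay: %d blocks until eligible", eligible-rootHeight)
	}
	if o.lastSubmitHeight != 0 {
		if eligible := o.lastSubmitHeight + resubmitDelay; rootHeight < eligible {
			return false, fmt.Sprintf("resubmit delay: eligible at root height %d", eligible)
		}
	}
	return true, ""
}

func main() {
	o := orderState{witnessedHeight: 100}
	for _, h := range []uint64{101, 103} {
		ok, reason := shouldSubmit(o, h, 3, 5)
		fmt.Printf("height=%d submit=%v reason=%q\n", h, ok, reason)
	}
}
```

Keeping the reason next to the decision avoids the problem described in the last bullet, where a caller logs a reason it can only guess at.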
Rename the CanopyOrders RPC endpoint and related code to OracleOrders
for better clarity about the endpoint's purpose.

- Route path: /v1/query/canopy-orders -> /v1/query/oracle-orders
- CLI command: canopy-orders -> oracle-orders
- Handler, client method, and response type renamed accordingly

Add new /v1/query/indexer-snapshot endpoint that combines data from
multiple endpoints into a single response for indexers. Includes:

- Block, transactions, and events data
- Accounts, orders, DEX prices, params, supply
- Committees data, subsidized/retired committees
- Change detection pairs (current + H-1) for validators, pools,
  non-signers, double-signers, and DEX batches

Errors in individual fetches return null instead of failing the
entire request for resilience.
- Add controller reference to RCManager for state access
- Add buildIndexerSnapshot method to create state snapshots
- Define IndexerSnapshot and DexPrice proto messages for efficient
  binary serialization over WebSocket

🏷️ feat: add IndexerSnapshot proto message and extract swap methods

- Define IndexerSnapshot message in block.proto for WebSocket streaming
- Extract SellOrder.IsLocked, SellOrder.Copy, OrderBook.Copy to swap_ext.go
  to prevent loss during proto regeneration
@ezeike closed this Jan 6, 2026