I have a strong hunch that there is a way to simplify & improve Ponder's raw blockchain data caching design, and that, if it pans out, it would pay dividends down the road.
Background
Required reading: Ethers docs on topic sets.
Log filters
In `ponder.config.ts`, the user can specify contracts that they’d like to index. Internally, Ponder converts each contract into a log filter with a (simplified) type along the lines sketched below.

Consider a simple case where the user wants to fetch and index all events from a single contract, and has specified the start block as the contract deployment block number. Here's the resulting log filter:
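Roughly (the field names, address, and block numbers here are assumptions for illustration, not Ponder's actual definitions):

```ts
// Simplified log filter type, assuming fields that mirror the eth_getLogs
// parameters plus the chain id and the block range to sync.
type LogFilter = {
  chainId: number;
  address?: string | string[]; // same as the eth_getLogs "address" parameter
  topics?: (string | string[] | null)[]; // same as the eth_getLogs "topics" parameter
  startBlock: number;
  endBlock?: number; // undefined -> sync to latest
};

// "Index ALL events from one contract, starting at its deployment block."
// The address is hypothetical and matches the cache key examples used later
// (e.g. "1-0xabc-null").
const simpleLogFilter: LogFilter = {
  chainId: 1,
  address: "0xabc",
  // no `topics` -> no topic filter, so every log from this contract matches
  startBlock: 15_000_000,
};
```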
Should be easy to follow. (Note: the log filter `address` and `topics` fields are exactly the same as the `eth_getLogs` parameters with the same names).

"Cached ranges"
Ponder’s sync service (the component that fetches and caches raw blockchain data) is largely organized around log filters. The database keeps track of which block ranges have been fetched and inserted into the store for every log filter. This is what powers the caching functionality: if you finish the historical sync locally, then restart the Ponder app, all the raw blockchain data is served from the cache.
The cache scheme is simply a database table with the schema:
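Something along these lines (the row shape and column names below are assumptions, not Ponder's actual schema):

```ts
// Hypothetical shape of a "cached ranges" row: one row per contiguous block
// range that has been fully fetched and stored for a given log filter key.
type LogFilterCachedRange = {
  filterKey: string; // e.g. `${chainId}-${address}-${topics}`
  startBlock: number; // first block of the cached range (inclusive)
  endBlock: number; // last block of the cached range (inclusive)
};
```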
During the historical sync, after we've fetched and cached a range of blocks for a given log filter, we insert a record in the database like this:
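For example (values assumed, matching the hypothetical filter above):

```ts
// Assumed example: the historical sync just finished blocks
// 15,000,000 through 15,100,000 for the simple (no-topics) filter.
const insertedRange: LogFilterCachedRange = {
  filterKey: "1-0xabc-null", // chainId 1, address 0xabc, no topic filter
  startBlock: 15_000_000,
  endBlock: 15_100_000,
};
```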
After inserting a row, we then merge any rows with the same key that have any overlap. This actually works pretty well today!
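A sketch of the merge rule (an in-memory illustration of the behavior, not the actual query):

```ts
// Rows with the same key whose block ranges overlap (or are adjacent)
// collapse into a single row.
function mergeCachedRanges(rows: LogFilterCachedRange[]): LogFilterCachedRange[] {
  const sorted = [...rows].sort((a, b) =>
    a.filterKey === b.filterKey
      ? a.startBlock - b.startBlock
      : a.filterKey.localeCompare(b.filterKey),
  );

  const merged: LogFilterCachedRange[] = [];
  for (const row of sorted) {
    const last = merged[merged.length - 1];
    if (last && last.filterKey === row.filterKey && row.startBlock <= last.endBlock + 1) {
      // Overlapping or adjacent range with the same key: extend the last row.
      last.endBlock = Math.max(last.endBlock, row.endBlock);
    } else {
      merged.push({ ...row });
    }
  }
  return merged;
}
```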
The problem: Custom log filters & overlaps
To complicate things, Ponder also supports custom log filters where the user can specify the log filter directly. Consider now that the user from above has synced their initial simple 1-contract app. Then, they add a new custom log filter like this:
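For example (again with assumed values):

```ts
// Assumed custom log filter: same contract, but only logs whose first topic
// (topic_0) equals "0x1". ("0x1" stands in for a real 32-byte event signature
// hash, kept short to match the cache key examples below.)
const customLogFilter: LogFilter = {
  chainId: 1,
  address: "0xabc",
  topics: ["0x1"],
  startBlock: 15_000_000,
};
```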
This log filter is actually a strict subset of the simple log filter. The first one matched ALL logs produced by the contract; this one only matches those where `topic_0 == "0x1"`. So, the local raw blockchain store technically already has every log required to serve this. Unfortunately, Ponder is not currently able to take advantage of this, and instead must refetch all the logs for the new log filter.

This is because the new log filter key looks like `"1-0xabc-[0x1]"`, which is different from `"1-0xabc-null"`.

Solutions
There might be a very simple solution here that I'm missing. It's basically a cache where the key is a high-dimensional set, and each component of the key has slightly different rules (topic set logic, block range merging logic). Bit masks??
Ideally, the event store would have an API that looks like this:
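Something like this (the method names and argument shapes are a sketch, not the real interface):

```ts
// Hypothetical event store API for cached log-filter ranges.
interface EventStore {
  // Record that [startBlock, endBlock] has been fetched for `filter`, merging
  // the new range into existing ranges using the address/topic set logic and
  // the overlapping block range logic.
  insertLogFilterCachedRange(args: {
    filter: LogFilter;
    startBlock: number;
    endBlock: number;
  }): Promise<void>;

  // Return the block ranges of `filter` that the cache already fully covers,
  // including coverage provided by broader filters that are supersets of it.
  getLogFilterCachedRanges(args: {
    filter: LogFilter;
  }): Promise<{ startBlock: number; endBlock: number }[]>;
}
```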
The insert method would magically merge the newly inserted range with the existing ranges following the address/topic set logic and overlapping block range logic.
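With the hypothetical API above, the desired behavior would look like this:

```ts
// Usage sketch: once the broad filter has been synced, the narrower custom
// filter should be reported as fully cached over the same range, no refetch.
declare const store: EventStore; // assume some implementation exists

await store.insertLogFilterCachedRange({
  filter: simpleLogFilter, // matches ALL logs of 0xabc
  startBlock: 15_000_000,
  endBlock: 15_100_000,
});

const cached = await store.getLogFilterCachedRanges({ filter: customLogFilter });
// Desired result: [{ startBlock: 15_000_000, endBlock: 15_100_000 }], because
// customLogFilter (topic_0 == "0x1") is a strict subset of simpleLogFilter.
```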