Contributor

@drahnr drahnr commented Nov 28, 2025

Part 2/4 of #1185; lays the groundwork for #1354.

⚠️ The PR is rather large; many of the changes affect large pieces of code across the codebase.

Intent

  • Provide the baseline for cheap querying of partial storage maps, without implementing the querying itself
  • Prepare for potential deprecation of AccountInfo

core changes

  1. Remove the vault / storage_map BLOB entries from the accounts table.
  2. Add SmtForest and integrate into apply_block and State::load

significant changes in the following files:

  • crates/store/src/db/schema.rs introduces account_storage_headers and removes storage (the full serialized account storage) from the accounts table
  • crates/store/src/state.rs: fn State::apply_block now updates not only the database but also the entries in the SmtForest itself and the lookup tables of roots that index into it (indirect lookup tables)

out of scope

how to review

Take the existing TODOs into consideration where they make sense. They will be addressed in the follow-up PR.

@drahnr drahnr force-pushed the bernhard-integrate-smtforest branch 2 times, most recently from 7980bed to 771f6d7 on December 2, 2025 11:16
@drahnr drahnr force-pushed the bernhard-integrate-smtforest branch from 771f6d7 to b2aa54b on December 2, 2025 15:22
@drahnr drahnr force-pushed the bernhard-integrate-smtforest branch from b2aa54b to 2964a93 on December 2, 2025 15:54
Collaborator

@Mirko-von-Leipzig Mirko-von-Leipzig left a comment

Mostly questions; no real issues found as yet

Contributor

@bobbinth bobbinth left a comment

Thank you! I left some comments inline and pushed a few commits to try to organize the code a bit better. Overall, we are still quite far from being able to merge this PR as there are some bugs and logical inconsistencies.

Also, given the number of remaining issues, I'm a bit concerned about our ability to wrap this PR up in a timely manner. Given the complexity of this PR, deep reviews take quite a bit of time. Maybe there is a way to split this up into smaller incremental PRs that we can review and merge much more quickly.

Comment on lines 1325 to 1330
let public_account_ids: Vec<AccountId> = db
    .select_all_account_commitments()
    .await?
    .iter()
    .filter_map(|(id, _commitment)| if id.has_public_state() { Some(*id) } else { None })
    .collect();
Contributor

This is pretty inefficient because we just need to get a list of accounts with public state, but instead we are loading all account commitments. This is OK for now because we'll replace the initialization code as we migrate to RocksDB-based backends.

Contributor Author

@drahnr drahnr Dec 29, 2025

We could filter as part of the query using lt/gt over a prefix range if the layout is advantageous. From a brief look, LIKE statements are not supported by diesel.
EDIT: looking at the exact bit layout of AccountId and its prefix, the pub/priv/network types appear to be encoded in the lowest octet of the prefix, so range filtering won't work.
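
To illustrate why a flag stored in the low-order bits defeats range filtering, here is a minimal sketch. The bit layout is hypothetical (a two-bit storage-mode field in the lowest bits of a u64 prefix) and is not the real AccountId encoding; `MODE_MASK`, `MODE_PUBLIC`, `is_public`, and `matches_are_contiguous` are names made up for this example.

```rust
// HYPOTHETICAL layout, for illustration only: the two lowest bits of the
// 64-bit prefix hold the storage mode. The real AccountId layout differs.
const MODE_MASK: u64 = 0b11;
const MODE_PUBLIC: u64 = 0b00;

/// Returns true if the (hypothetical) storage-mode bits mark the prefix as public.
fn is_public(prefix: u64) -> bool {
    prefix & MODE_MASK == MODE_PUBLIC
}

/// Given prefixes in ascending order, returns true if every public prefix
/// sits in one contiguous run, i.e. a single `lo <= prefix <= hi` predicate
/// could select exactly the public ones.
fn matches_are_contiguous(sorted_prefixes: &[u64]) -> bool {
    let first = sorted_prefixes.iter().position(|p| is_public(*p));
    let last = sorted_prefixes.iter().rposition(|p| is_public(*p));
    match (first, last) {
        // Contiguous only if nothing between the first and last match is non-public.
        (Some(a), Some(b)) => sorted_prefixes[a..=b].iter().all(|p| is_public(*p)),
        // No matches at all: trivially contiguous.
        _ => true,
    }
}
```

With prefixes 0 through 15 under this assumed layout, the public ones are 0, 4, 8, and 12, interleaved with non-public values, so no single lt/gt range selects them. A flag in the high-order bits would group them together instead.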

Comment on lines +130 to +136
/// Read-write lock used to prevent writing to a structure while it is being used.
///
/// The lock is writer-preferring, meaning the writer won't be starved.
inner: RwLock<InnerState>,

/// Forest-related state `(SmtForest, storage_roots, vault_roots)` with its own lock.
forest: RwLock<InnerForest>,
Contributor

I'm not seeing a good reason to separate InnerForest from InnerState, so let's combine them. This could simply look like:

struct InnerState<S = MemoryStorage>
where
    S: SmtStorage,
{
    blockchain: Blockchain,
    account_tree: AccountTreeWithHistory<S>,
    nullifier_tree: NullifierTree<LargeSmt<S>>,
    forest: InnerForest,
}

In the future (when moving to a RocksDB backend), we should refactor this to move InnerState into a separate file and clean up the module hierarchy.

Contributor Author

It takes care of encapsulating the indices required for lookup and hiding the internals, which will come in handy once LargeSmtForest lands, which has a largely different API.

That's notwithstanding a separate module cleanup/InnerState move.

@drahnr drahnr changed the base branch from next to bernhard-db-schema-queries December 29, 2025 23:53
@drahnr drahnr changed the title from "feat: [1/3] integrate smtforest, avoid ser/de of full account/vault data in database" to "feat: [2/4] integrate smtforest, avoid ser/de of full account/vault data in database" Dec 29, 2025
@drahnr drahnr force-pushed the bernhard-integrate-smtforest branch from 3bd5451 to c5a199a on December 30, 2025 00:53
Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you! I reviewed most of the non-test code and left some small comments inline.

Also, my understanding is that in this PR, we are just updating the SMT forest, but actually getting data from it would be done in the next PR, right?

Contributor Author

drahnr commented Jan 6, 2026

> Also, my understanding is that in this PR, we are just updating the SMT forest, but actually getting data from it would be done in the next PR, right?

That is correct. Querying and populating partial queries/responses is done in 3/4.

Contributor

@bobbinth bobbinth left a comment

Looks good! Thank you! I left some more comments inline; all of them should be pretty straightforward. Once they are addressed, we should be good to merge.

Comment on lines +49 to +59
/// Retrieves the most recent vault SMT root for an account.
///
/// Returns the latest vault root entry regardless of block number.
/// Used when applying incremental deltas where we always want the previous state.
fn get_latest_vault_root(&self, account_id: AccountId) -> Word {
    self.vault_roots
        .range((account_id, BlockNumber::GENESIS)..)
        .take_while(|((id, _), _)| *id == account_id)
        .last()
        .map_or_else(Self::empty_smt_root, |(_, root)| *root)
}
Contributor

Let's mention in the doc comments that if a vault root for the specified account cannot be found, we return a root of an empty tree.

Also, looking at the implementation, I wonder if a better backing structure would be BTreeMap<AccountId, (BlockNumber, Word)>. This would make getting the last vault root much simpler.

Getting root for a specific block number should be pretty simple as well (we could use a binary search or even just a linear scan since we know that the number of entries will be pretty small per account ID).
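
A minimal sketch of how such a per-account index might look. The comment proposes `BTreeMap<AccountId, (BlockNumber, Word)>`; because it also mentions a binary search or linear scan per account, this sketch assumes a list of `(BlockNumber, Word)` entries per account, which is only one reading of the suggestion. `AccountId`, `BlockNumber`, and `Word` are plain stand-in types here, and `VaultRoots`, `EMPTY_ROOT`, `latest`, and `at_block` are hypothetical names, not the PR's API.

```rust
use std::collections::BTreeMap;

// Stand-ins for the real `AccountId`, `BlockNumber`, and `Word` types (assumptions).
type AccountId = u64;
type BlockNumber = u32;
type Word = [u64; 4];

/// Placeholder for the root of an empty tree; the real code would use
/// `Self::empty_smt_root()` instead.
const EMPTY_ROOT: Word = [0; 4];

/// Hypothetical index: for each account, its roots in ascending block order.
#[derive(Default)]
struct VaultRoots {
    roots: BTreeMap<AccountId, Vec<(BlockNumber, Word)>>,
}

impl VaultRoots {
    /// Records a root. Assumes blocks are applied in ascending order,
    /// so appending keeps each list sorted.
    fn insert(&mut self, account_id: AccountId, block: BlockNumber, root: Word) {
        self.roots.entry(account_id).or_default().push((block, root));
    }

    /// Latest root, or the empty-tree root if the account has no entry.
    fn latest(&self, account_id: AccountId) -> Word {
        self.roots
            .get(&account_id)
            .and_then(|entries| entries.last())
            .map_or(EMPTY_ROOT, |(_, root)| *root)
    }

    /// Root as of `block`: the last entry whose block number is <= `block`,
    /// or the empty-tree root if there is none.
    fn at_block(&self, account_id: AccountId, block: BlockNumber) -> Word {
        let Some(entries) = self.roots.get(&account_id) else {
            return EMPTY_ROOT;
        };
        // Binary search for the index of the first entry past `block`.
        let idx = entries.partition_point(|(b, _)| *b <= block);
        idx.checked_sub(1).map_or(EMPTY_ROOT, |i| entries[i].1)
    }
}
```

Under these assumptions the latest-root lookup needs no range scan, and an account without entries falls back to the empty-tree root, which is the behaviour the doc comment is asked to mention.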

Comment on lines 76 to 90
/// Retrieves the most recent storage map SMT root for an account slot.
///
/// Returns the latest storage root entry regardless of block number.
/// Used when applying incremental deltas where we always want the previous state.
fn get_latest_storage_map_root(
    &self,
    account_id: AccountId,
    slot_name: &StorageSlotName,
) -> Word {
    self.storage_roots
        .range((account_id, slot_name.clone(), BlockNumber::GENESIS)..)
        .take_while(|((id, name, _), _)| *id == account_id && name == slot_name)
        .last()
        .map_or_else(Self::empty_smt_root, |(_, root)| *root)
}
Contributor

Same comments as above - though, here the data structure would be BTreeMap<(AccountId, StorageSlotName), (BlockNumber, Word)>.

Comment on lines +157 to +161
let prev_root = if is_full_state {
    Self::empty_smt_root()
} else {
    self.get_latest_vault_root(account_id)
};
Contributor

This could be just:

let prev_root = self.get_latest_vault_root(account_id);

.map_or(0, |asset| asset.amount());

let new_balance = i128::from(prev_amount) + i128::from(*amount_delta);
u64::try_from(new_balance.max(0)).expect("balance fits in u64")
Contributor

If new_balance somehow ends up negative, we should catch this. A panic could be fine for now, but ideally, it'd be an error.
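
One way the error-returning variant could look, as a sketch only. It assumes the delta is an `i64` (the quoted snippet only shows `i128::from(*amount_delta)`), and `BalanceError` and `apply_amount_delta` are hypothetical names rather than anything in the PR.

```rust
use std::fmt;

/// Hypothetical error for a fungible-asset delta that leaves the valid range.
#[derive(Debug, PartialEq, Eq)]
enum BalanceError {
    /// Applying the delta would make the balance negative.
    Negative { prev_amount: u64, amount_delta: i64 },
    /// Applying the delta would exceed `u64::MAX`.
    Overflow { prev_amount: u64, amount_delta: i64 },
}

impl fmt::Display for BalanceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Negative { prev_amount, amount_delta } => {
                write!(f, "balance {prev_amount} with delta {amount_delta} is negative")
            }
            Self::Overflow { prev_amount, amount_delta } => {
                write!(f, "balance {prev_amount} with delta {amount_delta} overflows u64")
            }
        }
    }
}

impl std::error::Error for BalanceError {}

/// Applies a signed delta to a balance and reports out-of-range results
/// as errors instead of clamping them to zero.
fn apply_amount_delta(prev_amount: u64, amount_delta: i64) -> Result<u64, BalanceError> {
    // i128 holds any u64 plus any i64 without overflow.
    let new_balance = i128::from(prev_amount) + i128::from(amount_delta);
    if new_balance < 0 {
        return Err(BalanceError::Negative { prev_amount, amount_delta });
    }
    // A non-negative value can still be too large for u64.
    u64::try_from(new_balance).map_err(|_| BalanceError::Overflow { prev_amount, amount_delta })
}
```

Compared with the quoted `new_balance.max(0)`, a negative result surfaces as an error instead of being silently clamped to zero, and the `expect` is no longer needed.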

Comment on lines +242 to +246
let prev_root = if is_full_state {
    Self::empty_smt_root()
} else {
    self.get_latest_storage_map_root(account_id, slot_name)
};
Contributor

Similar to one of the above comments, this could probably be just:

let prev_root = self.get_latest_storage_map_root(account_id, slot_name);


5 participants