-
Notifications
You must be signed in to change notification settings - Fork 14
docs: Foreign chain transaction design doc #1920
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
gilcu3
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM as a first draft, left a few questions that might improve it in a second round
| ## Scope | ||
|
|
||
| - In scope: contract-level API for verify+sign requests, node-side verification via configured RPC providers, deterministic provider selection, and extensible per-chain verifiers. | ||
| - Out of scope: on-chain light clients / cryptographic proofs, multi-round MPC consensus on verification results, and non-ECDSA schemes for verify_foreign_transaction (initially ECDSA only). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as used below:
| - Out of scope: on-chain light clients / cryptographic proofs, multi-round MPC consensus on verification results, and non-ECDSA schemes for verify_foreign_transaction (initially ECDSA only). | |
| - Out of scope: on-chain light clients / cryptographic proofs, multi-round MPC consensus on verification results, and non-ECDSA schemes for `verify_foreign_transaction` (initially ECDSA only). |
| At a high level: | ||
|
|
||
| 1. A user submits a `verify_foreign_transaction` request with a chain-specific verification config. | ||
| 2. MPC nodes verify the foreign transaction via configured RPC providers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how does this trust relation work? Does the node need to trust the RPC provider, or the information obtained can be itself verified against some ground truth, for example a fixed public key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we trust the RPC providers. But this trust is diluted by the fact that multiple RPC providers need to return the same result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, so colluding RPC providers could easily break the system. I forgot to mention, one thing I missed from the doc is a concrete attacker model, where this details would become explicit
|
|
||
| 1. A user submits a `verify_foreign_transaction` request with a chain-specific verification config. | ||
| 2. MPC nodes verify the foreign transaction via configured RPC providers. | ||
| 3. If verified, MPC signs `sha256(tx_id_bytes)` with the derived domain key and returns the signature on-chain. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we planning to use this with existing domains?
I guess sha256(tx_id_bytes) is some application dependent step in the signature process, but for the application we will support initially it is the correct one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this will work with any existing ECDSA domains according to the current proposal. I'm a bit questioning if we should have dedicated domains for this or some sort of separation between this and the sign use case though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess there is a trade-off. Using existing domains is more efficient, simple and risky, but having a different domain is more cumbersome and secure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edit: We want to enforce separate domains for this. (discussed on Slack).
| // Contract DTOs | ||
| pub struct VerifyForeignTxRequestArgs { | ||
| pub chain: ForeignChain, | ||
| pub tx_id: TransactionId, // TxID is the payload we're signing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it strange that we call this transaction id, but at the same time it is the transaction payload
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right, ForeignTransactionId would be a better name for this. This is the payload we're signing, but it's a transaction ID on another chain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so basically all we are signing is a transaction ID from another chain, not really an arbitrary payload?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes exactly. The feature is basically just signing a transaction ID right now. Though we might need to change that.
| pub chain: ForeignChain, | ||
| pub tx_id: TransactionId, // TxID is the payload we're signing | ||
| pub path: String, // Key derivation path | ||
| pub domain_id: Option<DomainId>, // Defaults to 0 (legacy ECDSA) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need the Option for legacy here? Is there any requirement about compatibility with the current sign method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good observation. Let's make this required.
|
|
||
| ### Node Configuration and Policy Updates | ||
|
|
||
| - Node config contains chain RPC providers and timeouts (API keys stay local). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I understand now. So the contract contains only the RpcProvider names, while the node store the actual details, so that they are detached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, exactly. Good to see this is clear.
| - On startup, nodes compare local config to the on-chain policy. | ||
| - If different, a node submits a vote for the policy derived from its local config. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we expecting many policy changes? By this we are moving away from the current manual process to vote for anything, which is good, but at the same time will add complexity to the process, so there is a trade-off there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the reason for nodes to automatically vote is to ensure they have the actual configuration they are voting for. We originally considered having normal operator votes, but it felt like it would be easy for operators to forget updating their configuration before casting there votes for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what would happen if they don't? The node would simply not start? How would the operator change this config when the node is running in a TEE?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Config updates in TEE is a bit of a pain point. The node would still operate even if it disagrees on the config, but I imagine it wouldn't collaborate on requests that it hasn't voted on.
I think we'll need to add better ways of dynamically updating node configuration once we've shipped the MVP of this feature.
| Each node selects a provider using a deterministic hash of: | ||
|
|
||
| ``` | ||
| hash = sha256(participant_id || request_id || provider_name) | ||
| ``` | ||
|
|
||
| Providers are sorted by this hash to build a deterministic ordering: | ||
|
|
||
| - **Primary provider** = first in the ordering. | ||
| - **Fallback** = subsequent providers in order. | ||
| - Each provider can include backup URLs for failover. | ||
|
|
||
| This ensures different nodes query different providers for the same request while preserving determinism. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not really describe completely how it avoids that two nodes query the same provider, and at the same time is not very efficient. We might want to iterate a bit on this before settling it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is a very lean and not exact formulation. I started working on expanding it but found a precise explanation to be a bit too much and I don't think this is important for the first iteration as it's not user-facing. The important part here is that it's possible to derive a consistent ordering and using this ordering to ensure nodes vote for different providers.
Basically if we have the providers P1 P2 P3 P4 and nodes N1 N2 N3 N4 N5, we'd have N1 -> P1, N2 -> P2, N3 -> P3, N4 -> P4, N5 -> P1.
| signing availability. | ||
| - **Finality semantics**: Finality definitions differ across chains; mapping them correctly is critical. | ||
| - **Operational friction**: Unanimous voting for policy updates may slow rollouts and hot fixes. | ||
| - **Config drift**: Nodes missing required provider keys will fail startup validation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this accurate? It didn't seem to be the case by the description above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I'll remove it
| ## Discussion points | ||
| - Why do we return a signature? Can't we just return a bool. | ||
| - A signature suggests this is a "proof" that can be validated by someone else than the caller, but currently it seems like this proof could easily be forged by just calling the normal "sign" method. | ||
| - Finality interface right now diverges from the original PR. Are we okay with this new structure? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can avoid this collision by tweaking the key derivation properly
|
I'll merge this PR and apply all updates in a follow-up. I think it can be nice to track how this proposal has evolved in our version history. |
Closes #1918
This is a high level design document that highlights the most significant parts (as I see it) of the planned design for the upcoming foreign chain transactions feature.
This is to help anyone get up to speed on the current design and to make sure we can effectively and quickly break down, review and merge this feature while maintaining strong alignment on what we're doing.