-
-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: node skips some epoch length txs making bridgeState corrupt #349
Conversation
Codecov Report
@@ Coverage Diff @@
## master #349 +/- ##
==========================================
+ Coverage 95.16% 95.17% +0.01%
==========================================
Files 72 72
Lines 1096 1099 +3
Branches 206 206
==========================================
+ Hits 1043 1046 +3
Misses 47 47
Partials 6 6
Continue to review full report at Codecov.
|
temporary using [email protected] which includes the fix leapdao/leap-core#155 |
…rupt Tendermint has a [build-in replay protection](https://tendermint.com/docs/app-dev/app-development.html#replay-protection) which will reject the same tx from being added to the mempool twice. It seems to work by comparing raw tx bytes. Should two transactions have the same raw bytes, the second one will not make it to the mempool. Now, EpochLength transaction in leap-core is merely a `[type,epochLength]` tuple. `Tx.epochLength(X)` will have the same raw bytes for the same `X`. Consider the following pretty valid transaction flow: ``` 1. Tx.epochLength(2) 2. Tx.epochLength(4) 3. Tx.epochLength(2) ``` The third transaction will be rejected by tendermint's replay protection. The node listens for root chain events, including `EpochLength` event. Everytime event comes, the node a) keeps record of the epoch length from event (in the `bridgeState.epochLengths` list). [bridgeState](https://github.com/leapdao/leap-node/blob/5195438c54b7c2670e7314d102b690255b4a203b/src/bridgeState.js#L135) b) creates `epochLength` transaction and relays it. [eventsRelay](https://github.com/leapdao/leap-node/blob/5195438c54b7c2670e7314d102b690255b4a203b/src/eventsRelay.js#L80) c) handles `epochLength` tx, once received through ABCI from tendermint. [checkEpochLength](https://github.com/leapdao/leap-node/blob/5195438c54b7c2670e7314d102b690255b4a203b/src/tx/applyTx/checkEpochLength.js#L10) The step (c) expects all the a,b,c steps always executed for every `EpochLength` event. However, if the node receives to `EpochLength(X)` events for the same epoch length, only steps (a) and (b) will be executed. Step (c) would never happen for the second event, because the transaction created on (b) will be rejected by Tendermint's replay protection. In the end, we have `bridgeState.epochLength` mutated by (a), which will make `checkEpochLength` fail for any further epoch length tx. Add more differentiation to EpochLength transactions. I suggest adding blockHeight to it.
a1aa10f
to
3158b23
Compare
Holy shit. I wonder how we ain't dead yet :D I have a feeling we should create the new tx type and rename the old one to «Tx.epochLength_legacy» or something like that. If this PR will merged it'll make sync with testnet or mainnet impossible — Tx.fromRaw will fail with |
great catch, totally overlooked this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have almost identical (in terms of code) tx type MIN_GAS_PRICE
and It affected by the same problem. We can create the new issue or fix here — up to you
const { epochLength } = event.returnValues; | ||
const tx = Tx.epochLength(Number(epochLength)); | ||
const tx = Tx.epochLength(Number(epochLength), Number(event.blockNumber)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can save blockNumber in bridgeState.epochLengths
as well and use it for tx validation 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't get the idea. You propose to use these blockNumbers here when creating epochLength tx ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In bridgeState we store all epochLength changes. We can add blockNumber into this array ([epochLength, blockNumber][]
) and use it in this check
Btw, do not forget to update check anyway — it should handle two tx types (it handles zero currently :-))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm.. I've updated the check.
I don't see why we need to store blockNumber in bridgeState — we can just take it from the transaction (like I do in my code). If there is no blockNumber in epochLength, chance are that checkEpochLength
won't be called at all, exactly the issue we are fixing here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or you mean, we can compare height from bridgeState with the one from tx ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes :-)
👍 |
* fix: add blockHeight to EpochLength tx to make them unique. See leapdao/leap-node#349 for rationale * 0.36.2-beta * add new tx type EPOCH_LENGTH_V2. Use it for epoch length txs with block height, keep the EPOCH_LENGTH type as it is to support legacy transactions (so we can replay the old networks) * 0.37.0-beta.1 * Tx.epochLength: make params mandatory * 0.37.0-beta.2
addressed with new tx type. I've quickly checked with testnet sync — the sync goes beyond the height 4, no error. |
1376ef2
to
1063dce
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requires: leapdao/leap-core#155, #348
Resolves #340
Test to reproduce with: leapdao/leap-sandbox#69
Tendermint replay protection
Tendermint has a build-in replay protection which will reject the same tx from being added to the mempool twice. It seems to work by comparing raw tx bytes. Should two transactions have the same raw bytes, the second one will not make it to the mempool.
Now, EpochLength transaction in leap-core is merely a
[type,epochLength]
tuple.Tx.epochLength(X)
will have the same raw bytes for the sameX
. Consider the following pretty valid transaction flow:The third transaction will be rejected by tendermint's replay protection.
How it affects leap-node
The node listens for root chain events, including
EpochLength
event. Everytime event comes, the nodea) keeps record of the epoch length from event (in the
bridgeState.epochLengths
list). bridgeStateb) creates
epochLength
transaction and relays it. eventsRelayc) handles
epochLength
tx, once received through ABCI from tendermint. checkEpochLengthThe step (c) expects all the a,b,c steps always executed for every
EpochLength
event. However, if the node receives twoEpochLength(X)
events for the same epoch length, only steps (a) and (b) will be executed. Step (c) would never happen for the second event, because the transaction created on (b) will be rejected by Tendermint's replay protection. In the end, we havebridgeState.epochLength
mutated by (a), which will makecheckEpochLength
fail for any further epoch length tx.Proposed solution
Add more differentiation to EpochLength transactions. I suggest adding blockHeight to it. See: leapdao/leap-core#155
These two things needs to be merged first for the fix to work: leapdao/leap-core#155, #348