-
Notifications
You must be signed in to change notification settings - Fork 11
Stable Rust changes #209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Stable Rust changes #209
Conversation
This PR can potentially have some parts rejected, so if any of the commits (or smaller parts) seem unnecessary, we can discuss. I don’t always do a good job of including justification in the commit messages. E.g., the change to have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Flushing comments; I still need to spend some more time with 05ba17b to make sure I've checked everything.
* | ||
* This function is consensus-critical since BIP66. | ||
*/ | ||
fn is_valid_signature_encoding(sig: &[u8]) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have verified that ecdsa::Signature::from_der
performs these checks, with the exception that I could not find the minimum and maximum size checks. @sellout where is the overall length check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The min/max check is redundant – it’s a consequence of the other tags & encoded lengths. In particular, it’s a consequence of the r
and s
lengths each being in the range [1, 32].
I think this removed code never checked that the the lengths are ≤32, but rust-secp256k1 does. So, there was actually a shortcoming here (AFAICT) where, for example, there could have been an r_len
of 48 and an s_len
of 16, and everything would have passed.
}; | ||
|
||
// A signature is of type 0x30 (compound). | ||
if sig[0] != 0x30 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if sig[0] != 0x30 { | ||
return false; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if usize::from(sig[1]) != sig.len() - 3 { | ||
return false; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/opcode/mod.rs
Outdated
} | ||
|
||
#[derive(Clone, PartialEq, Eq, PartialOrd, Ord, Debug)] | ||
pub enum PushValue { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs rustdoc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any linter that tells you what declarations aren’t documented?
src/opcode/mod.rs
Outdated
} | ||
|
||
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord, Debug)] | ||
pub enum Operation { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs top-level rustdoc; basically, all the public functions and types do.
src/scriptnum.rs
Outdated
const DEFAULT_MAX_NUM_SIZE: usize = 4; | ||
|
||
pub fn new( | ||
vch: &Vec<u8>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it could take a &[u8]
instead?
src/scriptnum.rs
Outdated
Self( | ||
self.0 | ||
.checked_add(other.0) | ||
.expect("caller should avoid overflow"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the caller has to check, why not make the Add
and Sub
impls return type Output = Result<Self, ...>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These operations went away with #208.
src/opcode/mod.rs
Outdated
|
||
use operation::{Control, Normal}; | ||
use push_value::{LargeValue, SmallValue}; | ||
|
||
/** Script opcodes */ | ||
#[derive(Clone, PartialEq, Eq, PartialOrd, Ord, Debug)] | ||
#[derive(Clone, PartialEq, Eq, PartialOrd, Ord, Debug, Deserialize, Serialize)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I wonder whether it doesn't make more sense to write a custom Deserialize
and Serialize
that uses the canonical encoding, instead of implicitly using a new (derived) serde
encoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think that definitely makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrmm, actually, LargeValue
puts a wrinkle in this, as it’s not representable by u8
.
We could have Operation
(including Control
, Normal
, and Unknown
) and SmallValue
map to u8
, but LargeValue
(and consequently PushValue
and Opcode
) would need something richer ([u8]
?). And then the larger script types definitely to [u8]
.
6d6ad7f
to
1a4c5c2
Compare
3064d79
to
fd74615
Compare
50c70be
to
44109e5
Compare
Sorry, there has been a lot of churn here after addressing @nuttycom’s comments. Lesson learned – this PR is just too big. I ended up removing the Everything is passing now, though, so should be good. |
I am splitting this PR into multiple smaller ones. So far I’ve split out four preliminary PRs, reducing this PR from 36 commits to 19.
graph LR;
PR-221 --> PR-223;
PR-222 --> PR-223;
PR-223 --> PR-224;
PR-224 --> this;
|
44109e5
to
020e72a
Compare
The C++ implementation had a lot of manual checking of bytes. In Rust, we have the secp256k1 crate, which takes care of most of this.
(cherry picked from commit 6082693)
On the C++ side, this is the value of the error output parameter when script validation is successful. It never occurs on the Rust side (because we have `Result`), and so we can remove it from the enumeration without an consequences.
Previously, the `TestVector`s held normalized results, and we would normalize the _actual_ result before comparison. This changes the `TestVector`s to hold richer Rust results, and then to normalize the _expected_ result only for the C++ case.
This splits `Operation` into three separate enums – `Control` (for if/else, which get executed regardless of `vexec` and disabled operations, which always fail, regardless of `vexec`), `Normal`, for operations that respect `vexec`, and `Unknown` for undefined opcodes which only fail if they’re on an active branch. This is done so that the evaluator can be split on the same lines. __NB__: I strongly recommend ignoring whitespace changes when reviewing this commit. (cherry picked from commit f97b92c)
(cherry picked from commit 27a5037)
This parallels the existing `op` module, and is used in cases where we want to guarantee that only push values are used. `op` itself has been updated to reference `pv`, rather than using the constructors directly.
This introduces one edge case: If a disabled opcode is the 202nd operation in a script, the C++ impl would return `Error::OpCount`, while the Rust impl would return `Error::DisabledOpcode`.
Having `Control` and `Normal` grouped under `Operation` only eliminated one conditional (checking whether we hit the `op_count` limit, and that is now better abstracted anyway), and it introduced a lot of code (see the 55 lines removed here _plus_ the many nested calls removed, as in op.rs). So, `Normal` is now called `Operation`. The confusing label of “control” for some `Operation` (née `Normal`) opcodes has been removed. (cherry picked from commit 1c98bb3)
Well, my _tiny_ edge case of “only if the 202nd operation is a disabled opcode” didn’t slip past the fuzzer. It caught that pretty quickly. So, this does a better job of normalizing errors for comparisons. First, it normalizes both the C++ and Rust side, which allows the Rust error cases to not be a superset of the C++ error cases. Then, it also normalizes errors in the stepwise comparator (which I think was done in ZcashFoundation#210, but it’s reasonable to do along with these other changes). Also, `ScriptError::UnknownError` has been replaced with `ScriptError::ExternalError`, which takes a string description. This is used for two purposes: 1. “Ambiguous” cases. One was already done – `UNKNOWN_ERROR` on the C++ side with `ScriptError::ScriptNumError` or `ScriptError::HighS` on the Rust side, but now it’s handled differently. The other is the edge case I introduced earlier in this PR – Rust will fail with a `DisabledOpcode` and C++ will fail with a `OpCount`, so those two cases have been unified. 2. Errors that occur inside a stepper. Previously, this just melded in by returning `UnknownError`, but that was just the “least bad” option. Now we can distinguish these from other `ScriptError`.
This makes it easier to write chunks of Zcash Script in code, with some higher-level patterns of opcodes to use. The serialization is also necessary for wallets to produce scripts. It also removes some of the duplication in the tests, by moving constants to `crate::testing`.
It now calls `PushValue::from_slice`, which is more correct and user- visible.
This required changing the `result` type of `TestVector`, because now there are some values that can’t be constructed directly, so we store those in normalized form.
This should move things around without making any changes to the implementation (other than updating names). Here is what changed: - `script::Opcode` is now `::Opcode` - `Control` `Disabled`, `Operation`, and `PushValue` have moved from `script` to `opcode` - `LargeValue` and `SmallValue` have moved from `script` to `opcode::push_value` - `LOCKTIME_THRESHOLD` maved from `script` to `interpreter` - `script::Script` is now `script::Code` (this is in anticipation of typed scripts, which introduces a `struct Script<T> { sig: script::Sig<T>, pub_key: script::PubKey }`) - `script::parse_num` is now `num::parse` - `script_error::ScriptNumError` is now `num::Error` - `script::MAX_SCRIPT_ELEMENT_SIZE` is now `opcode::PushValue::MAX_SIZE` - `script::serialize_num` is now `num::serialize` - `script::MAX_SCRIPT_SIZE` is now `script::Code::MAX_SIZE` - `script::Script::get_op` is now `::Opcode::parse` - `script::Script::get_lv` is now `opcode::push_value::LargeValue::parse`
This is needed by the PCZT crate
With stronger types come fewer errors. This prepares the types for cases when different public API calls have different possible errors. Notably, this separates lexing errors from interpreter ones).
This is one more step toward type-safe scripts.
Store them as their byte representations, rather than serde’s defaults.
`SignatureChecker` shouldn’t have had default implementations – those really belong to the `BaseSignatureChecker`.
And require that future ones are documented as well.
The `Stack` operations defined here already return `InvalidStackOperation` when they fail, so many of the explicit `stack.is_empty()` checks are redundant. However, removing them is delicate. Basically, they can only be removed if there is no other change to the state that can occur before the `Stack` operation would hit the same case. E.g., in `OP_2DROP`, it can’t be removed because if there is one element on the stack, it’ll be removed before the error is triggered, leaving the stack in different states between C++ and Rust. Similarly, `OP_WITHIN` _can_ have the check removed, but the `pop`s can’t replace the `rget`s because we might fail after the first `pop`, which would again leave us in a different state than C++ which reads all the values before popping any of them. Keeping the behavior identical to the C++ isn’t important per se, but it makes validating and debugging changes easier, as we expect them to have the same state in every case.
020e72a
to
5f9b7d1
Compare
It holds a full script – both sig & pubkey, in its un-parsed representation. This (along with `script::Code`) is also now used to represent scripts in the public interface. No more `&[u8]`.
5f9b7d1
to
bcfbdab
Compare
This contains a number of changes to the Rust implementation that maintain stepwise identical behavior with the C++ implementation.
It should be reviewed commit-by-commit.