add round-trip for share derived via interpolation #2
BenWestgate
left a comment
concept ACK
I agree we should do something to prevent this unexpected behavior but it's due to unintuitive mathematical properties of finite fields.
The reason is that GF(32) interpolation does not preserve padding: it operates on full 5-bit values. So storing bytes and throwing away the padding means you don't have enough information to reconstruct the same share you extracted the bytes from.
You MUST pass padding to reconstruct a derived string (one produced by interpolation).
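To make the information loss concrete, here is a minimal sketch of the byte/5-bit conversion. It is not the library's code: `bytes_to_u5` and `u5_to_bytes` are hypothetical helpers, and `pad_val` is modeled on the parameter name used in this thread. A 16-byte payload needs 26 symbols (130 bits), so 2 bits of the last symbol are not determined by the bytes.

```python
def bytes_to_u5(data: bytes, pad_val: int = 0) -> list[int]:
    """Split bytes into 5-bit symbols; spare low bits of the last symbol come from pad_val."""
    nbits = len(data) * 8
    pad_len = (-nbits) % 5  # 2 spare bits for a 16-byte payload
    acc = (int.from_bytes(data, "big") << pad_len) | (pad_val & ((1 << pad_len) - 1))
    total = nbits + pad_len
    return [(acc >> shift) & 31 for shift in range(total - 5, -1, -5)]


def u5_to_bytes(symbols: list[int]) -> tuple[bytes, int]:
    """Opposite direction: return the byte payload and the padding bits that fall off the end."""
    total = 5 * len(symbols)
    pad_len = total % 8
    acc = 0
    for s in symbols:
        acc = (acc << 5) | s
    return (acc >> pad_len).to_bytes(total // 8, "big"), acc & ((1 << pad_len) - 1)


payload = bytes(range(16))
variants = [bytes_to_u5(payload, pad) for pad in range(4)]
# Every variant decodes to the same 16 bytes, yet each ends in a different symbol.
last_symbols = {v[-1] for v in variants}
```

Four different symbol strings collapse to one byte payload, so going back from bytes to the exact string needs the two discarded bits.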
I don't know how to enforce that at the library level, any ideas? Especially since we both agree a default is a nice feature, but it is a footgun here.
    assert d.s == "ms13k00ldp4v5nw8lph96x47mjxzgwjexe44p32swkq99e0w"

    # now round-trip d share ('d' is derived via interpolation, NOT via 'from_seed')
    dd = Codex32String.from_seed(d.data, "ms13k00ld")
You can't do this. You can only call .from_seed without passing pad_val for the k initial strings; derived strings MUST be passed padding to round-trip.
You would need to be able to do this:
dd = Codex32String.from_seed(d.data, "ms13k00ld", d.pad_val)
This version's Codex32String lacks a pad_val property; I'm working on an update that has one.
No matter what padding style we use, the padding is less than a full 5-bit value, so it is not an element of GF(32). It will not interpolate into derived shares in a way that maintains any linear relationship, so you cannot round-trip from bytes, GF(256), to GF(32)-interpolated strings without passing the padding.
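A small sketch of why the padding does not survive interpolation. The assumptions are loud ones: the field is built with the reduction polynomial x^5 + x^3 + 1 (the one bech32 uses, as far as I know), the threshold is k = 2 for brevity instead of the k = 3 in this test, and all helper names are hypothetical. It takes last symbols whose 2 padding bits are zero on shares "a" and "c" and interpolates them to index "d".

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # bech32 alphabet; index = 5-bit value


def gf32_mul(a: int, b: int) -> int:
    """Multiply in GF(32), assuming the reduction polynomial x^5 + x^3 + 1 (0b101001)."""
    r = 0
    for _ in range(5):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 32:
            a ^= 41
    return r


def gf32_inv(a: int) -> int:
    """Brute-force inverse; fine for a 32-element field."""
    return next(x for x in range(1, 32) if gf32_mul(a, x) == 1)


def lagrange_coeff(i: int, others: list[int], x: int) -> int:
    """Weight of the share at index i when interpolating to index x (subtraction is XOR)."""
    c = 1
    for j in others:
        c = gf32_mul(c, gf32_mul(x ^ j, gf32_inv(i ^ j)))
    return c


a, c, d = (CHARSET.index(ch) for ch in "acd")
weight_a = lagrange_coeff(a, [c], d)
weight_c = lagrange_coeff(c, [a], d)

zero_padded = [y for y in range(32) if y & 3 == 0]  # last symbols with both pad bits 0
derived_pads = {
    (gf32_mul(weight_a, ya) ^ gf32_mul(weight_c, yc)) & 3
    for ya in zero_padded
    for yc in zero_padded
}
```

`derived_pads` contains nonzero values: multiplying by a field element other than 0 or 1 mixes all five bits, so the low bits of the derived symbol depend on the data bits of the inputs and not on any padding rule.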
The only string whose data you should care about after construction is "s". The fact that other share index values can return data is more of a curiosity, and maybe .data should raise InvalidShareIndex or return None if share_idx != "s" to prevent this misuse.
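One possible shape for that guard, as a sketch only. `ShareSketch` and `InvalidShareIndex` are hypothetical stand-ins and not the current Codex32String API.

```python
from dataclasses import dataclass


class InvalidShareIndex(ValueError):
    """Hypothetical error type, named after the suggestion above."""


@dataclass(frozen=True)
class ShareSketch:
    share_idx: str  # single bech32 character, "s" for the secret
    payload: bytes  # byte payload with the padding bits already discarded

    @property
    def data(self) -> bytes:
        # Only the secret share is meant to be consumed as bytes.
        if self.share_idx != "s":
            raise InvalidShareIndex(
                f"share {self.share_idx!r} has no standalone byte payload"
            )
        return self.payload
```

Raising is stricter than returning None; which one fits better depends on how callers already use .data.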
What is your exact use case where you really need to store ALL the shares as bytes and recover back to codex32?
I'm able to do this which fixes this test case:
dd = Codex32String.from_seed(d.data, "ms13k00ld", pad_val=1)
but I have no idea how I got to pad_val=1 besides grinding it against the string, which I already know (that won't be the case in real life)
> I don't know how to enforce that at the library level, any ideas?
not really... besides grinding correct pad_val right after construction of derived share via round-trips (very meh)
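For reference, the grinding amounts to something like the sketch below. `encode_last_symbol` is a hypothetical stand-in for a full re-encode (only the last 5-bit symbol depends on the padding), and the loop needs the target symbol as input, which is exactly the part that is unavailable in real life.

```python
from typing import Optional


def encode_last_symbol(data: bytes, pad_val: int) -> int:
    """Stand-in for a full re-encode: only the last 5-bit symbol depends on the padding."""
    pad_len = (-len(data) * 8) % 5
    acc = (int.from_bytes(data, "big") << pad_len) | (pad_val & ((1 << pad_len) - 1))
    return acc & 31


def grind_pad_val(data: bytes, known_last_symbol: int) -> Optional[int]:
    """Try every padding value until the re-encoded last symbol matches the known one."""
    pad_len = (-len(data) * 8) % 5
    for pad_val in range(1 << pad_len):
        if encode_last_symbol(data, pad_val) == known_last_symbol:
            return pad_val
    return None  # the bytes cannot produce that symbol under any padding
```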
> What is your exact use case where you really need to store ALL the shares as bytes and recover back to codex32?
So my general idea is that I can use individual shares as normal secrets, load them on a HWW, sign with them, etc. For instance, a user uses one HWW device to do the Shamir split, while having N devices ready to export generated/derived shares as QR codes for instance. They load these derived shares on devices and geo-distribute the devices. These then serve as decoy, fully functional signers. When the S secret is needed, the user just collects K devices and does some QR scanning to recover S on an empty HWW.
For this I thought I could use these from_seed/to_seed round-trips. Secure element storage is limited, so for me byte encoding is preferable to u5.
But now it seems this was never the intended purpose of the non-secret shares, which seem to be just recovery tools, aka data with one and only one purpose: to recover share S (which is kind of a pity tbh). Am I reading this correctly?
I also think that if round-trips with derived shares can be achieved somehow, even if passing padding is necessary, that would be desirable.
> but I have no idea how I got to pad_val=1 besides grinding it against the string, which I already know (that won't be the case in real life)
You had to grind it because you discarded the pad_val. Without the padding bits you cannot tell which last data character is the right one, so you might recover a different one. interpolate_at operates on 5-bit values, not bytes.
> any ideas?

> not really... besides grinding correct pad_val... (very meh)
It may be possible if you give up being able to construct "non-encoded" shares from bytes data and instead accept construction of a Codex32ShareSet object with a from_bytes (or from_seeds) factory, and then use an interpolate_at(share_idx) method of that share set object.
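A rough sketch of what such a share set could look like, under several assumptions that are not settled in this thread: zero padding for the initial shares, the reduction polynomial x^5 + x^3 + 1, payload symbols only (no header or checksum), and hypothetical names throughout (`ShareSetSketch` is not the proposed Codex32ShareSet).

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"  # bech32 alphabet; index = 5-bit value


def gf32_mul(a: int, b: int) -> int:
    """Multiply in GF(32), assuming the reduction polynomial x^5 + x^3 + 1."""
    r = 0
    for _ in range(5):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 32:
            a ^= 41
    return r


def gf32_inv(a: int) -> int:
    return next(x for x in range(1, 32) if gf32_mul(a, x) == 1)


def to_u5(data: bytes) -> list[int]:
    """Bytes to 5-bit symbols with zero padding (an assumed convention)."""
    pad_len = (-len(data) * 8) % 5
    acc = int.from_bytes(data, "big") << pad_len
    return [(acc >> s) & 31 for s in range(len(data) * 8 + pad_len - 5, -1, -5)]


class ShareSetSketch:
    """Hypothetical container: the k initial shares are the only source of truth."""

    def __init__(self, symbols: dict[str, list[int]]):
        self.symbols = symbols

    @classmethod
    def from_bytes(cls, payloads: dict[str, bytes]) -> "ShareSetSketch":
        return cls({idx: to_u5(data) for idx, data in payloads.items()})

    def interpolate_at(self, share_idx: str) -> list[int]:
        """Symbol-wise Lagrange interpolation to the requested share index."""
        x = CHARSET.index(share_idx)
        points = {CHARSET.index(i): syms for i, syms in self.symbols.items()}
        length = len(next(iter(points.values())))
        out = [0] * length
        for xi, syms in points.items():
            weight = 1
            for xj in points:
                if xj != xi:
                    weight = gf32_mul(weight, gf32_mul(x ^ xj, gf32_inv(xi ^ xj)))
            for pos in range(length):
                out[pos] ^= gf32_mul(weight, syms[pos])
        return out


shares = ShareSetSketch.from_bytes({"a": bytes(range(16)), "c": bytes(range(16, 32))})
derived_d = shares.interpolate_at("d")
```

In this shape a derived share is never rebuilt from its own bytes. It is always recomputed from the k stored payloads, so its padding bits come out the same every time.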
> What is your exact use case...?

> generated/derived shares as QR codes for instance.
Make sure to skim this compact CodexQR discussion before speccing a QR design, it's the analog of compact SeedQR. I found a fun way to fit 128-bit codex32 share data into 21x21 QR codes by dropping some of the identifier.
Whatever solution we find for Codex32ShareSet.from_bytes(header, dict) would be very helpful there, as well as here.
> These then serve as decoy, fully functional signers.
This seems useful!
> For this I thought I could use these from_seed/to_seed round-trips.
You may be able to round trip the share set from_seeds/to_seeds or .data of individual shares but we need to define the correct Codex32ShareSet from_seeds class method to make this possible.
The source of truth in a Codex32ShareSet should be the common header and the byte payloads of "s", "a", "c" for k = 3 or maybe "a", "c", "d". CRC padding, which does not interpolate, is slightly more useful on a share you can actually find and verify it on, than trying to interpolate to an unknown share to check if it validates.
> Secure element storage is limited, so for me byte encoding is preferable to u5.
A 21x21 QR has only 137.2 bits if using base45 alphanumeric encoding, or 138.2 bits if also using kanji, bytes and numeric modes. So it'd be excellent for us to define a compact encoding of share data: the bare minimum needed to always recover the correct secret, with whatever space is left used to prevent user errors.
> But now it seems this was never the intended purpose of the non-secret shares, which seem to be just recovery tools, aka data with one and only one purpose: to recover share S (which is kind of a pity tbh). Am I reading this correctly?
Yes, this is not their intended purpose but they do contain randomness and I think your idea is a cool and efficient use of that otherwise wasted random data needed for SSS so worth pursuing IF it can be done securely (not revealing any more info about "s" than, at most, its padding bits with k-1 shares.)
> I also think that if round-trips with derived shares can be achieved somehow, even if passing padding is necessary, that would be desirable.
I agree. The solution to recover seeds from bytes alone is non-trivial, but it should exist; let's find it. You'll find this bytes vs 130-bits question tripped up Andrew in the QR discussion. It's always surprising how padding behaves as the finite field changes.