CPS-0024? | Canonical CBOR Serialization #1109
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,137 @@ | ||||||||
| --- | ||||||||
| CPS: ? | ||||||||
| Title: Canonical CBOR Serialization Standard | ||||||||
rphair marked this conversation as resolved.
|
||||||||
| Category: Tools | ||||||||
| Status: Open | ||||||||
| Authors: | ||||||||
| - Hinson Wong <[email protected]> | ||||||||
| - Tsz Wai Wu <[email protected]> | ||||||||
| Proposed Solutions: [] | ||||||||
| Discussions: | ||||||||
| - https://github.com/cardano-foundation/CIPs/pull/1109 | ||||||||
| Created: 2025-10-29 | ||||||||
| License: CC-BY-4.0 | ||||||||
| --- | ||||||||
|
|
||||||||
| ## Abstract | ||||||||
|
|
||||||||
| There is no canonical CBOR serialization standard in Cardano. While this was initially a deliberate design choice, standardization is gaining support in the Cardano developer community, as evidenced by discussions at developer meetups such as Cardano Builder Fest 2025 in Vietnam. This CPS outlines the growing concern over fragmented CBOR serialization patterns across the community and the motivation for addressing it. | ||||||||
|
|
||||||||
| ## Problem | ||||||||
|
|
||||||||
| <!-- A more elaborate description of the problem and its context. This section should explain what motivates the writing of the CPS document. --> | ||||||||
|
|
||||||||
rphair marked this conversation as resolved.
|
||||||||
| The Cardano ledger accepts any valid CBOR encoding for transactions and on-chain data. While this flexibility was intentional to encourage ecosystem diversity, it has created significant interoperability challenges as the tooling landscape has matured. The same logical data can be encoded in multiple ways (map key ordering, integer encoding, definite vs. indefinite length, etc.), leading to different byte representations and transaction hashes. | ||||||||
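The effect described above can be reproduced with a toy example. The following is a minimal sketch, not taken from any Cardano library: it hand-encodes the logical map `{0: 1, 1: 2}` in three valid CBOR forms and hashes each with Blake2b-256, the hash function Cardano uses for transaction IDs. The helper names (`definite_map`, `indefinite_map`, and so on) are hypothetical, and the map is a stand-in for a real transaction body.

```python
import hashlib


def encode_small_uint(n: int) -> bytes:
    """CBOR unsigned integer in the range 0..23 (a single byte)."""
    if not 0 <= n < 24:
        raise ValueError("this sketch only handles 0..23")
    return bytes([n])


def encode_pairs(pairs) -> bytes:
    """Concatenate the encoded key/value pairs in the order given."""
    return b"".join(encode_small_uint(k) + encode_small_uint(v) for k, v in pairs)


def definite_map(pairs) -> bytes:
    """Definite-length map: header byte 0xa0 + entry count (count < 24)."""
    if len(pairs) >= 24:
        raise ValueError("this sketch only handles maps with fewer than 24 entries")
    return bytes([0xA0 | len(pairs)]) + encode_pairs(pairs)


def indefinite_map(pairs) -> bytes:
    """Indefinite-length map: 0xbf, the entries, then the 0xff break byte."""
    return b"\xbf" + encode_pairs(pairs) + b"\xff"


def blake2b_256(data: bytes) -> str:
    """Hex digest of the 32-byte Blake2b hash of `data`."""
    return hashlib.blake2b(data, digest_size=32).hexdigest()


pairs = [(0, 1), (1, 2)]  # the logical map {0: 1, 1: 2}
encodings = {
    "sorted keys": definite_map(pairs),
    "reversed keys": definite_map(pairs[::-1]),
    "indefinite length": indefinite_map(pairs),
}
for label, raw in encodings.items():
    print(f"{label:18} {raw.hex():14} {blake2b_256(raw)[:16]}...")
```

All three byte strings decode to the same map, yet each produces a different digest, which is why a tool that re-serializes a transaction body can change its hash.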
|
|
||||||||
| ### Core Issues | ||||||||
|
|
||||||||
| **Transaction Hash Instability**: When a transaction is passed between tools or wallets for signing, each may re-serialize it differently. Since transaction hashes are computed over CBOR bytes, logically identical transactions produce different hashes. This breaks: | ||||||||
|
|
||||||||
| - Multi-signature workflows where each signer's wallet may re-serialize the transaction | ||||||||
|
Contributor
I see this as a problem in this CPS, since reserialization should never happen for the purpose of signing; if it does, it is because the tooling is doing it incorrectly. I would consider it a bug in the software regardless of whether canonical or non-canonical serialization is used. |
||||||||
| - Cross-tool transaction building where fee calculations depend on exact byte size | ||||||||
|
Contributor
I don't quite follow this one. If there is any change to a transaction body then the bytes might change, which could affect the fee. That is quite normal and expected. If you mean that some tooling might want to change some part of the transaction body in a way that should not itself change the size if canonical CBOR were used (e.g. changing a required signer, i.e. swapping one hash for another), then I can see it as an argument, but I don't understand why such tooling couldn't just recompute the minimum fee. |
||||||||
|
|
||||||||
|
Collaborator
Suggested change
This is one I've encountered, which was frustrating to figure out.
Contributor
Author
@Ryun1 I did not include hardware wallets since there is an active standard for them: https://cips.cardano.org/cip/CIP-0021. Perhaps you are referring to this one
Contributor
The biggest issue with ordering is that Haskell (or any other programming language) ordering is not guaranteed to always match the CBOR ordering. In other words, I suspect enforcing CBOR ordering would make the Ledger implementation a bit more complicated. For example, what is the ordering for a key that is a tuple?
Contributor
The biggest issue with ordering is that Haskell (or any other programming language) ordering is not guaranteed to always match the CBOR ordering. In other words, I suspect enforcing CBOR ordering would make the Ledger implementation more complicated and error-prone. For example, what is the ordering for a key that is a tuple? Not saying it is impossible, but it is something that needs to be taken into consideration. |
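The ordering concern raised above can be made concrete. The sketch below assumes, purely for illustration, that a future standard adopts the RFC 8949 core deterministic rule, under which map keys are sorted by the bytewise lexicographic order of their encoded form; this CPS does not select any rule. The helper `encode_small_int` is hypothetical and covers only single-byte integers.

```python
def encode_small_int(n: int) -> bytes:
    """CBOR encoding of an integer in the range -24..23 (a single byte)."""
    if not -24 <= n <= 23:
        raise ValueError("this sketch only handles -24..23")
    # Major type 0 (unsigned) stores n directly; major type 1 (negative)
    # stores -1 - n with the high bits 0b001, i.e. 0x20 | (-1 - n).
    return bytes([n]) if n >= 0 else bytes([0x20 | (-1 - n)])


keys = [10, -1, 0]

# Ordering a language would naturally give for integer keys.
natural_order = sorted(keys)

# Ordering by encoded bytes: 0 -> 0x00, 10 -> 0x0a, -1 -> 0x20.
bytewise_order = sorted(keys, key=encode_small_int)

print("natural :", natural_order)
print("bytewise:", bytewise_order)
```

Negative keys sort after non-negative ones under the bytewise rule, so an implementation cannot simply rely on its language's built-in ordering. Under the same assumed rule, a composite key such as a tuple would be ordered by the bytes of its encoded array rather than by element-wise comparison.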
||||||||
| **Script Inconsistencies**: Smart contracts suffer from unpredictable script hashes and reference script mismatches across tools. The same compiled script may produce different hashes depending on the library used to apply parameters or to CBOR-serialize the script. | ||||||||
|
Contributor
I don't believe compiled scripts actually use CBOR serialization. @zliu41 will have a definitive answer, but from what I know Plutus uses
This is correct. They're wrapped in a
Contributor
Author
Not sure how to describe it more precisely; the issue we faced is that the different
Contributor
@HinsonSIDAN Plutus Core does not use CBOR for serialization. So this issue, although it looks related, has nothing to do with CBOR! It has to do with the fact that there is no standard for Plutus Core serialization at all. In other words, if you'd like to use canonical or non-canonical CBOR for Plutus Core serialization you could create a separate CIP for it, but I suspect there will be some pushback there as well, since the custom serialization currently in use is likely more efficient than CBOR. |
||||||||
|
|
||||||||
| **Development Friction**: Developers face increased testing burden across multiple libraries and wallets, library-specific test fixtures, vendor lock-in risks, and debugging challenges that require logical rather than byte-level comparison. | ||||||||
|
|
||||||||
| ### Ecosystem Impact | ||||||||
|
|
||||||||
| The lack of standardization creates: | ||||||||
|
Contributor
CBOR is a standard, so this is not phrased correctly. |
||||||||
|
|
||||||||
| - **Security risks**: Hard-to-diagnose bugs and complex audit requirements due to multiple serialization paths | ||||||||
|
Contributor
In ledger we deal with serialization all the time, and none of the bugs that we ran into had to do with the fact that serialization is not canonical. I also never found debugging serialization to be that complicated.
Contributor
Author
In general I agree with most of the points you mentioned so far, and I agree the standardization should not be built into the ledger, so when selecting the category of this CPS, we decided to make it
btw you might not be aware how wide the gap in technical knowledge is between engineers in the ledger team and typical app builders😅 From a tool builder's perspective, as a maintainer of Mesh where I face ordinary devs every day, fewer than 5% of devs understand how our CBOR system works, and most of them can build DApps for Cardano
Contributor
I certainly have no say in this domain. If tooling developers in the community believe they can use another standard, I am not here to stop you. That being said, if you don't believe they can understand the current standard, what makes you believe they will understand a new standard? |
||||||||
| - **Community overhead**: High support burden in addressing serialization issues and maintaining multiple strategies | ||||||||
|
Contributor
This is only true for tooling that cares solely about the current era. Any other tooling, and the node itself, will incur a higher burden because of this CIP, because we have to support the current serialization forever while also introducing support for the new serialization. A great example is the Byron era, which actually tried to use canonical CBOR and failed. We are now stuck with that logic forever without the ability to reuse almost any of it! |
||||||||
| - **Adoption barriers**: Unpredictable behavior discourages enterprise adoption and increases new developer friction | ||||||||
|
Contributor
I don't understand what is unpredictable in CBOR. It is a standard that is 100% predictable. |
||||||||
|
|
||||||||
| This problem has become urgent as sophisticated DApps require cross-tool interoperability, multi-signature usage grows, and community feedback (e.g., Cardano Builder Fest 2025) has identified this as a critical pain point. | ||||||||
|
Contributor
It will take on the order of years for everyone to switch to canonical CBOR. |
||||||||
|
|
||||||||
| ## Use cases | ||||||||
|
|
||||||||
| <!-- A concrete set of examples written from a user's perspective, describing what and why they are trying to do. When they exist, this section should give a sense of the current alternatives and highlight why they are not suitable. --> | ||||||||
|
|
||||||||
rphair marked this conversation as resolved.
|
||||||||
| **Cross-Library DApp Development**: A DApp developer builds transactions with Lucid in their frontend, but users sign with various wallets built on cardano-serialization-lib or pycardano. Canonical serialization ensures the transaction built equals the transaction signed. | ||||||||
|
Contributor
Canonical serialization does not ensure it. It holds only when everyone involved implements canonical serialization correctly. |
||||||||
|
|
||||||||
| **Script Hash Consistency**: A developer publishes a reference script on-chain, then references it from their off-chain code. Currently, locally computed script hashes may not match the on-chain version due to encoding differences. Canonical serialization guarantees hash consistency across compilation and deployment pipelines. | ||||||||
|
|
||||||||
| **Library Maintainers**: Serialization library authors currently must support multiple encoding strategies for compatibility. With a standard, they can focus on a single canonical implementation, reducing maintenance burden and improving deserialization reliability. | ||||||||
|
Contributor
Again, serialization libraries will have to support both non-canonical CBOR for historical data and canonical CBOR for new data. So, IMHO, serialization library authors will be impacted negatively by this CIP. |
||||||||
|
|
||||||||
| ## Goals | ||||||||
|
|
||||||||
| <!-- A list of goals and non-goals a project is pursuing, ranked by importance. These goals should help understand the design space for the solution and what the underlying project is ultimately trying to achieve. | ||||||||
| Goals may also contain requirements for the project. For example, they may include anything from a deadline to a budget (in terms of complexity or time) to security concerns. | ||||||||
| Finally, goals may also serve as evaluation metrics to assess how good a proposed solution is. --> | ||||||||
|
|
||||||||
rphair marked this conversation as resolved.
|
||||||||
| ### Primary Goals | ||||||||
|
|
||||||||
| 1. **Establish a canonical CBOR standard**: A CIP that specifies deterministic encoding rules for all Cardano transaction and on-chain data structures, with clear guiding principles for choosing between encoding alternatives. | ||||||||
|
|
||||||||
| 2. **Achieve ecosystem adoption**: Widespread implementation across major serialization libraries (cardano-serialization-lib, pycardano, Lucid, Aiken, etc.) and wallets (Nami, Eternl, Lace, Yoroi, etc.), ensuring cross-tool interoperability. | ||||||||
|
|
||||||||
| 3. **Provide implementation guidance**: Comprehensive documentation including test vectors, reference implementations or validation tools, and migration paths for existing tooling. | ||||||||
|
|
||||||||
| ### Optional Goals | ||||||||
|
|
||||||||
| 4. **Ledger-level enforcement**: If community consensus supports it, implement validation rules in the ledger to guarantee compliance (requires hardfork and backward compatibility strategy). | ||||||||
|
Contributor
Deserialization in Ledger is not part of the Ledger rules. It is a totally separate stage from transaction validation.
Suggested change
|
||||||||
|
|
||||||||
| ### Success Criteria | ||||||||
|
|
||||||||
| This CPS is successfully resolved when: | ||||||||
|
|
||||||||
| - A canonical CBOR serialization CIP reaches "Active" status with clear specifications | ||||||||
| - At least 80% of major libraries and wallets demonstrate compliance | ||||||||
|
Contributor
Is there a way to quantify major libraries today?
Suggested change
Contributor
Author
Yes, indeed this CPS is not intended to affect the ledger
Contributor
Well, there is a section about it:
As I already pointed out, I would not want this CPS to affect Ledger either. I am a bit skeptical about standards that aren't enforced by the chain itself, but there are standards like these that proved themselves to work. So, at the very least, I would suggest adding all the drawbacks that I've mentioned to the section that suggests that this "standard should be enforced by the ledger level" |
||||||||
| - Serialization-related issues in community support channels decrease measurably | ||||||||
| - Cross-tool transaction building becomes reliably predictable | ||||||||
|
|
||||||||
| ### Requirements for Solutions | ||||||||
|
|
||||||||
| A good solution must provide: | ||||||||
|
|
||||||||
| - **Technical clarity**: Unambiguous encoding rules for all covered data structures | ||||||||
| - **Guiding principles**: Clear rationale for choosing specific encodings (efficiency, simplicity, adoption) | ||||||||
| - **Comprehensive scope**: Address transactions, scripts, datums, redeemers, and specify what is out-of-scope | ||||||||
| - **Path to Active**: Detailed adoption strategy including timeline, stakeholder coordination, and migration tooling | ||||||||
| - **Evolution mechanism**: Process for handling future hardforks and new ledger types | ||||||||
| - **Verification**: Test vectors or validation tools to verify implementation compliance | ||||||||
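As an illustration of what a validation tool with test vectors might look like, here is a minimal sketch that checks a single candidate rule: unsigned integers must use their shortest CBOR encoding. The rule is an assumption made for the example (this CPS does not fix any encoding rules), and `is_shortest_uint` and `TEST_VECTORS` are hypothetical names.

```python
# Payload width in bytes for each "additional information" value.
WIDTHS = {24: 1, 25: 2, 26: 4, 27: 8}
# Smallest value that genuinely needs each width.
MINIMUMS = {24: 24, 25: 1 << 8, 26: 1 << 16, 27: 1 << 32}


def is_shortest_uint(encoded: bytes) -> bool:
    """Return True if `encoded` is a CBOR unsigned int in shortest form."""
    if not encoded or encoded[0] >> 5 != 0:
        raise ValueError("not a CBOR unsigned integer (major type 0)")
    info = encoded[0] & 0x1F
    if info < 24:
        # The value is stored in the initial byte itself.
        if len(encoded) != 1:
            raise ValueError("unexpected trailing bytes")
        return True
    if info not in WIDTHS or len(encoded) != 1 + WIDTHS[info]:
        raise ValueError("malformed integer encoding")
    value = int.from_bytes(encoded[1:], "big")
    return value >= MINIMUMS[info]


# (encoded bytes, expected verdict) pairs a standard could publish.
TEST_VECTORS = [
    (bytes.fromhex("0a"), True),       # 10 in one byte
    (bytes.fromhex("180a"), False),    # 10 padded to two bytes
    (bytes.fromhex("1818"), True),     # 24 needs the one-byte payload
    (bytes.fromhex("1900ff"), False),  # 255 padded to a two-byte payload
    (bytes.fromhex("1901f4"), True),   # 500 needs the two-byte payload
]

for encoded, expected in TEST_VECTORS:
    verdict = is_shortest_uint(encoded)
    status = "ok" if verdict == expected else "MISMATCH"
    print(f"{encoded.hex():8} shortest={verdict!s:5} {status}")
```

A real compliance suite would cover every rule of the chosen standard across full transactions; the point here is only that published vectors let each library verify itself independently.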
|
|
||||||||
| ## Open Questions | ||||||||
|
|
||||||||
| ### What are the guiding principles for choosing the canonical form? | ||||||||
|
Contributor
If a custom canonical CBOR standard is to be designed, then there is a much higher chance that it will be implemented incorrectly by some of the tools. Grabbing an existing standard and accepting any potential drawback it could have (e.g. non-optimal size) would be a safer bet IMHO. |
||||||||
|
|
||||||||
| When multiple valid CBOR encodings exist, how should we decide which becomes canonical? | ||||||||
|
|
||||||||
| - **Efficiency**: Minimize transaction size (e.g., smallest integer encoding, definite over indefinite length) | ||||||||
|
Contributor
Definite-length and indefinite-length encodings are each more efficient depending on the number of elements. When there are fewer than 24 elements in an array, definite-length encoding is more efficient, while a large count benefits from indefinite-length encoding. |
||||||||
| - **Simplicity**: Choose the most straightforward encoding to implement and verify | ||||||||
| - **Existing adoption**: Align with the most widely-used pattern in current tooling (e.g., cardano-serialization-lib as de facto standard) | ||||||||
| - **Trade-offs**: How should we balance these potentially conflicting dimensions? | ||||||||
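The size side of the efficiency criterion above can be quantified directly from the CBOR encoding rules. The following sketch uses hypothetical helper names and considers byte overhead only, ignoring other considerations such as streaming. It computes the header cost of definite-length versus indefinite-length containers and produces shortest-form unsigned integers.

```python
# An indefinite-length array or map costs one start byte plus one
# "break" byte (0xff), regardless of how many items it holds.
INDEFINITE_OVERHEAD = 2


def definite_header_len(count: int) -> int:
    """Bytes used by a definite-length array/map header for `count` items."""
    if count < 0:
        raise ValueError("count must be non-negative")
    if count < 24:
        return 1
    if count < 1 << 8:
        return 2
    if count < 1 << 16:
        return 3
    if count < 1 << 32:
        return 5
    return 9


def encode_uint(n: int) -> bytes:
    """Shortest-form CBOR unsigned integer (major type 0)."""
    if n < 0:
        raise ValueError("unsigned integers only")
    if n < 24:
        return bytes([n])
    for info, width in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * width):
            return bytes([info]) + n.to_bytes(width, "big")
    raise ValueError("value does not fit in 64 bits")


for count in (23, 24, 255, 256):
    print(count, definite_header_len(count), INDEFINITE_OVERHEAD)
for value in (23, 24, 500, 1_000_000):
    print(value, encode_uint(value).hex())
```

By this measure, definite length is one byte smaller for up to 23 items, the two forms tie from 24 to 255 items, and indefinite length is smaller from 256 items upward, so "smallest encoding" alone does not pick a single container form.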
|
|
||||||||
| ### Should the standard be enforced at the ledger level? | ||||||||
|
|
||||||||
| **Enforcing** (ledger validation): | ||||||||
|
|
||||||||
| - Pros: Guarantees compliance; eliminates ambiguity; strongest interoperability guarantee | ||||||||
| - Cons: Breaks backward compatibility; requires hard fork; existing tools and transactions may become invalid; potentially impractical migration burden | ||||||||
|
|
||||||||
| **Not enforcing** (off-chain standard only): | ||||||||
|
|
||||||||
| - Pros: No backward compatibility concerns; existing transactions remain valid; easier initial adoption | ||||||||
| - Cons: Voluntary compliance may be insufficient; fragmentation may persist; no guarantee of universal adoption | ||||||||
|
|
||||||||
| ### How should canonical serialization evolve with hardforks? | ||||||||
|
|
||||||||
| When a hardfork introduces new ledger types or transaction fields, the CBOR encoding for these new structures must be decided. This raises critical workflow questions: | ||||||||
|
|
||||||||
| **Pre-hardfork standardization**: Should the canonical encoding for new ledger types be specified as part of the hardfork proposal itself? This would prevent fragmentation but may slow down hardfork timelines. | ||||||||
|
|
||||||||
| **Implementation sequencing**: Should serialization libraries wait for a canonical standard to be ratified before implementing support for new ledger types? Or should they implement independently and risk creating incompatible encodings? | ||||||||
|
|
||||||||
| **Governance and responsibility**: Who should define the canonical encoding for new types? | ||||||||
|
|
||||||||
| - The team proposing the hardfork (e.g., IOG, Intersect ledger team)? | ||||||||
| - CIP editors through a formal proposal process? | ||||||||
| - Library maintainers through community consensus? | ||||||||
| - A designated standardization working group? | ||||||||
|
|
||||||||
| ## Copyright | ||||||||
|
|
||||||||
| This CPS is licensed under [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode). | ||||||||