137 changes: 137 additions & 0 deletions CPS-????/README.md
---
CPS: ?
Title: Canonical CBOR Serialization Standard
Category: Tools
Status: Open
Authors:
- Hinson Wong <[email protected]>
- Tsz Wai Wu <[email protected]>
Proposed Solutions: []
Discussions:
- https://github.com/cardano-foundation/CIPs/pull/1109
Created: 2025-10-29
License: CC-BY-4.0
---

## Abstract

There is no canonical CBOR serialization standard in Cardano. While this was a deliberate design choice initially, standardizing it has gained popularity in the Cardano developer community, as evidenced by developer meetups such as Cardano Builder Fest 2025 hosted in Vietnam. This CPS outlines the motivation behind the growing concern over fragmented CBOR serialization patterns across the community.

## Problem

<!-- A more elaborate description of the problem and its context. This section should explain what motivates the writing of the CPS document. -->

The Cardano ledger accepts any valid CBOR encoding for transactions and on-chain data. While this flexibility was intentional to encourage ecosystem diversity, it has created significant interoperability challenges as the tooling landscape has matured. The same logical data can be encoded in multiple ways (map key ordering, integer encoding, definite vs. indefinite length, etc.), leading to different byte representations and transaction hashes.
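
To make this concrete, the sketch below (illustrative only; a hand-rolled toy, not any Cardano library) shows three byte sequences that are all valid CBOR encodings of the unsigned integer 10. A deterministic profile such as RFC 8949's would allow only the shortest form; plain CBOR accepts all three, so any hash computed over the bytes differs.

```python
# Three valid CBOR encodings of the integer 10 (major type 0).
# Only the first is the shortest ("canonical") form; generic CBOR
# decoders accept all three, but the bytes -- and thus any hash
# computed over them -- differ.
import hashlib

encodings = {
    "1-byte (shortest form)": bytes([0x0A]),              # value packed into the initial byte
    "2-byte (uint8 arg)":     bytes([0x18, 0x0A]),        # 0x18 = one-byte argument follows
    "3-byte (uint16 arg)":    bytes([0x19, 0x00, 0x0A]),  # 0x19 = two-byte argument follows
}

for label, enc in encodings.items():
    digest = hashlib.blake2b(enc, digest_size=32).hexdigest()
    print(f"{label}: {enc.hex()} -> blake2b-256 {digest[:16]}...")
```

All three decode to the same value, yet the three digests are pairwise distinct.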

### Core Issues

**Transaction Hash Instability**: When a transaction is passed between tools or wallets for signing, each may re-serialize it differently. Since transaction hashes are computed over CBOR bytes, logically identical transactions produce different hashes. This breaks:

- Multi-signature workflows where each signer's wallet may re-serialize the transaction
Contributor:

I see this as a problem in this CPS, since reserialization should never happen for the purpose of signing, and if it does, it is because the tooling is doing it incorrectly. I would consider it a bug in the software regardless of whether canonical or non-canonical serialization is used.
Whenever a payload is given to a program for the purpose of signing, that payload should not be tampered with. The same applies to transactions: if all you need is to sign the transaction, whether for multi-sig or any other purpose, the transaction hash and serialization should not be recomputed; the original bytes must be retained, and that is what needs to be signed.

- Cross-tool transaction building where fee calculations depend on exact byte size
Contributor:

I don't quite follow this one. If there is any change to a transaction body, then the bytes might change, which could affect the fee. That is quite normal and expected.

If you mean that some tooling might want to change some part of the transaction body in a way that in itself should not change the size if canonical CBOR were used (e.g. change a required signer, i.e. swap one hash for another), then I can see it as an argument, but I don't understand why such tooling couldn't just recompute the minimum fee.


Collaborator:

Suggested change
- Hardware wallets, which require the keys in every map to be sorted from lowest value to highest.

This is one I've encountered, which was frustrating to figure out.

Contributor Author:

@Ryun1 I did not include hardware wallets since there is an active standard covering them: https://cips.cardano.org/cip/CIP-0021. Perhaps you are referring to this one.

Contributor:

The biggest issue with ordering is that Haskell (or any other programming language) ordering is not guaranteed to always match the CBOR ordering.

In other words, I suspect enforcing CBOR ordering would make the Ledger implementation a bit more complicated and error-prone.

For example, what is the ordering for a key that is a tuple? Which is bigger, -1 or 10? According to CBOR ordering it would be -1, while in Haskell it would be 10.

Not saying it is impossible, but it is something that needs to be taken into consideration.
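
The -1 vs. 10 example can be checked mechanically. The toy encoder below (a sketch covering only small integers, not a real CBOR library) shows that RFC 8949's deterministic map ordering, which sorts keys by the bytes of their encodings, disagrees with numeric ordering:

```python
# Minimal CBOR integer encoder for the immediate range only (0..23 and -1..-24).
def cbor_int(n: int) -> bytes:
    if 0 <= n <= 23:
        return bytes([n])                # major type 0, value in the initial byte
    if -24 <= n < 0:
        return bytes([0x20 | (-1 - n)])  # major type 1 encodes -1 - n
    raise NotImplementedError("toy encoder: small integers only")

keys = [-1, 10]
numeric_order = sorted(keys)             # [-1, 10] -- Haskell-style Ord
cbor_order = sorted(keys, key=cbor_int)  # [10, -1] -- 0x0a sorts before 0x20
print(numeric_order, cbor_order)
```

Under bytewise key ordering, -1 (encoded as `0x20`) indeed sorts after 10 (encoded as `0x0a`), exactly as the comment states.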

**Script Inconsistencies**: Smart contracts suffer from unpredictable script hashes and reference script mismatches across tools. The same compiled script may produce different hashes depending on the library used to apply parameters or to CBOR-serialize the script.
Contributor:

I don't believe compiled scripts actually use CBOR serialization. @zliu41 will have a definitive answer, but from what I know, Plutus uses the flat library for serialization.

Reply:

This is correct. They're wrapped in a script type defined in the CDDL within a transaction, but the compiled scripts themselves are just flat encoded bytes.

Contributor Author:

Not sure how to describe it more precisely; the issue we faced is that the different uplc libraries in different languages behave differently. I.e. the Aiken uplc in Rust and the HLabs uplc npm package are implemented differently, such that in Mesh users get different script CBOR when using different cores. It is a massive devexp issue.

Contributor:

users get different script cbor if using different cores.

@HinsonSIDAN plutus core is not using CBOR for serialization. So, although this issue looks related, it has nothing to do with CBOR! It has to do with the fact that there is no standard for plutus core serialization at all.

In other words, if you'd like to use canonical or non-canonical CBOR for Plutus core serialization you could create a separate CIP for it, but I suspect that there will be some pushback there as well, since the custom serialization currently in use is likely more efficient than CBOR.


**Development Friction**: Developers face increased testing burden across multiple libraries and wallets, library-specific test fixtures, vendor lock-in risks, and debugging challenges that require logical rather than byte-level comparison.

### Ecosystem Impact

The lack of standardization creates:
Contributor:

CBOR is a standard, so this is not phrased correctly.


- **Security risks**: Hard-to-diagnose bugs and complex audit requirements due to multiple serialization paths
Contributor:

In ledger we deal with serialization all the time, and none of the bugs that we ran into had to do with the fact that serialization is not canonical. I also never found debugging serialization to be that complicated.
Diagnosing logic bugs in ledger rules is an order of magnitude more complicated than serialization, since by definition serialization does not have complicated relationships within large types.
I don't buy that this is a security risk. I believe making a massive change to how we deal with serialization will open the system up to security risks, because such transitions have a huge potential for new bugs!

Contributor Author (@HinsonSIDAN, Dec 12, 2025):

In general I agree with most of the points you mentioned so far, and I agree the standardization should not be built into the ledger, so when selecting the category of this CPS, we decided to make it Tools and not Ledger. Simply introducing a standard that tool builders can follow can also solve the issue.

I also never found debugging serialization to be that complicated.

btw you might not be aware how large the gap in technical knowledge is between engineers on the ledger team and typical app builders 😅 From a tool builder's perspective, aka maintainer of Mesh where I face normal devs every day, fewer than 5% of devs understand how our CBOR system works, yet most of them can build DApps for Cardano.

Contributor:

Simply introducing a standard that tool builders can follow can also solve the issue

I certainly have no say in this domain. If tooling developers in the community believe they can use another standard, I am not here to stop you.

That being said, if you don't believe they can understand the current standard, what makes you believe they will understand a new standard?

- **Community overhead**: High support burden in addressing serialization issues and maintaining multiple strategies
Contributor:

This is only relative to all the tooling that cares about the current era; any tooling, and the node itself, will incur a higher burden because of this CIP, because we have to support the current serialization forever while also introducing support for the new serialization.

A great example is the Byron era, which actually tried to use canonical CBOR and failed. We are now stuck with that logic forever, without the ability to reuse almost any of it!

- **Adoption barriers**: Unpredictable behavior discourages enterprise adoption and increases new developer friction
Contributor:

I don't understand what is unpredictable in CBOR. It is a standard that is 100% predictable.


This problem has become urgent as sophisticated DApps require cross-tool interoperability, multi-signature usage grows, and community feedback (e.g., Cardano Builder Fest 2025) has identified this as a critical pain point.
Contributor:

It will take on the order of years for everyone to switch to canonical CBOR.
The Ledger team already has the whole of its 2026 workload planned out. Even in the optimistic case, it would be sometime in late 2027 that this CIP could hypothetically be implemented, and then a whole lot more time for all the tooling to catch up. So, if this issue has urgency to it, I'd recommend putting more effort into fixing the tooling, rather than rushing dangerous changes like this onto the chain itself.


## Use cases

<!-- A concrete set of examples written from a user's perspective, describing what and why they are trying to do. When they exist, this section should give a sense of the current alternatives and highlight why they are not suitable. -->

**Cross-Library DApp Development**: A DApp developer builds transactions with Lucid in their frontend, but users sign with various wallets built on cardano-serialization-lib or pycardano. Canonical serialization ensures the transaction built equals the transaction signed.
Contributor:

Canonical serialization does not ensure it. It does so only when everyone involved implements canonical serialization correctly.
Considering that tooling cannot correctly implement a standard today, I wouldn't bet on canonical serialization guaranteeing that "the transaction built equals the transaction signed". The only thing that can guarantee it is if the original bytes are signed! And this CIP is actually promoting the opposite direction.


**Script Hash Consistency**: A developer publishes a reference script on-chain, then references it from their off-chain code. Currently, locally computed script hashes may not match the on-chain version due to encoding differences. Canonical serialization guarantees hash consistency across compilation and deployment pipelines.
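
One way this shows up in practice is accidental double-wrapping: a script body serialized into a CBOR byte string once by one library and twice by another. The sketch below is a hypothetical illustration; the `0x02` language tag and the blake2b-224 width follow the common Cardano convention for script hashing, but treat both as assumptions here rather than a normative recipe.

```python
# Hypothetical illustration: the same script body, CBOR-wrapped once vs.
# twice, yields different bytes and therefore a different script hash.
import hashlib

def cbor_bytestring(b: bytes) -> bytes:
    assert len(b) <= 23                  # toy encoder: short strings only
    return bytes([0x40 | len(b)]) + b    # major type 2, length in initial byte

flat_body = b"\x01\x00\x00"              # stand-in for flat-encoded UPLC bytes
single = cbor_bytestring(flat_body)      # wrapped once
double = cbor_bytestring(single)         # accidentally wrapped twice

for script in (single, double):
    # blake2b-224 over (language tag || script bytes) -- assumed convention
    h = hashlib.blake2b(b"\x02" + script, digest_size=28).hexdigest()
    print(script.hex(), "->", h[:16])
```

The two hex strings differ by a single wrapping byte, but the resulting hashes are entirely unrelated, which is why the mismatch is so hard to debug by eye.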

**Library Maintainers**: Serialization library authors currently must support multiple encoding strategies for compatibility. With a standard, they can focus on a single canonical implementation, reducing maintenance burden and improving deserialization reliability.
Contributor:

Again, serialization libraries will have to support both non-canonical CBOR for historical data and canonical CBOR for new data. So, IMHO serialization library authors will be impacted negatively by this CIP.


## Goals

<!-- A list of goals and non-goals a project is pursuing, ranked by importance. These goals should help understand the design space for the solution and what the underlying project is ultimately trying to achieve.
Goals may also contain requirements for the project. For example, they may include anything from a deadline to a budget (in terms of complexity or time) to security concerns.
Finally, goals may also serve as evaluation metrics to assess how good a proposed solution is. -->

### Primary Goals

1. **Establish a canonical CBOR standard**: A CIP that specifies deterministic encoding rules for all Cardano transaction and on-chain data structures, with clear guiding principles for choosing between encoding alternatives.

2. **Achieve ecosystem adoption**: Widespread implementation across major serialization libraries (cardano-serialization-lib, pycardano, Lucid, Aiken, etc.) and wallets (Nami, Eternl, Lace, Yoroi, etc.), ensuring cross-tool interoperability.

3. **Provide implementation guidance**: Comprehensive documentation including test vectors, reference implementations or validation tools, and migration paths for existing tooling.

### Optional Goals

4. **Ledger-level enforcement**: If community consensus supports it, implement validation rules in the ledger to guarantee compliance (requires hardfork and backward compatibility strategy).
Contributor:

Deserialization in Ledger is not part of the Ledger rules. It is a totally separate stage compared to transaction validation.

Suggested change
4. **Ledger-level enforcement**: If community consensus supports it, implement validation rules in the ledger to guarantee compliance (requires hardfork and backward compatibility strategy).
4. **Ledger-level enforcement**: If community consensus supports it, canonical deserialization must be correctly implemented in the ledger to guarantee compliance (requires hardfork, backward compatibility, and forward migration strategies).


### Success Criteria

This CPS is successfully resolved when:

- A canonical CBOR serialization CIP reaches "Active" status with clear specifications
- At least 80% of major libraries and wallets demonstrate compliance
Contributor:

Is there a way to quantify major libraries today?
Also, there is no mention of the Ledger actually having it implemented:

Suggested change
- At least 80% of major libraries and wallets demonstrate compliance
- At least 80% of major libraries and wallets demonstrate compliance
- cardano-node has a hard fork ready for a new Ledger era that changes its deserializers to canonical CBOR.

Contributor Author:

Yes, indeed this CPS is not intended to affect the ledger.

Contributor:

Well, there is a section about it:

Should the standard be enforced at the ledger level?

As I already pointed out, I would not want this CPS to affect Ledger either. I am a bit skeptical about standards that aren't enforced by the chain itself, but there are standards like these that have proved themselves to work. So, at the very least, I would suggest adding all the drawbacks that I've mentioned to the section that asks whether this standard "should be enforced at the ledger level".

- Serialization-related issues in community support channels decrease measurably
- Cross-tool transaction building becomes reliably predictable

### Requirements for Solutions

A good solution must:

- **Technical clarity**: Unambiguous encoding rules for all covered data structures
- **Guiding principles**: Clear rationale for choosing specific encodings (efficiency, simplicity, adoption)
- **Comprehensive scope**: Address transactions, scripts, datums, redeemers, and specify what is out-of-scope
- **Path to Active**: Detailed adoption strategy including timeline, stakeholder coordination, and migration tooling
- **Evolution mechanism**: Process for handling future hardforks and new ledger types
- **Verification**: Test vectors or validation tools to verify implementation compliance
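
As a sketch of what the verification requirement could look like in practice, the snippet below checks a toy encoder against a few hand-written test vectors. Both the vectors and the encoder are invented here for illustration; they are not taken from any published CIP.

```python
# Hypothetical compliance check: run an implementation against hex test vectors.
TEST_VECTORS = [
    # (logical value, expected canonical CBOR encoding as hex)
    (0, "00"),
    (10, "0a"),
    (-1, "20"),
]

def encode_int(n: int) -> bytes:          # the implementation under test (toy)
    if 0 <= n <= 23:
        return bytes([n])
    if -24 <= n < 0:
        return bytes([0x20 | (-1 - n)])
    raise NotImplementedError

for value, expected_hex in TEST_VECTORS:
    actual = encode_int(value).hex()
    assert actual == expected_hex, f"{value}: got {actual}, want {expected_hex}"
print("all vectors pass")
```

A real test-vector suite would cover full transactions, maps, and nested structures, but the shape is the same: published hex bytes, a byte-for-byte comparison, and a hard failure on any mismatch.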

## Open Questions

### What are the guiding principles for choosing the canonical form?
Contributor:

If a custom canonical CBOR standard is to be designed, then there is a much higher chance that it will be implemented incorrectly by some of the tools. Grabbing an existing standard and accepting any potential drawbacks it could have (e.g. non-optimal size) would be a safer bet IMHO.


When multiple valid CBOR encodings exist, how should we decide which becomes canonical?

- **Efficiency**: Minimize transaction size (e.g., smallest integer encoding, definite over indefinite length)
Contributor:

Definite and indefinite length encodings are more or less efficient depending on the number of elements. When there are fewer than 24 elements in an array, definite length encoding is more efficient, while large counts benefit from indefinite length encoding.
So, canonical encoding will make some encodings less efficient.

- **Simplicity**: Choose the most straightforward encoding to implement and verify
- **Existing adoption**: Align with the most widely-used pattern in current tooling (e.g., cardano-serialization-lib as de facto standard)
- **Trade-offs**: How should we balance these potentially conflicting dimensions?
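
The efficiency question interacts with the review comment above about definite vs. indefinite lengths. A quick way to see the trade-off is to compare header overhead; the helper below follows RFC 8949's argument encodings, but it is a sketch, not a full encoder:

```python
# Header overhead in bytes for a CBOR array of n elements (RFC 8949 arguments).
def definite_header_len(n: int) -> int:
    if n <= 23:
        return 1      # length fits in the initial byte (0x80..0x97)
    if n <= 0xFF:
        return 2      # 0x98 + one-byte length
    if n <= 0xFFFF:
        return 3      # 0x99 + two-byte length
    if n <= 0xFFFF_FFFF:
        return 5      # 0x9A + four-byte length
    return 9          # 0x9B + eight-byte length

INDEFINITE_OVERHEAD = 2  # 0x9F start byte + 0xFF "break" byte

for n in (10, 23, 24, 255, 256, 70_000):
    print(n, definite_header_len(n), INDEFINITE_OVERHEAD)
```

Definite length wins below 24 elements, ties from 24 through 255, and costs one to seven extra bytes beyond that, so any "smallest encoding" rule necessarily favors one case over the other.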

### Should the standard be enforced at the ledger level?

**Enforcing** (ledger validation):

- Pros: Guarantees compliance; eliminates ambiguity; strongest interoperability guarantee
- Cons: Breaks backward compatibility; requires hard fork; existing tools and transactions may become invalid; potentially impractical migration burden

**Not enforcing** (off-chain standard only):

- Pros: No backward compatibility concerns; existing transactions remain valid; easier initial adoption
- Cons: Voluntary compliance may be insufficient; fragmentation may persist; no guarantee of universal adoption

### How should canonical serialization evolve with hardforks?

When a hardfork introduces new ledger types or transaction fields, the CBOR encoding for these new structures must be decided. This raises critical workflow questions:

**Pre-hardfork standardization**: Should the canonical encoding for new ledger types be specified as part of the hardfork proposal itself? This would prevent fragmentation but may slow down hardfork timelines.

**Implementation sequencing**: Should serialization libraries wait for a canonical standard to be ratified before implementing support for new ledger types? Or should they implement independently and risk creating incompatible encodings?

**Governance and responsibility**: Who should define the canonical encoding for new types?

- The team proposing the hardfork (e.g., IOG, Intersect ledger team)?
- CIP editors through a formal proposal process?
- Library maintainers through community consensus?
- A designated standardization working group?

## Copyright

This CPS is licensed under [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode).