Leios design: Write introduction and overview chapter #597
docs/leios-design/README.md
Outdated
| > TODO: (re-)introduce the main protocol flow of Leios? | ||
| As was the case for the [Praos variant of Ouroboros](https://ouroboros-network.cardano.intersectmbo.org/pdfs/network-design/network-design.pdf#subsection.5.1), the specification embodied in the published and peer-reviewed [research paper for Ouroboros Leios](https://eprint.iacr.org/2025/1115.pdf) was not intended to be directly implementable. Initial research and development studies confirmed this expectation, identifying several unsolved problems with the fully concurrent block production design when considering the concrete Cardano ledger and what consequences this would have (TODO: cite suitable R&D reports, [Tech Report #2](https://github.com/input-output-hk/ouroboros-leios/blob/main/docs/technical-report-2.md#conflicts-ledger-and-incentives)). | ||
| The design presented in [CIP-164](https://github.com/cardano-scaling/CIPs/blob/leios/CIP-0164/README.md), also known as "Linear Leios", addresses these implementation challenges by focusing on the core insight of better utilizing network and computational resources during the necessary and eponymous "calm periods" of the Praos protocol. This approach provides an immediately implementable design that can deliver orders of magnitude higher throughput while preserving the security guarantees that make Cardano valuable. |
Linear Leios doesn't utilise the network any better than Praos. It uses bursts of traffic, as Praos does, but with even higher bursts than Praos does - from that perspective, it's strictly worse than Praos. So we need to be a bit more careful here. What about:

> Although Linear Leios does not improve the burstiness of the Praos protocol, it will utilise unused bandwidth during long gaps between Praos blocks. For better utilisation of the underlying TCP protocol, a protocol which uses constant high pressure on the network is required (e.g. some form of full Leios).
> Linear Leios doesn't utilise the network any better than Praos. It uses bursts of traffic, as Praos does, but with even higher bursts than Praos does - from that perspective, it's strictly worse than Praos

Hm, interesting take that Leios would have higher bursts than Praos. I'd like to challenge this viewpoint. In Leios we communicate five things: ranking blocks which announce/certify EBs, votes that lead to certificates, transactions submitted by users, EBs, and any "missing transactions".
A certifying ranking block (~10kB) is about 90% smaller than a full Praos block (90kB); a ranking block that does not certify is the same as a Praos block.
Voting on EBs could be considered bursts of network traffic (on the order of ~50kB per round), as votes are created at the same time. However, they are created across the network, so this should even out the load more than the single-source full block one would expect in Praos?
Which leaves us with transaction submission and diffusion of EBs + missing transactions. How much transaction submission impacts network traffic depends on our understanding of the load scenario (whether traffic would be organic, artificial, coming from one or many sources, etc.); it is what it is. Now, under any given high demand (otherwise we'd be using only Praos), Leios will only add the overhead of EBs (~40 bytes per tx) and re-submission of a subset of endorsed transactions (the ones which were not diffused originally during submission). This means that in the average case the overhead is minimal, and in the worst case it boils down to the same burstiness as Praos would have at this load!?

> For better utilisation of the underlying TCP protocol, a protocol which uses constant high pressure on the network is required (e.g. some form of full Leios)

Isn't this the case already due to transaction submission, no matter how consensus is achieved?
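
To make the size comparison above concrete, here is a minimal sketch of the arithmetic using only the figures quoted in this comment. Treating kB as 1000 bytes and the transaction count of 10,000 are assumptions made for illustration; the function names are hypothetical and not part of any Leios code base.

```python
# Figures quoted in the comment above; kB is taken as 1000 bytes (assumption).
PRAOS_BLOCK_BYTES = 90_000        # full Praos block
CERTIFYING_RB_BYTES = 10_000      # ranking block that carries a certificate
EB_OVERHEAD_PER_TX_BYTES = 40     # per-transaction overhead inside an EB


def relative_reduction(old_bytes: int, new_bytes: int) -> float:
    """Fraction by which new_bytes is smaller than old_bytes."""
    return (old_bytes - new_bytes) / old_bytes


def eb_overhead_bytes(tx_count: int) -> int:
    """Extra bytes an EB adds when every referenced transaction is already known."""
    return tx_count * EB_OVERHEAD_PER_TX_BYTES


# A certifying ranking block compared with a full Praos block.
print(f"{relative_reduction(PRAOS_BLOCK_BYTES, CERTIFYING_RB_BYTES):.1%}")

# Hypothetical EB referencing 10,000 transactions (count chosen for illustration).
print(eb_overhead_bytes(10_000))
```

With these inputs the reduction comes out at roughly 89%, consistent with the "about 90%" above, and the EB overhead for the hypothetical 10,000 transactions is 400,000 bytes.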
In the happy case, sure, but we should really be focused on improving the worst-case scenario, which, with Linear Leios, can lead to 12MB of fresh data being downloaded over a short period, with the TCP window closed due to an idle period.
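
A rough back-of-the-envelope sketch of this concern, under strong assumptions that are not from this thread: the congestion window restarts at 10 segments after the idle period, the segment size is 1460 bytes, the window doubles every round trip, and there is no loss and no receive-window or slow-start-threshold cap. The function name is hypothetical.

```python
def slow_start_round_trips(payload_bytes: int,
                           initial_window_segments: int = 10,
                           mss_bytes: int = 1460) -> int:
    """Round trips needed to send payload_bytes when the congestion window
    starts at the initial window and doubles every round trip.

    Idealised model: no loss, no receive-window or ssthresh cap, no pacing.
    """
    window = initial_window_segments * mss_bytes
    sent = 0
    round_trips = 0
    while sent < payload_bytes:
        sent += window      # one full window is delivered per round trip
        window *= 2         # slow start doubles the window each round trip
        round_trips += 1
    return round_trips


# 12MB of fresh data, taken here as 12,000,000 bytes (assumption).
print(slow_start_round_trips(12_000_000))
```

Under this model the 12MB transfer needs 10 round trips before bandwidth even becomes the limiting factor. Real connections will differ, since loss, window caps, and the actual restart behaviour of the TCP stack all change the result.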
I agree with designing for the worst case (that's why I wrote that section in this PR!). The worst case for Leios is not the same as the worst case for Praos though - is it? What is the worst case scenario for EB diffusion?
I see two situations where urgent fetching of big transaction closures is required:
1. A block is (maliciously) produced that announces a full EB with completely unknown transactions
2. A block is produced that certifies a full EB with completely unknown transactions

Case 1 can only affect high throughput, and would only be problematic if a) most stake is acting like this and/or b) it can be caused by a network attacker (e.g. by eclipsing or partitioning the majority of the network).
Only 2. is on the critical path in Leios and needs to happen within worst case
If not, that worst-case scenario is strictly less bad than you paint it. Am I missing something / is my thinking flawed?
Meanwhile I have refined the wording in this paragraph: d9c975d. Is this better?
It can also happen when the internet is partitioned (e.g. a major routing error). This is a genuine scenario which we've encountered at least once before.
> would we only have a single upstream peer with a closed TCP window to reach that honest stake?

After a network partition, all connections to the other side will be new, so this will be true for all peers on different sides of the network partition.
In the partition we encountered, Praos recovered pretty well; Linear Leios should recover similarly if we prioritise Praos over Leios well enough.
Btw, my point wasn't that we're not safe, just that we're not utilising the network fundamentally better. I see this as an incentive, actually, to continue towards full Leios.
| A discrete event simulation implemented in Rust, models Leios message exchanges between nodes, abstracting lower-level details for speed—running orders of magnitude faster than real time to enable statistical analysis over thousands of runs with complete observability and arbitrary adversarial behavior injection. This validates security arguments by systematically exploring protocol behavior under varying loads, expected data diffusion in small to medium sized network topologies, or adversarial scenarios like data withholding, and exploration of protocol parameters before testnet deployment. | ||
| Another Haskell-based simulation using IOSim and the actual network framework used in the `cardano-node`. This reduces model-implementation divergence while enabling studies of the dynamic behavior and resource management in detail. While IOSim is used in the existing network and consensus layers through property-based testing, and extends naturally to Leios components, the simulator built from this was not able to scale to large networks. | ||
I'd like to know what has prevented the simulation from scaling to a large network. Was it io-sim performance, or something else?
coot
left a comment
LGTM
docs/leios-design/README.md
Outdated
| > [!WARNING] | ||
| > TODO: (re-)introduce the main protocol flow of Leios? | ||
| As was the case for the [Praos variant of Ouroboros](https://ouroboros-network.cardano.intersectmbo.org/pdfs/network-design/network-design.pdf#subsection.5.1), the specification embodied in the published and peer-reviewed [research paper for Ouroboros Leios](https://eprint.iacr.org/2025/1115.pdf) was not intended to be directly implementable. Initial research and development studies confirmed this expectation, identifying several unsolved problems with the fully concurrent block production design when considering the concrete Cardano ledger and what consequences this would have (TODO: cite suitable R&D reports, [Tech Report #2](https://github.com/input-output-hk/ouroboros-leios/blob/main/docs/technical-report-2.md#conflicts-ledger-and-incentives)). |
This paragraph is quite heavy. I think we should add at the end that further R&D is required for solving the challenges of full Leios, so we keep the door open for the future.
Co-authored-by: coot <[email protected]>
Adds a chapter that should set the context, explain what is important, and provide plenty of links to past/related work on the network/consensus design of Cardano.
Tagged @coot @pagio and @nfrisby as possible reviewers - you are authors on some of the linked documents and maybe have an opinion here.