From 7c7ddae9f5dbca13fbe3d5abeb1e21d7d7e03e4d Mon Sep 17 00:00:00 2001 From: Daira-Emma Hopwood Date: Mon, 25 Aug 2025 19:23:24 +0100 Subject: [PATCH] [ZIP 1]: Initial work-in-progress draft. Co-Authored-By: Daira-Emma Hopwood Co-Authored-By: Jack Grigg Co-Authored-By: Arya --- zips/zip-0001.md | 265 ++++++++++++++++++++++++++++++++++++++++++++++ zips/zip-0001.rst | 7 -- 2 files changed, 265 insertions(+), 7 deletions(-) create mode 100644 zips/zip-0001.md delete mode 100644 zips/zip-0001.rst diff --git a/zips/zip-0001.md b/zips/zip-0001.md new file mode 100644 index 000000000..6ab0a003e --- /dev/null +++ b/zips/zip-0001.md @@ -0,0 +1,265 @@ + + ZIP: 1 + Title: Network Upgrade Policy and Scheduling + Owners: Kris Nuttycombe + Jack Grigg + Daira-Emma Hopwood + Status: Draft + Category: Process + Created: 2025-08-25 + License: MIT + Discussions-To: + + +# Terminology + +The key words "MUST", "REQUIRED", "MUST NOT", "SHOULD", and "MAY" in this +document are to be interpreted as described in BCP 14[^BCP14] when, and +only when, they appear in all capitals. + +The terms "Mainnet" and "Testnet" in this document are to be interpreted as defined in the Zcash protocol specification[^protocol-networks]. + +The terms "Network Upgrade" and "Activation Height" are to be interpreted as defined in ZIP 200[^zip-0200]. + +The term "End-of-Service Halt Interval" refers to the number of blocks between the approximate release height of a version of the Zcash node software and the block height at which nodes of that version will refuse to operate. + +# Abstract + +This document describes the process of preparing and executing a Zcash Network Upgrade, and the constraints on node implementations that affect network upgrade timelines. + +# Motivation + +The Zcash network performs hard-forking upgrades following the full node coordination mechanism specified in ZIP 200[^zip-0200] when changes are made to the consensus rules. 
For the network to continue operating without interruption at network upgrade boundaries, node implementers must coordinate to ensure that all nodes active on the network at the time of an upgrade are running compatible software. + +Various documents have been written in the past that attempted to provide frameworks or documentation for the network upgrade process at an "engineers working on the Zcash protocol" level: + +- [Network Upgrade Pipeline 1.0](https://electriccoin.co/blog/the-zcash-network-upgrade-pipeline/) +- Network Upgrade Pipeline 1.1 +- [Network Upgrade Pipeline 1.2](https://docs.google.com/drawings/d/1WAvIkVBv_fC4L4wDoAJaMTYVh3dJbwhR5YuP5HgOjFw/edit) + - This was followed for NU3 and NU4. +- [Network Upgrade Pipeline 2.0](https://electriccoin.co/blog/network-upgrade-pipeline-2-0/) + - This was followed for NU5 and (to some extent) NU6. +- [Deadlines for ZIPs that want to be activated in NU7](https://forum.zcashcommunity.com/t/important-deadline-for-zips-that-want-to-be-activated-in-nu7/48998) + +These past attempts have suffered from several issues: + +- The Network Upgrade Pipelines (NUPs) were designed and used at a time when there was either a single feature being deployed in a network upgrade, or a single entity (usually the Electric Coin Company) developing all of the NU's features. As such, and despite best efforts, they were not well-suited to parallelization and they did not accurately specify all of the constraints on timing. +- The NUPs were generally too optimistic about timing, especially for complicated upgrades. + +If node implementers follow the steps specified by this ZIP and adhere to the required timelines, the Zcash network can maintain uninterrupted continuity of service. + + +# Requirements + +An engineer working towards a Zcash network upgrade who follows this ZIP should not miss any important steps. 
+ +# Non-requirements + +This ZIP does not cover any governance issues, and it does not define any process for how candidate ZIPs should be selected for inclusion in a network upgrade beyond technical constraints. + +# Specification + +TODO: +- How network protocol versions are chosen. +- Audit requirements for upgrades. +- Specification requirements: + - version group ID if needed; + - protocol spec changes; + - differences between Testnet and Mainnet; + - peer protocol versions. +- What happens if there is a security issue or other problem that necessitates a change to the NU-supporting release? + - before Testnet activation; + - after Testnet activation & before release of node software that sets the Mainnet activation height; + - after the release of node software that sets the Mainnet activation height & before Mainnet activation; + - after Mainnet activation. + +## Constraints on the timing of a Network Upgrade + +The hardest and most restrictive preconditions for a Network Upgrade occur close to its activation. As such, it is easiest to work backwards from a desired "active on Mainnet" state to determine the minimum intervals between the various stages of NU development and deployment. + +Testnet and Mainnet activation heights for a Network Upgrade MUST be set so as to satisfy the constraints described below. + +### End-Of-Service Halts + +Zcash network nodes MUST implement an End-of-Service Halt mechanism, whereby a node automatically shuts down and refuses to operate once it detects that the current Mainnet block height is greater than its End-of-Service height. + +Nodes SHOULD reach their End-of-Service halt **129024 blocks** (the "End-of-Service Halt Interval", approximately 16 weeks) after their approximate release height. The approximate release height is defined as the current block height at the time that the release tag is published, $\pm$ 500 blocks. 
+ +New versions of each node implementation SHOULD be released approximately every **48384 blocks** (approximately 6 weeks). + +All nodes that do not support the network upgrade MUST have reached their End-of-Service halt height at least **8064 blocks** (approximately one week) prior to the network upgrade's activation height, and SHOULD have reached their End-of-Service halt height at least **16128 blocks** (approximately two weeks) prior to the activation height. This ensures that entities who do not upgrade their node software until the End-of-Service halt occurs have time to upgrade before the activation height is reached. + +
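The block counts above can be sanity-checked with a short sketch. It assumes a Mainnet target block spacing of 75 seconds, which this section does not state, giving 8064 blocks per week. All function and constant names are hypothetical and are not taken from `zcashd` or `zebrad`.

```python
# Illustrative sketch only; names are hypothetical, not from zcashd or zebrad.
TARGET_SPACING_SECS = 75  # assumed Mainnet target block spacing
BLOCKS_PER_WEEK = 7 * 24 * 60 * 60 // TARGET_SPACING_SECS  # 8064

EOS_HALT_INTERVAL = 16 * BLOCKS_PER_WEEK      # 129024 blocks
RELEASE_INTERVAL = 6 * BLOCKS_PER_WEEK        # 48384 blocks
MIN_EOS_MARGIN = 1 * BLOCKS_PER_WEEK          # 8064 blocks (the MUST margin)
RECOMMENDED_EOS_MARGIN = 2 * BLOCKS_PER_WEEK  # 16128 blocks (the SHOULD margin)


def eos_halt_height(release_height: int) -> int:
    """End-of-Service halt height for a release tagged near release_height."""
    return release_height + EOS_HALT_INTERVAL


def check_unsupporting_release(release_height: int, activation_height: int) -> str:
    """Classify a release that does NOT support the upgrade, by how far
    before the activation height it reaches its End-of-Service halt."""
    margin = activation_height - eos_halt_height(release_height)
    if margin >= RECOMMENDED_EOS_MARGIN:
        return "ok"
    if margin >= MIN_EOS_MARGIN:
        return "meets MUST only"
    return "violates MUST"
```

For example, a release tagged at height 3,000,000 halts at 3,129,024, so under these assumptions an activation height of at least 3,137,088 satisfies the MUST margin and at least 3,145,152 satisfies the SHOULD margin.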
+ + +#### Rationale + + +For the health and stability of the network, it is critical that no unintended chain fork occurs at the block height at which a network upgrade activates. To prevent such a fork, all nodes of the network (but in particular nodes that perform block construction) must correctly implement and enforce the new consensus rules that become active in the activation block. + +The End-of-Service halt mechanism has been implemented in both `zcashd` and `zebrad` to ensure that incautious users of the node software do not accidentally cause a chain fork by running the wrong consensus rules against the block at the activation height and subsequent blocks. This protection supplements the protections provided by the network using a new consensus branch ID (as specified in ZIP 200[^zip-0200]), and the nodes of the network communicating using a new peer protocol version identifier for network messages. + +The End-of-Service Halt Interval is chosen to be approximately 16 weeks to strike a balance between the frequency with which node operators are required to upgrade their software (which is desired to be as infrequent as possible), and the frequency with which it is possible to make network upgrades (which is desired to be as frequent as possible). The End-of-Service Halt Interval sets a lower bound on the time between network upgrades, and hence an upper bound on their frequency. + +The approximate 6-week release interval is intended to ensure that there are always at least two working versions of a given node implementation in service at a given time. This allows for a 10-week window after the release of a new version of the node software during which a critical bug in such a release can be mitigated simply by running the previous version of the node software. This also means that after any release, entities in the community have a 10-week window in which to upgrade, and an entity that chooses to upgrade only to every other release has a 4-week window in which to upgrade. +
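The 10-week and 4-week windows follow directly from the two intervals. A minimal sketch of the arithmetic, with hypothetical helper names and the intervals expressed in weeks rather than blocks:

```python
# Hypothetical helpers; the constants restate the intervals given in this section.
RELEASE_INTERVAL_WEEKS = 6
EOS_HALT_INTERVAL_WEEKS = 16


def upgrade_window_weeks(releases_behind: int) -> int:
    """Weeks between a new release and the End-of-Service halt of the
    version that is `releases_behind` releases older. A negative result
    means that older version has already halted."""
    return EOS_HALT_INTERVAL_WEEKS - releases_behind * RELEASE_INTERVAL_WEEKS


def min_versions_in_service() -> int:
    """Minimum number of releases that have not yet halted at any time,
    assuming releases arrive exactly on schedule."""
    return EOS_HALT_INTERVAL_WEEKS // RELEASE_INTERVAL_WEEKS
```

Here `upgrade_window_weeks(1)` gives the 10-week window for an entity running the previous release, and `upgrade_window_weeks(2)` gives the 4-week window for an entity that skips every other release. A version three releases behind has a negative window, which is why skipping two consecutive releases is not viable under this schedule.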
+ +### Mainnet activation window + +The Mainnet activation height of a network upgrade MUST be set at least **10 weeks** prior to its expected activation date (assuming a consistent block production rate). This height SHOULD be decided by unanimous agreement among the various full node developers, to ensure that all full node implementations are able to release compatible software updates with sufficient upgrade time. + +A node implementation that supports the network upgrade MUST be released at least **4 weeks** prior to that date, and SHOULD be released at the same time that the activation height is set. + +These timings depend on the current 16-week End-of-Service Halt Interval implemented by zebrad and zcashd; they should be recalculated if the full node End-of-Service policy changes. + +
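The week-based minimums above can be converted into block heights when choosing an activation height. This is a sketch under the assumption of a consistent 75-second target block spacing (8064 blocks per week); the function names are hypothetical, and real block production rates vary, so results are estimates.

```python
# Illustrative sketch only; assumes 75-second target spacing on Mainnet.
BLOCKS_PER_WEEK = 8064

MIN_SET_TO_ACTIVATION_WEEKS = 10     # activation height set at least this early
MIN_RELEASE_TO_ACTIVATION_WEEKS = 4  # supporting release at least this early


def earliest_activation_height(height_when_set: int) -> int:
    """Lowest activation height permitted if the decision is made at
    height_when_set."""
    return height_when_set + MIN_SET_TO_ACTIVATION_WEEKS * BLOCKS_PER_WEEK


def latest_supporting_release_height(activation_height: int) -> int:
    """Latest approximate release height for a node version that supports
    the upgrade."""
    return activation_height - MIN_RELEASE_TO_ACTIVATION_WEEKS * BLOCKS_PER_WEEK
```

For example, if the activation height is agreed at height 3,000,000, the earliest permitted activation height is 3,080,640, and a supporting node release would need to be tagged by roughly height 3,048,384.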
+ + +#### Rationale + + +It takes time for people to deploy software updates, especially when they need to be coordinated between multiple organizations. + +For Zcash Network Upgrades in particular, deploying a full node update is not just upgrading the `zebrad` dependency in a stack: it can also mean making changes to internal transaction parser libraries or other private codebases. As such, network upgrade schedules need to allow sufficient time between the consensus rules being finalized and the consensus rules activating on the network. Per ZIP 200, the point at which a set of consensus rules (defined by a particular `CONSENSUS_BRANCH_ID`) is finalized is when an activation height is defined. + +The 10-week minimum window comes from two places: + +- The `zcashd` release flow and End-of-Service policy in early Zcash years: a release every 6 weeks and an End-of-Service policy of 16 weeks meant that there were at least 10 weeks between the release of version $X$ and the End-of-Service of version $X-1$. +- 10 weeks is a little over 2 months. In practice, network upgrades have gone less smoothly when fewer than 3 months elapsed between setting the Mainnet activation height (and releasing compatible node software) and activation of the NU. +
+ +### Audits + +TODO + +Implementations of network upgrades SHOULD be audited and SHOULD address any critical issues reported by auditors before setting a Mainnet activation height. + +#### Auditor selection + +#### Specification auditing + +#### Implementation auditing + +
+ + +#### Rationale + + +Auditors generally impose several requirements. +
+ +## Timeline + + +### Specification + +Specification of the consensus rules to be activated by a network upgrade MUST be complete before third-party audits of consensus node software implementing those specifications can commence. In the process of specification, the consensus branch ID for the network upgrade MUST be chosen in conformance with ZIP 200[^zip-0200]. + +#### Deployment ZIP + +A Deployment ZIP for the network upgrade MUST be created that specifies the ZIPs that collectively define the consensus rules that will hold once the network upgrade activates, and the consensus branch ID that corresponds to that set of rules. This Deployment ZIP may only transition to `Proposed` status once all of the ZIPs it deploys have transitioned to `Proposed`, in accordance with ZIP 0[^zip-0000]. + +- TODO: when are global constants relevant to a network upgrade (apart from the consensus branch ID) set, specifically with respect to parameters that do not affect the operation of the consensus rules as opposed to which consensus rules are enforced. + +The consensus rules must be stable prior to the specifications transitioning to `Proposed`. +- TODO: define "stable". +- TODO: This is (or should be) duplicative of the definition of Proposed in ZIP 0; once [clarified](https://github.com/zcash/zips/issues/1078) we should reference that instead. + +
+ + +#### ZIP status transitions + + +The lifecycle that ZIP 0[^zip-0000] describes for Consensus ZIPs included in a network upgrade is roughly: + +- Transition individual ZIPs to Proposed as they stabilise +- Transition individual ZIPs to Implemented (as relevant) +- Transition the deployment ZIP to Proposed (once the NU contents are fixed and thus the meaning of the consensus branch ID is stable) along with any remaining individual ZIPs +- Transition remaining ZIPs to Implemented (as relevant) +- Set Testnet activation height (and amend deployment ZIP) +- Activate on Testnet and do testing + - If a bug is found, remove activation heights and change consensus branch ID +- Roughly simultaneously, set Mainnet activation height and release implementations with Mainnet activation height set +- Wait for activation +- Transition all ZIPs to Final + +
+ +### Implementation + +Implementation and specification work may proceed in parallel, insofar as is feasible. + +### Auditing + +Third-party audits of consensus node software MUST be complete prior to setting the Mainnet activation height for the network upgrade. + +### Testing + +Node software implementing the consensus changes for a network upgrade MUST activate successfully on Testnet prior to the Mainnet activation height being set. Such node software SHOULD activate and be thoroughly tested on a private fork of Testnet prior to the Testnet activation height being set. + +
+ + +#### How to run Private Test Networks + + +A private test network can be set up in the following manner: + +- Select a new random consensus branch ID `private_branch_id`. +- Select an activation height `h` that is a short time ahead of the current Testnet chain tip. + - Testnet often runs at double the speed of Mainnet; plan for that when deciding how long it will take to complete the rest of the pre-activation setup. +- Create a private fork of the node software and `librustzcash` that: + - Changes the consensus branch ID to `private_branch_id`. + - Sets the Testnet activation height to `h`. + - Bumps the P2P network protocol version to the Testnet protocol version. + - Changes the client subversion name (useful for debugging). +- Prepare several machines / operators to participate in the test. + - At least two machines should have sufficient CPU power to participate in mining. +- Ensure that all machines will be able to connect to at least one other machine within the private test group. There are two options that can work here: + - Have at least one machine running with a fixed public IP address (e.g. on a VPS). + - Use Tailscale to enable direct connections between machines alongside the regular test network: + - Install Tailscale on each machine, logging into either a common shared account, or individual accounts. + - If using individual accounts, share at least one machine within Tailscale to the other Tailscale accounts. + - Ensure that each shared machine has ACLs configured to permit access on the port that the node software is listening on. +- Build and run the private testnet node software. + - If the above has taken too long and the activation height has already passed, go back to the start and select a new `private_branch_id` and activation height. + - The private testnet nodes should connect to public Testnet nodes, in addition to other private testnet nodes. +- Wait for activation. 
+ +What should happen is that the private testnet nodes: +- reject block `h` from Testnet. +- get stuck at block `h - 1` waiting for the network upgrade to activate. +- disconnect from any connected public Testnet nodes (either in advance depending on the node software, or immediately after block `h` is mined on Testnet and is rejected by the private testnet nodes). + +At this point: +- Start mining +- Exercise the new consensus rules via wallet software connected to the private testnet. [zcash-devtool](https://github.com/zcash/zcash-devtool) is good for this. + +See the [Private Testnet Testing](https://zebra.zfnd.org/dev/private-testnet.html?highlight=private#private-testnet-test) section of the Zebra book for how to set up Zebra nodes in a private Testnet. +
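The first two setup steps (choosing `private_branch_id` and `h`) could be sketched as follows. This is an illustration under stated assumptions, not tooling from `zcashd`, `zebrad`, or `librustzcash`: the function names are hypothetical, the branch ID is assumed to be a 32-bit value, the caller is assumed to supply the set of branch IDs already in use, and the 75-second spacing and factor of two for Testnet speed are planning estimates.

```python
import math
import secrets


def pick_private_branch_id(in_use: set[int]) -> int:
    """Pick a random nonzero 32-bit branch ID that is not in `in_use`
    (a caller-supplied set of IDs already used by real or earlier
    private upgrades)."""
    while True:
        candidate = secrets.randbits(32)
        if candidate != 0 and candidate not in in_use:
            return candidate


def pick_activation_height(
    current_height: int,
    setup_minutes: float,
    target_spacing_secs: float = 75.0,  # assumed target block spacing
    speed_factor: float = 2.0,          # Testnet often runs about twice as fast
) -> int:
    """Estimate an activation height `h` that leaves `setup_minutes` of
    wall-clock time for the rest of the pre-activation setup."""
    blocks = math.ceil(setup_minutes * 60 * speed_factor / target_spacing_secs)
    return current_height + blocks
```

With these defaults, allowing one hour of setup from a Testnet tip at height 3,000,000 gives `h = 3_000_096`. If setup overruns and `h` has already passed, both values are simply chosen again, as described in the steps above.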
+ +#### Test Network Activation + +A release of the node software that supports the Network Upgrade on Testnet but not on Mainnet MUST be made at least **16128 blocks** (approximately two weeks) prior to the Testnet activation height. Such a release MUST set its End-of-Service halt height such that nodes running this release version reach their End-of-Service halt prior to Mainnet activation, respecting the constraints described in [End-Of-Service Halts] above. Such a Testnet release MUST correctly implement the preexisting Mainnet consensus rules, and be fully usable as a Mainnet consensus and/or mining node. + +- TODO: are there constraints on release heights of testnet-supporting versions related to avoiding testnet chain forks? IIRC we haven't required that existing releases EOS-halt before the testnet activation height, because we don't set testnet EOS halt heights? +- ^ There are checkpoints that would be updated in the second release that supports the network upgrade on Testnet, but there's no EoS halt on Testnet. + +### Partner adoption + +# References + +[^BCP14]: [Information on BCP 14: "RFC 2119: Key words for use in RFCs to Indicate Requirement Levels" and "RFC 8174: Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words"](https://www.rfc-editor.org/info/bcp14) + +[^protocol]: [Zcash Protocol Specification, Version 2024.5.1 or later](protocol/protocol.pdf) + +[^protocol-networks]: [Zcash Protocol Specification, Version 2024.5.1. 
Section 3.12: Mainnet and Testnet](protocol/protocol.pdf#networks) + +[^zip-0000]: [ZIP 0: ZIP Process](zip-0000.rst) + +[^zip-0200]: [ZIP 200: Network Upgrade Mechanism](zip-0200.rst) + +[^zip-0201]: [ZIP 201: Network Peer Management for Overwinter](zip-0201.rst) + diff --git a/zips/zip-0001.rst b/zips/zip-0001.rst deleted file mode 100644 index 35aec9ee6..000000000 --- a/zips/zip-0001.rst +++ /dev/null @@ -1,7 +0,0 @@ -:: - - ZIP: 1 - Title: Network Upgrade Policy and Scheduling - Status: Reserved - Category: Consensus Process - Discussions-To: