Commit 9a63e50

Merge pull request #209 from libp2p/rfc/0001-cid-peerid
RFC 0001: Text Peer Ids as CIDs
2 parents 762aa74 + b621ac5 commit 9a63e50

2 files changed

+119
-15
lines changed

RFC/0001-text-peerid-cid.md

+67
@@ -0,0 +1,67 @@
1+
- Start Date: 2019-08-15
2+
- Related issues: [go-ipfs/issues/5287](https://github.com/ipfs/go-ipfs/issues/5287), [multicodec/issues/130](https://github.com/multiformats/multicodec/issues/130), [go-libp2p-core/pull/41](https://github.com/libp2p/go-libp2p-core/pull/41)
3+
4+
# RFC 0001: Text Peer Ids as CIDs
5+
6+
## Abstract
7+
8+
This RFC modifies the Peer Id spec to change the default string representation
9+
from Multihash to CIDv1 in Base32 and to support encoding/decoding text Peer Ids as CIDs.
10+
11+
[ipld-cid-spec]: https://github.com/ipld/cid
12+
13+
## Motivation
14+
15+
1. Current text representation of Peer Id ([multihash][multihash] in [Base58btc][base58btc]) is case-sensitive.
16+
This means we can't use it in case-insensitive contexts such as domain names ([RFC1035][rfc1035] + [RFC1123][rfc1123]) or [FAT][fat] filesystems.
17+
2. [CIDs][ipld-cid-spec] provide [multibase][multibase] support, and `base32`
18+
makes a [safe default][cidv1b32-move] that will work in case-insensitive contexts,
19+
enabling us to put Peer Ids [in domains][cid-in-subdomains] or create files with Peer Ids as names.
20+
3. It's much easier to upgrade wire protocols than text.
21+
This RFC makes Peer Ids in text form fully self describing, making them more future-proof.
22+
A dedicated [multicodec][multicodec] in text-encoded CID will indicate that [it's a hash of a libp2p public key][libp2p-key-multicodec].
23+
24+
[rfc1035]: http://tools.ietf.org/html/rfc1035
25+
[rfc1123]: https://tools.ietf.org/html/rfc1123
26+
[multibase]: https://github.com/multiformats/multibase/
27+
[multicodec]: https://github.com/multiformats/multicodec
28+
[multihash]: https://github.com/multiformats/multihash
29+
[cid-in-subdomains]: https://github.com/ipfs/in-web-browsers/issues/89
30+
[libp2p-key-multicodec]: https://github.com/multiformats/multicodec/issues/130
31+
[cidv1b32-move]: https://github.com/ipfs/ipfs/issues/337
32+
[base58btc]: https://en.bitcoinwiki.org/wiki/Base58#Alphabet_Base58
33+
[fat]: https://en.wikipedia.org/wiki/Design_of_the_FAT_file_system
34+
35+
## Detailed design
36+
37+
1. Switch text encoding and decoding of Peer Ids from Multihash to [CID][ipld-cid-spec].
38+
2. The new text representation should be CIDv1 with additional requirements:
39+
- MUST have [multicodec][multicodec] set to `libp2p-key` (`0x72`)
40+
- SHOULD have [multibase][multibase] set to `base32` (Base32 without padding, as specified by [RFC4648][rfc4648])
41+
42+
[rfc4648]: https://tools.ietf.org/html/rfc4648
43+
44+
### Upgrade path
45+
46+
1. Release support for reading Peer Id represented with CIDv1
47+
2. Wait three months or until the next release (whichever comes first)
48+
3. Switch the default Peer Id output format to CIDv1 in Base32
49+
50+
### Backward compatibility
51+
52+
The old text representation (Multihash encoded as [`base58btc`][base58btc])
53+
is a valid CIDv0 and does not require any special handling.
54+
55+
[base58btc]: https://en.bitcoinwiki.org/wiki/Base58#Alphabet_Base58
56+
57+
## Alternatives
58+
59+
We could just add a [multibase][multibase] prefix to multihash, but that requires more work and introduces a new format.
60+
This option was rejected as using CID enables reuse of existing serializers/deserializers and does not create any new standards.
61+
62+
## Unresolved questions
63+
64+
This RFC defers the question of representing Peer Ids as CIDs on the wire; we can revisit it if it ever becomes relevant.
65+
66+
[go-libp2p-core-41]: https://github.com/libp2p/go-libp2p-core/pull/41
67+
[libp2p-specs-111]: https://github.com/libp2p/specs/issues/111

peer-ids/peer-ids.md

+52-15
@@ -2,10 +2,9 @@
22

33
| Lifecycle Stage | Maturity Level | Status | Latest Revision |
44
|-----------------|----------------|--------|-----------------|
5-
| 3A | Recommendation | Active | r0, 2019-05-23 |
5+
| 3A | Recommendation | Active | r1, 2019-08-15 |
66

7-
8-
**Authors**: [@mgoelzer], [@yusefnapora]
7+
**Authors**: [@mgoelzer], [@yusefnapora], [@lidel]
98

109
**Interest Group**: [@raulk], [@vyzo], [@Stebalien]
1110

@@ -14,6 +13,7 @@
1413
[@raulk]: https://github.com/raulk
1514
[@vyzo]: https://github.com/vyzo
1615
[@Stebalien]: https://github.com/Stebalien
16+
[@lidel]: https://github.com/lidel
1717

1818
See the [lifecycle document](../00-framework-01-spec-lifecycle.md) for context
1919
about maturity level and spec status.
@@ -53,7 +53,7 @@ Key encodings and message signing semantics are
5353

5454
## Keys
5555

56-
Our key pairs are wrapped in a [simple protobuf](https://github.com/libp2p/go-libp2p-crypto/blob/master/pb/crypto.proto),
56+
Our key pairs are wrapped in a [simple protobuf](https://github.com/libp2p/go-libp2p-crypto/blob/master/pb/crypto.proto),
5757
defined using the [Protobuf version 2 syntax](https://developers.google.com/protocol-buffers/docs/proto):
5858

5959
```protobuf
@@ -107,7 +107,7 @@ Here is the process by which we generate peer ids based on the public component
107107
3. Serialize the protobuf containing the public key into bytes using the [canonical protobuf encoding](https://developers.google.com/protocol-buffers/docs/encoding).
108108
4. If the length of the serialized bytes is <= 42, then we compute the "identity" multihash of the serialized bytes. In other words, no hashing is performed, but the [multihash format is still followed](https://github.com/multiformats/multihash) (byte plus varint plus serialized bytes). The idea here is that if the serialized byte array is short enough, we can fit it in a multihash verbatim without having to condense it using a hash function.
109109
5. If the length is >42, then we hash it using the SHA256 multihash.
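
Steps 4 and 5 above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function and constant names are hypothetical, and the input is assumed to be the already-serialized `PublicKey` protobuf from step 3. The multihash codes `0x00` (identity) and `0x12` (SHA2-256) come from the multihash table rather than from this document.

```python
import hashlib

# Multihash function codes (from the multihash table, not defined in this spec).
IDENTITY_CODE = 0x00
SHA2_256_CODE = 0x12

# Threshold from step 4: keys at or below this size are embedded verbatim.
MAX_INLINE_KEY_LENGTH = 42


def peer_id_multihash(serialized_pubkey: bytes) -> bytes:
    """Hypothetical helper: build the Peer Id multihash from serialized key bytes."""
    if len(serialized_pubkey) <= MAX_INLINE_KEY_LENGTH:
        # "identity" multihash: no hashing, the key bytes are the digest.
        code, digest = IDENTITY_CODE, serialized_pubkey
    else:
        code, digest = SHA2_256_CODE, hashlib.sha256(serialized_pubkey).digest()
    # The digest length is a varint; both possible lengths here (<= 42 or 32)
    # are below 128, so the varint is a single byte.
    return bytes([code, len(digest)]) + digest
```

A 36-byte input therefore yields a 38-byte multihash beginning with `0x00 0x24`, while any input longer than 42 bytes yields a 34-byte multihash beginning with `0x12 0x20`.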
110-
110+
111111
### Note about deterministic encoding
112112

113113
Deterministic encoding of the `PublicKey` message is desirable, as it ensures
@@ -131,16 +131,54 @@ behavior.
131131

132132
### String representation
133133

134-
Peer Ids are multihashes, and they are often encoded into strings.
135-
The canonical string representation of a Peer Id is a base58 encoding with
136-
[the alphabet used by bitcoin](https://en.bitcoinwiki.org/wiki/Base58#Alphabet_Base58).
137-
This encoding is sometimes abbreviated as `base58btc`.
134+
Peer Ids are [multihashes][multihash] canonically represented with [CIDs](https://github.com/ipld/cid) when encoded into strings.
135+
136+
Encoding and decoding of the string representation MUST follow the [CID specification][cid-decoding].
137+
138+
Implementations parsing IDs from text MUST support both base58 CIDv0 and CIDv1 in base32, and they MUST generate base32-encoded CIDv1 by default. Generating CIDv0 is allowed as an opt-in (behind a flag).
139+
140+
CIDv0 is a multihash encoded in Base58.
141+
CIDv1 is a multihash with a prefix that specifies things like base encoding, cid version and the type of data behind it:
142+
143+
```
144+
<cidv1> ::= <multibase><cid-version><multicodec><multihash>
145+
```
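
As an illustration of the grammar above, the sketch below assembles a base32 CIDv1 string from a Peer Id multihash. The helper name is hypothetical. The `libp2p-key` code `0x72` is taken from RFC 0001, and `b` is the multibase prefix for lowercase, unpadded base32. The version and multicodec are varints, which fit in one byte each for these values.

```python
import base64
import hashlib

CID_VERSION_1 = 0x01
LIBP2P_KEY_CODEC = 0x72   # `libp2p-key` multicodec, per RFC 0001
BASE32_PREFIX = "b"       # multibase prefix for lowercase base32 without padding


def peer_id_to_cidv1(multihash: bytes) -> str:
    """Hypothetical helper: <multibase><cid-version><multicodec><multihash>."""
    raw = bytes([CID_VERSION_1, LIBP2P_KEY_CODEC]) + multihash
    # RFC 4648 base32, lowercased, with the '=' padding stripped.
    body = base64.b32encode(raw).decode("ascii").rstrip("=").lower()
    return BASE32_PREFIX + body


# Example input: a SHA2-256 multihash of some placeholder bytes.
example_multihash = bytes([0x12, 0x20]) + hashlib.sha256(b"placeholder key").digest()
example_cid = peer_id_to_cidv1(example_multihash)
```

Because the first four bytes are fixed for SHA2-256 Peer Ids (`0x01 0x72 0x12 0x20`), every such string starts with `bafzbei` and is 59 characters long.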
146+
147+
#### libp2p-key CID
148+
149+
The canonical string representation of a Peer Id is a CID v1
150+
with `base32` [multibase][multibase] ([RFC4648](https://tools.ietf.org/html/rfc4648), without padding) and `libp2p-key` [multicodec][multicodec]:
138151

139-
An example of a `base58btc` encoded SHA256 peer id: `QmYyQSo1c1Ym7orWxLYvCrM2EmxFTANf8wXmmE7DWjhx5N`.
152+
| multibase | cid version | multicodec |
153+
| --------- | ----------- | ------------ |
154+
| `base32` | `1` | `libp2p-key` |
140155

141-
Note that some projects using libp2p will prefix "base encoded" strings with a
142-
[multibase](https://github.com/multiformats/multibase) code that identifies the encoding base and alphabet.
143-
Peer ids do not use multibase, and can be assumed to be encoded as `base58btc`.
156+
- `libp2p-key` multicodec is mandatory when serializing to text (ensures Peer Id is self-describing)
157+
- `base32` is the default multibase encoding: projects are free to use a different one if it is more suited to their needs
158+
159+
##### Decoding string representation
160+
161+
To decode a CID, use the following algorithm:
162+
163+
- If it is 46 characters long and starts with `Qm...`, it's a CIDv0. Decode it as base58btc multihash.
164+
- Otherwise, decode it according to the multibase and [CID spec][cid-decoding].
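
The two branches above might look like the following sketch. It is deliberately partial: the helper names are hypothetical, only the default `base32` multibase (prefix `b`) is handled in the CIDv1 branch, and the version and multicodec are read as single bytes, which holds for `1` and `0x72` but not for varints in general. A full implementation would follow the CID decoding algorithm.

```python
import base64

BASE58BTC_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc_decode(text: str) -> bytes:
    """Decode a base58btc string; raises ValueError on characters outside the alphabet."""
    number = 0
    for char in text:
        number = number * 58 + BASE58BTC_ALPHABET.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    # Each leading '1' character encodes one leading zero byte.
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body


def decode_peer_id(text: str) -> bytes:
    """Hypothetical helper: return the multihash bytes of a text Peer Id."""
    if len(text) == 46 and text.startswith("Qm"):
        # CIDv0: a bare base58btc multihash with no prefix.
        return base58btc_decode(text)
    if not text.startswith("b"):
        raise ValueError("this sketch only handles the base32 ('b') multibase")
    body = text[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding that base32 CIDs omit
    raw = base64.b32decode(body)
    if raw[0] != 0x01 or raw[1] != 0x72:
        raise ValueError("expected CIDv1 with the libp2p-key multicodec")
    return raw[2:]
```

Both example strings in this section should pass through `decode_peer_id` and produce a 34-byte SHA2-256 multihash.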
165+
166+
167+
Examples:
168+
169+
- SHA256 Peer Id encoded as canonical [CIDv1][cid-versions]:
170+
`bafzbeie5745rpv2m6tjyuugywy4d5ewrqgqqhfnf445he3omzpjbx5xqxe` ([inspect](http://cid.ipfs.io/#bafzbeie5745rpv2m6tjyuugywy4d5ewrqgqqhfnf445he3omzpjbx5xqxe))
171+
- Peer Ids that do not start with a valid multibase prefix are assumed to be legacy [CIDv0][cid-versions]
172+
(a multihash with implicit [`base58btc`][base58btc] encoding, without any prefix).
173+
An example of the same Peer Id as a legacy CIDv0: `QmYyQSo1c1Ym7orWxLYvCrM2EmxFTANf8wXmmE7DWjhx5N`
174+
175+
176+
[multihash]: https://github.com/multiformats/multihash
177+
[multicodec]: https://github.com/multiformats/multicodec
178+
[multibase]: https://github.com/multiformats/multibase
179+
[base58btc]: https://en.bitcoinwiki.org/wiki/Base58#Alphabet_Base58
180+
[cid-decoding]: https://github.com/multiformats/cid#decoding-algorithm
181+
[cid-versions]: https://github.com/multiformats/cid#versions
144182

145183
## How Keys are Encoded and Messages Signed
146184

@@ -152,7 +190,7 @@ Four key types are supported:
152190

153191
Implementations MUST support RSA and Ed25519. Implementations MAY support Secp256k1 and ECDSA, but nodes using those keys may not be able to connect to all other nodes.
154192

155-
In all cases, implementation MAY allow the user to enable/disable specific key types via configuration.
193+
In all cases, implementation MAY allow the user to enable/disable specific key types via configuration.
156194
Note that disabling support for compulsory key types may hinder connectivity.
157195

158196
Keys are encoded into byte arrays and serialized into the `Data` field of the
@@ -204,4 +242,3 @@ We encode the public key using ASN.1 DER.
204242
We encode the private key using DER-encoded PKIX.
205243

206244
To sign a message, we hash the message with SHA 256, and then sign it with the [ECDSA standard algorithm](https://tools.ietf.org/html/rfc6979), then we encode it using [DER-encoded ASN.1.](https://wiki.openssl.org/index.php/DER)
207-
