Commit 3806057

Merge pull request #65 from multiformats/feat/base36_spec
Base36 byte-encoding specification
2 parents 54f897a + f378d34 commit 3806057

9 files changed: +78 -20 lines changed

Diff for: README.md

+15 -11
@@ -8,7 +8,7 @@
 > Self identifying base encodings

 Multibase is a protocol for disambiguating the encoding of base-encoded (e.g.,
-base32, base64, base58, etc.) binary appearing in text.
+base32, base36, base64, base58, etc.) binary appearing in text.

 When text is encoded as bytes, we can usually use a one-size-fits-all encoding
 (UTF-8) because we're always encoding to the same set of 256 bytes (+/- the NUL
@@ -63,17 +63,19 @@ base8, 7, octal,
 base10, 9, decimal, draft
 base16, f, hexadecimal, default
 base16upper, F, hexadecimal, default
-base32hex, v, rfc4648 no padding - highest char, candidate
-base32hexupper, V, rfc4648 no padding - highest char, candidate
-base32hexpad, t, rfc4648 with padding, candidate
-base32hexpadupper, T, rfc4648 with padding, candidate
-base32, b, rfc4648 no padding, default
-base32upper, B, rfc4648 no padding, default
-base32pad, c, rfc4648 with padding, candidate
-base32padupper, C, rfc4648 with padding, candidate
+base32hex, v, rfc4648 case-insensitive - no padding - highest char, candidate
+base32hexupper, V, rfc4648 case-insensitive - no padding - highest char, candidate
+base32hexpad, t, rfc4648 case-insensitive - with padding, candidate
+base32hexpadupper, T, rfc4648 case-insensitive - with padding, candidate
+base32, b, rfc4648 case-insensitive - no padding, default
+base32upper, B, rfc4648 case-insensitive - no padding, default
+base32pad, c, rfc4648 case-insensitive - with padding, candidate
+base32padupper, C, rfc4648 case-insensitive - with padding, candidate
 base32z, h, z-base-32 (used by Tahoe-LAFS), draft
-base58flickr, Z, base58 flicker, candidate
+base36, k, base36 [0-9a-z] case-insensitive - no padding, draft
+base36upper, K, base36 [0-9a-z] case-insensitive - no padding, draft
 base58btc, z, base58 bitcoin, default
+base58flickr, Z, base58 flicker, candidate
 base64, m, rfc4648 no padding, default
 base64pad, M, rfc4648 with padding - MIME encoding, candidate
 base64url, u, rfc4648 no padding, default
@@ -107,6 +109,7 @@ Consider the following encodings of the same binary string:
 ```
 4D756C74696261736520697320617765736F6D6521205C6F2F # base16 (hex)
 JV2WY5DJMJQXGZJANFZSAYLXMVZW63LFEEQFY3ZP # base32
+3IY8QKL64VUGCX009XWUHKF6GBBTS3TVRXFRA5R # base36
 YAjKoNbau5KiqmHPmSxYCvn66dA1vLmwbt # base58
 TXVsdGliYXNlIGlzIGF3ZXNvbWUhIFxvLw== # base64
 ```
@@ -116,11 +119,12 @@ And consider the same encodings with their multibase prefix
 ```
 F4D756C74696261736520697320617765736F6D6521205C6F2F # base16 F
 BJV2WY5DJMJQXGZJANFZSAYLXMVZW63LFEEQFY3ZP # base32 B
+K3IY8QKL64VUGCX009XWUHKF6GBBTS3TVRXFRA5R # base36 K
 zYAjKoNbau5KiqmHPmSxYCvn66dA1vLmwbt # base58 z
 MTXVsdGliYXNlIGlzIGF3ZXNvbWUhIFxvLw== # base64 M
 ```

-The base prefixes used are: `F, B, z, M`.
+The base prefixes used are: `F, B, K, z, M`.


 ## FAQ
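
The prefixed examples in the README hunk above lend themselves to a small illustration of prefix dispatch. The following is only a sketch, not code from this repository: `decode_prefixed` is a hypothetical helper, it handles just the `F`, `B`, `K`/`k`, and `M` prefixes shown in the example (base58 is left out because the Python standard library has no codec for it), and the base36 branch leans on Python's built-in `int(text, 36)`, which is more lenient than a strict decoder would be.

```python
import base64


def decode_prefixed(text: str) -> bytes:
    """Decode a multibase string by dispatching on its first character.

    Hypothetical helper covering only the prefixes F, B, K/k and M.
    """
    prefix, body = text[:1], text[1:]
    if prefix == "F":
        # base16upper: the stdlib decoder expects uppercase hex by default.
        return base64.b16decode(body)
    if prefix == "B":
        # base32upper without padding: restore '=' padding for the stdlib.
        return base64.b32decode(body + "=" * (-len(body) % 8))
    if prefix in ("k", "K"):
        # base36: leading '0' characters map to leading 0x00 bytes and the
        # remainder is a big-endian unsigned integer.
        digits = body.lstrip("0")
        zeros = len(body) - len(digits)
        # Assumption: int(..., 36) is case-insensitive but also tolerates
        # signs and underscores, so a strict decoder should validate first.
        n = int(digits, 36) if digits else 0
        return b"\x00" * zeros + n.to_bytes((n.bit_length() + 7) // 8, "big")
    if prefix == "M":
        # base64pad: standard alphabet with padding.
        return base64.b64decode(body)
    raise ValueError(f"unsupported multibase prefix: {prefix!r}")


if __name__ == "__main__":
    for sample in (
        "F4D756C74696261736520697320617765736F6D6521205C6F2F",
        "BJV2WY5DJMJQXGZJANFZSAYLXMVZW63LFEEQFY3ZP",
        "K3IY8QKL64VUGCX009XWUHKF6GBBTS3TVRXFRA5R",
        "MTXVsdGliYXNlIGlzIGF3ZXNvbWUhIFxvLw==",
    ):
        print(sample[0], decode_prefixed(sample))
```

Each of the four README strings should decode to the same byte string, which is the point of the prefix: the reader does not need out-of-band knowledge of the base.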

Diff for: multibase.csv

+11 -9
@@ -5,17 +5,19 @@ base8, 7, octal,
 base10, 9, decimal, draft
 base16, f, hexadecimal, default
 base16upper, F, hexadecimal, default
-base32hex, v, rfc4648 no padding - highest char, candidate
-base32hexupper, V, rfc4648 no padding - highest char, candidate
-base32hexpad, t, rfc4648 with padding, candidate
-base32hexpadupper, T, rfc4648 with padding, candidate
-base32, b, rfc4648 no padding, default
-base32upper, B, rfc4648 no padding, default
-base32pad, c, rfc4648 with padding, candidate
-base32padupper, C, rfc4648 with padding, candidate
+base32hex, v, rfc4648 case-insensitive - no padding - highest char, candidate
+base32hexupper, V, rfc4648 case-insensitive - no padding - highest char, candidate
+base32hexpad, t, rfc4648 case-insensitive - with padding, candidate
+base32hexpadupper, T, rfc4648 case-insensitive - with padding, candidate
+base32, b, rfc4648 case-insensitive - no padding, default
+base32upper, B, rfc4648 case-insensitive - no padding, default
+base32pad, c, rfc4648 case-insensitive - with padding, candidate
+base32padupper, C, rfc4648 case-insensitive - with padding, candidate
 base32z, h, z-base-32 (used by Tahoe-LAFS), draft
-base58flickr, Z, base58 flicker, candidate
+base36, k, base36 [0-9a-z] case-insensitive - no padding, draft
+base36upper, K, base36 [0-9a-z] case-insensitive - no padding, draft
 base58btc, z, base58 bitcoin, default
+base58flickr, Z, base58 flicker, candidate
 base64, m, rfc4648 no padding, default
 base64pad, M, rfc4648 with padding - MIME encoding, candidate
 base64url, u, rfc4648 no padding, default

Diff for: rfcs/Base36.md

+40
@@ -0,0 +1,40 @@
+# Base36
+
+The multibase base36 prefix is the character `k` or `K`. The digit-alphabet
+consists of 0..9 and then the case insensitive range a..z for the values 10..35
+
+## Encoding
+
+A byte array is encoded to base36 by:
+
+1. Counting the number of leading 0 bytes (Z).
+2. Interpreting the rest of the byte array as a big-endian unsigned integer (N).
+3. Concatenating a length Z string of '0' characters with the base36
+representation of N.
+
+A byte array is encoded to multibase base36 by prefixing its base36 encoding
+with the character `k`.
+
+## Decoding
+
+A multibase base36 encoded string is decoded by first dropping the multibase
+prefix (which must be `k` or `K`).
+
+The remaining characters are then converted to a byte array by:
+
+1. Counting the number of leading '0' characters (Z).
+2. Interpreting the rest of the character sequence as a base36 unsigned integer
+(N).
+3. Concatenating a length Z array of NULL (0x00) bytes with N encoded as a
+big-endian unsigned integer.
+
+## Examples
+
+Byte Array <-> Base36 Multibase:
+
+| Bytes | == | LC Base36 | OR | UC Base36 |
+|---|---|---|---|---|
+| `[0x00, 0x01]` | == | `"k01"` | | `"K01"` |
+| `[0x00, 0x00, 0xff]` | == | `"k0073"` | | `"K0073"` |
+| `[0x01, 0x00]` | == | `"k74"` | | `"K74"` |
+| `[0x00, 0x01, 0x00]` | == | `"k074"` | | `"K074"` |
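
The numbered encoding and decoding steps in `rfcs/Base36.md` above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the reference implementation: the function names and the `upper` flag are hypothetical, and the sketch assumes that an input consisting only of zero bytes contributes an empty representation of N (so `[0x00, 0x00]` becomes `"00"` after the prefix), which the spec text does not spell out.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def base36_encode(data: bytes) -> str:
    """Encode bytes as lowercase base36, without the multibase prefix."""
    # Step 1: count the leading 0x00 bytes (Z).
    stripped = data.lstrip(b"\x00")
    zeros = len(data) - len(stripped)
    # Step 2: interpret the rest as a big-endian unsigned integer (N).
    n = int.from_bytes(stripped, "big")
    # Step 3: Z '0' characters followed by the base36 digits of N.
    # Assumption: N == 0 (nothing left after the zeros) yields no digits.
    digits = []
    while n > 0:
        n, rem = divmod(n, 36)
        digits.append(ALPHABET[rem])
    return "0" * zeros + "".join(reversed(digits))


def base36_decode(text: str) -> bytes:
    """Decode base36 text (either case), without the multibase prefix."""
    text = text.lower()
    # Step 1: count the leading '0' characters (Z).
    stripped = text.lstrip("0")
    zeros = len(text) - len(stripped)
    # Step 2: interpret the rest as a base36 unsigned integer (N).
    n = 0
    for ch in stripped:
        # str.index raises ValueError for characters outside the alphabet.
        n = n * 36 + ALPHABET.index(ch)
    # Step 3: Z 0x00 bytes followed by N as big-endian bytes.
    return b"\x00" * zeros + n.to_bytes((n.bit_length() + 7) // 8, "big")


def multibase_base36_encode(data: bytes, upper: bool = False) -> str:
    """Prefix the base36 encoding with 'k' (or 'K' for the uppercase form)."""
    body = base36_encode(data)
    return "K" + body.upper() if upper else "k" + body


def multibase_base36_decode(text: str) -> bytes:
    """Drop the 'k' or 'K' prefix and decode the remaining characters."""
    if text[:1] not in ("k", "K"):
        raise ValueError("expected multibase prefix 'k' or 'K'")
    return base36_decode(text[1:])


if __name__ == "__main__":
    # Rows from the Examples table in the spec.
    print(multibase_base36_encode(bytes([0x00, 0x00, 0xFF])))       # k0073
    print(multibase_base36_encode(bytes([0x01, 0x00]), upper=True))  # K74
    print(multibase_base36_decode("K074"))
```

Counting the leading zeros separately matters because the integer interpretation discards them: `[0x01, 0x00]` and `[0x00, 0x01, 0x00]` both have N = 256 = 7·36 + 4, and only the extra `0` character tells them apart.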

Diff for: tests/test1.csv

+2
@@ -13,6 +13,8 @@ base32padupper, "CIRSWGZLOORZGC3DJPJSSAZLWMVZHS5DINFXGOIJB"
 base32hexpad, "t8him6pbeehp62r39f9ii0pbmclp7it38d5n6e891"
 base32hexpadupper, "T8HIM6PBEEHP62R39F9II0PBMCLP7IT38D5N6E891"
 base32z, "het1sg3mqqt3gn5djxj11y3msci3817depfzgqejb"
+base36, "k343ixo7d49hqj1ium15pgy1wzww5fxrid21td7l"
+base36upper, "K343IXO7D49HQJ1IUM15PGY1WZWW5FXRID21TD7L"
 base58flickr, "Ztwe7gVTeK8wswS1gf8hrgAua9fcw9reboD"
 base58btc, "zUXE7GvtEk8XTXs1GF8HSGbVA9FCX9SEBPe"
 base64, "mRGVjZW50cmFsaXplIGV2ZXJ5dGhpbmchIQ"

Diff for: tests/test2.csv

+2
@@ -13,6 +13,8 @@ base32padupper, "CPFSXGIDNMFXGSIBB"
 base32hexpad, "tf5in683dc5n6i811"
 base32hexpadupper, "TF5IN683DC5N6I811"
 base32z, "hxf1zgedpcfzg1ebb"
+base36, "k2lcpzo5yikidynfl"
+base36upper, "K2LCPZO5YIKIDYNFL"
 base58flickr, "Z7Pznk19XTTzBtx"
 base58btc, "z7paNL19xttacUY"
 base64, "meWVzIG1hbmkgIQ"

Diff for: tests/test3.csv

+2
@@ -13,6 +13,8 @@ base32padupper, "CNBSWY3DPEB3W64TMMQ======"
 base32hexpad, "td1imor3f41rmusjccg======"
 base32hexpadupper, "TD1IMOR3F41RMUSJCCG======"
 base32z, "hpb1sa5dxrb5s6hucco"
+base36, "kfuvrsivvnfrbjwajo"
+base36upper, "KFUVRSIVVNFRBJWAJO"
 base58flickr, "ZrTu1dk6cWsRYjYu"
 base58btc, "zStV1DL6CwTryKyV"
 base64, "maGVsbG8gd29ybGQ"

Diff for: tests/test4.csv

+2
@@ -13,6 +13,8 @@ base32padupper, "CAB4WK4ZANVQW42JAEE======"
 base32hexpad, "t01smasp0dlgmsq9044======"
 base32hexpadupper, "T01SMASP0DLGMSQ9044======"
 base32z, "hybhskh3ypiosh4jyrr"
+base36, "k02lcpzo5yikidynfl"
+base36upper, "K02LCPZO5YIKIDYNFL"
 base58flickr, "Z17Pznk19XTTzBtx"
 base58btc, "z17paNL19xttacUY"
 base64, "mAHllcyBtYW5pICE"

Diff for: tests/test5.csv

+2
@@ -13,6 +13,8 @@ base32padupper, "CAAAHSZLTEBWWC3TJEAQQ===="
 base32hexpad, "t0007ipbj41mm2rj940gg===="
 base32hexpadupper, "T0007IPBJ41MM2RJ940GG===="
 base32z, "hyyy813murbssn5ujryoo"
+base36, "k002lcpzo5yikidynfl"
+base36upper, "K002LCPZO5YIKIDYNFL"
 base58flickr, "Z117Pznk19XTTzBtx"
 base58btc, "z117paNL19xttacUY"
 base64, "mAAB5ZXMgbWFuaSAh"

Diff for: tests/test6.csv

+2
@@ -9,3 +9,5 @@ base32pad, "cnbswy3dpeB3W64TMMQ======"
 base32padupper, "Cnbswy3dpeB3W64TMMQ======"
 base32hexpad, "td1imor3f41RMUSJCCG======"
 base32hexpadupper, "Td1imor3f41RMUSJCCG======"
+base36, "kfUvrsIvVnfRbjWaJo"
+base36upper, "KfUVrSIVVnFRbJWAJo"
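
One way to sanity-check the new `base36` and `base36upper` rows is to recover each payload from the `base64` row of the same test file and re-encode it. The sketch below does that for tests 2 through 5 and then compares the mixed-case rows of test 6 case-insensitively. All names (`to_base36`, `payload_from_base64_row`, `ROWS`, `MIXED_CASE_ROWS`, `check_rows`) are hypothetical, the vectors are copied by hand from the diffs above rather than read from the CSV files, and pairing the test 6 rows with the test 3 payload is an assumption based on their matching characters.

```python
import base64

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_base36(data: bytes) -> str:
    """Lowercase base36 body (no prefix), preserving leading zero bytes."""
    stripped = data.lstrip(b"\x00")
    n = int.from_bytes(stripped, "big")
    digits = ""
    while n:
        n, rem = divmod(n, 36)
        digits = ALPHABET[rem] + digits
    return "0" * (len(data) - len(stripped)) + digits


def payload_from_base64_row(row: str) -> bytes:
    """Decode an unpadded multibase base64 row (prefix 'm') to raw bytes."""
    body = row[1:]
    return base64.b64decode(body + "=" * (-len(body) % 4))


# (base64 row, base36 row, base36upper row), hand-copied from tests 2 to 5.
ROWS = [
    ("meWVzIG1hbmkgIQ", "k2lcpzo5yikidynfl", "K2LCPZO5YIKIDYNFL"),
    ("maGVsbG8gd29ybGQ", "kfuvrsivvnfrbjwajo", "KFUVRSIVVNFRBJWAJO"),
    ("mAHllcyBtYW5pICE", "k02lcpzo5yikidynfl", "K02LCPZO5YIKIDYNFL"),
    ("mAAB5ZXMgbWFuaSAh", "k002lcpzo5yikidynfl", "K002LCPZO5YIKIDYNFL"),
]

# Mixed-case rows from test6.csv; assumed to carry the test3.csv payload.
MIXED_CASE_ROWS = ["kfUvrsIvVnfRbjWaJo", "KfUVrSIVVnFRbJWAJo"]


def check_rows() -> bool:
    """Return True if every hand-copied row matches a fresh encoding."""
    for b64_row, lower_row, upper_row in ROWS:
        body = to_base36(payload_from_base64_row(b64_row))
        if "k" + body != lower_row or "K" + body.upper() != upper_row:
            return False
    # Case-insensitivity: the body of each mixed-case row must equal the
    # lowercase encoding once its own case is folded.
    expected = to_base36(payload_from_base64_row("maGVsbG8gd29ybGQ"))
    return all(row[1:].lower() == expected for row in MIXED_CASE_ROWS)


if __name__ == "__main__":
    print(check_rows())
```

Tests 4 and 5 differ from test 2 only by one and two leading `0x00` bytes, which is why their base36 rows differ only by the extra `0` characters after the prefix.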
