Skip to content

Commit 1766ab0

Browse files
authored
Merge pull request #72 from ipfs/rsrch-cid-as-struct-wrapped-str
cid implementation variations++
2 parents 5ddbe21 + 924534b commit 1766ab0

File tree

1 file changed

+25
-1
lines changed

1 file changed

+25
-1
lines changed

_rsrch/cidiface/README.md

+25-1
Original file line numberDiff line numberDiff line change
@@ -27,6 +27,7 @@ There are quite a few different ways to go:
2727
- Option C: CIDs as an interface with multiple implementors.
2828
- Option D: CIDs as a struct; multihash also as a struct or string.
2929
- Option E: CIDs as a struct; content as strings plus offsets.
30+
- Option F: CIDs as a struct wrapping only a string.
3031

3132
The current approach on the master branch is Option A.
3233

@@ -42,6 +43,10 @@ the binary format of the cid internally, and so could yield it again without
4243
malloc, while still potentially having faster access to components than
4344
Option B since it wouldn't need to re-parse varints to access later fields.
4445

46+
Option F is actually a varation of Option B; it's distinctive from the other
47+
struct options because it is proposing *literally* `struct{ x string }` as
48+
the type, with no additional fields for components nor offsets.
49+
4550
Option C is the avoid-choices choice, but note that interfaces are not free;
4651
since "minimize mallocs" is one of our major goals, we cannot use interfaces
4752
whimsically.
@@ -72,6 +77,7 @@ When using `*Cid`, the nil value is a clear sentinel for 'invalid';
7277
when using `type Cid string`, the zero value is a clear sentinel;
7378
when using `type Cid struct` per Option A or D... the only valid check is
7479
for a nil multihash field, since version=0 and codec=0 are both valid values.
80+
When using `type Cid struct{string}` per Option F, zero is a clear sentinel.
7581

7682
### usability as a map key is important
7783

@@ -82,6 +88,7 @@ We already covered this in the criteria section, but for clarity:
8288
- Option C: ~ (caveats, and depends on concrete impl)
8389
- Option D: ✔
8490
- Option E: ✔
91+
- Option F: ✔
8592

8693
### living without offsets requires parsing
8794

@@ -106,7 +113,7 @@ How much this overhead is significant is hard to say from microbenchmarking;
106113
it depends largely on usage patterns. If these traversals are a significant
107114
timesink, it would be an argument for Option D/E.
108115
If these traversals are *not* a significant timesink, we might be wiser
109-
to keep to Option B, because keeping a struct full of offsets will add several
116+
to keep to Option B/F, because keeping a struct full of offsets will add several
110117
words of memory usage per CID, and we keep a *lot* of CIDs.
111118

112119
### interfaces cause boxing which is a significant performance cost
@@ -129,6 +136,23 @@ if a situation is at the scale where it's become important to mind whether
129136
or not pointers are a performance impact, then that situation also
130137
is one where you have to think twice before using interfaces.
131138

139+
### struct wrappers can be used in place of typedefs with zero overhead
140+
141+
See `TestSizeOf`.
142+
143+
Using the `unsafe.Sizeof` feature to inspect what the Go runtime thinks,
144+
we can see that `type Foo string` and `type Foo struct{x string}` consume
145+
precisely the same amount of memory.
146+
147+
This is interesting because it means we can choose between either
148+
type definition with no significant overhead anywhere we use it:
149+
thus, we can choose freely between Option B and Option F based on which
150+
we feel is more pleasant to work with.
151+
152+
Option F (a struct wrapper) means we can prevent casting into our Cid type.
153+
Option B (typedef string) can be declared a `const`.
154+
Are there any other concerns that would separate the two choices?
155+
132156
### one way or another: let's get rid of that star
133157

134158
We should switch completely to handling `Cid` and remove `*Cid` completely.

0 commit comments

Comments
 (0)