Contributor

@rchincha rchincha commented Feb 6, 2025

Motivations for this PR:

  1. BLAKE3 is a high-performance hash, and there is growing community interest in it
  2. BLAKE3 supports variable-length output, but this registration mandates 256-bit output

Contributor

@sudo-bmitch sudo-bmitch left a comment

We should specify that implementations "MAY" support the algorithm, and specify the encoded value regexp, similar to the sha512 definition.

@rchincha rchincha force-pushed the blake3 branch 2 times, most recently from 2733aae to a2ce39a Compare February 7, 2025 05:51
Member

@tianon tianon left a comment

A minor typo fix (wrong number of bits in the regex 🙈), a little whitespace pedanticism (that I'm hoping @sudo-bmitch will confirm or reject/deny), and what can probably/hopefully just be a discussion of the URL to link to (not necessarily requesting any change there).

Overall the change looks good and I'm +1; thanks for taking a stab!

@rchincha rchincha force-pushed the blake3 branch 2 times, most recently from b20ba52 to 139e306 Compare February 13, 2025 03:41
Contributor

@sudo-bmitch sudo-bmitch left a comment

Minor white-space nit to align paragraphs with other registered algorithms. Otherwise LGTM.

sudo-bmitch previously approved these changes Feb 14, 2025
When the _algorithm identifier_ is `sha512`, the _encoded_ portion MUST match `/[a-f0-9]{128}/`.
Note that `[A-F]` MUST NOT be used here.
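
As a rough illustration of the sha512 rule quoted above, here is a minimal Python sketch. The helper name `is_valid_sha512_encoded` is hypothetical, and using `fullmatch` (so that the whole encoded portion must match) is an assumption of this sketch rather than something the quoted text spells out.

```python
import hashlib
import re

# The pattern from the spec text above: exactly 128 lowercase hex characters.
SHA512_ENCODED = re.compile(r"[a-f0-9]{128}")


def is_valid_sha512_encoded(encoded: str) -> bool:
    """Return True if `encoded` is a well-formed sha512 encoded portion.

    fullmatch is used so that longer strings or strings with trailing
    characters are rejected (an assumption of this sketch).
    """
    return SHA512_ENCODED.fullmatch(encoded) is not None


encoded = hashlib.sha512(b"example content").hexdigest()
print(is_valid_sha512_encoded(encoded))          # lowercase hex digest
print(is_valid_sha512_encoded(encoded.upper()))  # same digest using [A-F]
```

The second call shows why the `[A-F]` note matters: the uppercase form represents the same bytes but is not an acceptable encoding.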

#### BLAKE3
Contributor

In addition to the new section, you'll want a new entry in the registered-algorithms table up around line 140, right?

Contributor Author

Updated

descriptor.md Outdated
[rfc7230-s2.7]: https://tools.ietf.org/html/rfc7230#section-2.7
[sha256-vs-sha512]: https://groups.google.com/a/opencontainers.org/forum/#!topic/dev/hsMw7cAwrZE
[iana]: https://www.iana.org/assignments/media-types/media-types.xhtml
[blake3]: https://github.com/C2SP/C2SP/blob/main/BLAKE3.md
Contributor

Spinning out of this earlier thread:

Linking to either https://github.com/BLAKE3-team/BLAKE3-specs or https://github.com/C2SP/C2SP/blob/main/BLAKE3.md would make sense to me.

I'm not an approver, so feel free to ignore me, but having a versioned spec link to a floating document makes me a bit concerned that image-spec v1.2.0 (or whichever image-spec release eventually ships this registration) becomes a moving target. Can we pin a specific version of the BLAKE3 spec, e.g. via their BLAKE3/v1.0.0 tag with https://github.com/C2SP/C2SP/blob/BLAKE3/v1.0.0/BLAKE3.md ? It currently matches the content of that file on main. If they tweak the file in the future, e.g. with a BLAKE3/v1.1.0 or dev work in preparation for such a release, the change wouldn't get retroactively sucked into image-spec v1.2.0; pulling in the new BLAKE3 changes would take an explicit decision by image-spec maintainers and implementors.

Contributor Author

Updated

Motivations for this PR:
1. BLAKE3 is a high-performance hash and there is growing community
   interest
2. BLAKE3 supports variable-length output, but this registration
   mandates 256-bit output

Signed-off-by: Ramkumar Chinchani <[email protected]>

[BLAKE3][blake3] is a high performance, highly parallelizable, collision-resistant hash function which [is more performant][blake3-vs-sha2] than
[SHA-256][rfc4634-s4.1].
The hash output length MUST be 256 bits.
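
To make the 256-bit requirement concrete: 256 bits is 32 bytes, or 64 hex characters. The sketch below derives a length check from the bit count, by analogy with the sha512 rule. The exact regex in the merged text is not shown in this thread, so the pattern here is an assumption, and the helper names and the sample digest are hypothetical (the sample is a placeholder string, not a real BLAKE3 output).

```python
import re

BLAKE3_OUTPUT_BITS = 256  # mandated by the registration text above


def hex_length(bits: int) -> int:
    """Number of hex characters needed to encode `bits` bits (4 bits each)."""
    if bits % 4 != 0:
        raise ValueError("bit length must be a multiple of 4")
    return bits // 4


# Assumed pattern: lowercase hex only, mirroring the sha512 rule.
BLAKE3_ENCODED = re.compile(r"[a-f0-9]{%d}" % hex_length(BLAKE3_OUTPUT_BITS))


def is_valid_blake3_digest(digest: str) -> bool:
    """Check an `algorithm:encoded` string for the blake3 identifier and length."""
    algorithm, sep, encoded = digest.partition(":")
    if sep != ":" or algorithm != "blake3":
        return False
    return BLAKE3_ENCODED.fullmatch(encoded) is not None


print(hex_length(BLAKE3_OUTPUT_BITS))                 # 64
print(is_valid_blake3_digest("blake3:" + "ab" * 32))  # placeholder, right length
print(is_valid_blake3_digest("blake3:" + "ab" * 64))  # 512-bit output, rejected
```

Because BLAKE3 can emit outputs of other lengths, a length check like this is what keeps a longer or shorter output from being accepted under the same identifier.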
Contributor

I'm not familiar with BLAKE3, but going through the spec, it seems like we might also need to specify the hashing mode? Or is it sufficiently obvious from the context that there's no provision for supplying a key or other input, making the keyed_hash and derive_key modes infeasible?

Member

IMO it's pretty obvious that this is and has to be unkeyed (as you said, there's no way to specify a key).

Contributor Author

@rchincha rchincha Feb 27, 2025

Similar to sha1 or sha2, right?

The purpose of the hash here is content-addressability (with collision-resistance guarantees).

You can still build a keyed hash out of these, e.g. with HMAC: https://datatracker.ietf.org/doc/html/rfc2104
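
For readers unfamiliar with RFC 2104, here is a minimal sketch of how a keyed hash can be layered on top of an unkeyed one. It uses SHA-256 from Python's standard library as the underlying hash (BLAKE3 is not in the standard library), and the key and message values are hypothetical. The hand-written construction is compared against the standard `hmac` module as a sanity check.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 processes input in 64-byte blocks


def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC per RFC 2104: H((K ^ opad) || H((K ^ ipad) || message))."""
    if len(key) > BLOCK_SIZE:
        # Keys longer than one block are hashed first.
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")  # pad the key with zeros to one block
    inner_pad = bytes(b ^ 0x36 for b in key)
    outer_pad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(inner_pad + message).digest()
    return hashlib.sha256(outer_pad + inner).digest()


key = b"hypothetical-secret"
message = b"layer contents"
tag = hmac_sha256(key, message)
# Compare with the standard library implementation of the same construction.
print(hmac.compare_digest(tag, hmac.new(key, message, hashlib.sha256).digest()))
```

The point for this thread is that the key lives outside the digest string: the registered algorithm itself stays unkeyed, and anything keyed is built around it.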

[rfc7230-s2.7]: https://tools.ietf.org/html/rfc7230#section-2.7
[sha256-vs-sha512]: https://groups.google.com/a/opencontainers.org/forum/#!topic/dev/hsMw7cAwrZE
[iana]: https://www.iana.org/assignments/media-types/media-types.xhtml
[blake3]: https://github.com/C2SP/C2SP/blob/BLAKE3/v1.0.0/BLAKE3.md
Member

@sajayantony sajayantony Feb 27, 2025

Is there an RFC we can point to? Also, I'm not sure, but can an expert here articulate the difference between https://www.ietf.org/archive/id/draft-aumasson-blake3-00.html and the GitHub link above?

Contributor Author

As per the commentary, the draft RFC was submitted for the sake of process but was deemed ineffective/unnecessary. So, as far as I know, this repo may be the source of truth.

@tianon tianon merged commit 64294bd into opencontainers:main Apr 24, 2025
4 checks passed
rchincha added a commit to rchincha/go-digest that referenced this pull request Apr 25, 2025
rchincha added a commit to rchincha/go-digest that referenced this pull request Apr 25, 2025
rchincha added a commit to project-zot/go-digest that referenced this pull request May 1, 2025
* digest: promote blake3 to first-class digest

The dual module approach for blake3 was slightly awkward. Since it
provides similar usability with a massive bump in performance, it's
extremely likely to land as a registered algorithm in the image-spec.

This PR removes the secondary module, which made it difficult to test as
a unit. This may break users who are using HEAD versions of the package.
For a new release, this will be backwards compatible. The other drawback
is that the zeebo/blake3 will now be a dependency but this can be
replaced transparently by the standard library in the future.

In addition to promoting blake3, this makes a few style adjustments to
be in line with Go's style guidelines.

Signed-off-by: Stephen Day <[email protected]>

* fix: update stevvooe's blake3 PR

opencontainers#66
opencontainers/image-spec#1240
Signed-off-by: Ramkumar Chinchani <[email protected]>

* fix: add a length test for blake3

Signed-off-by: Ramkumar Chinchani <[email protected]>

* fix: add a Makefile

make
make build
make test

Signed-off-by: Ramkumar Chinchani <[email protected]>

* fix: merge conflicts

Signed-off-by: Ramkumar Chinchani <[email protected]>

* fix: blake3 pulls in golang 1.22.x dep

2025-04-25T19:33:20.2495501Z go: downloading github.com/zeebo/blake3 v0.2.4
2025-04-25T19:33:20.3154128Z go: downloading github.com/klauspost/cpuid/v2 v2.2.10
2025-04-25T19:33:20.3959610Z github.com/klauspost/cpuid/v2: cannot compile Go 1.22 code
2025-04-25T19:33:21.1059807Z FAIL github.com/opencontainers/go-digest [build failed]
2025-04-25T19:33:21.1060449Z FAIL github.com/opencontainers/go-digest/digestset [build failed]
2025-04-25T19:33:21.1219422Z ##[error]Process completed with exit code 1.
Signed-off-by: Ramkumar Chinchani <[email protected]>

---------

Signed-off-by: Stephen Day <[email protected]>
Signed-off-by: Ramkumar Chinchani <[email protected]>
Co-authored-by: Stephen Day <[email protected]>
Member

tianon commented May 1, 2025

@xnox noted in opencontainers/distribution-spec#574 (comment):

> Thus one can use Blake3 hash for hash addressed content - and effectively just a streaming CRC. But cryptographic verification would have to use a different hash for signing.

This is a really interesting point and tangentially I think is grounds to consider reverting this PR (unless this was meant specifically in the context of FIPS, not in general, which is the sense I got in my own research on the topic and does seem consistent with the rest of the comment).

xnox commented May 1, 2025

> @xnox noted in opencontainers/distribution-spec#574 (comment):
>
> > Thus one can use Blake3 hash for hash addressed content - and effectively just a streaming CRC. But cryptographic verification would have to use a different hash for signing.
>
> This is a really interesting point and tangentially I think is grounds to consider reverting this PR (unless this was meant specifically in the context of FIPS, not in general, which is the sense I got in my own research on the topic and does seem consistent with the rest of the comment).

Unfortunately, BLAKE (the predecessor of BLAKE2 and BLAKE3) lost the NIST SHA-3 competition. Blake3 is much better, but is unlikely to ever have wide adoption in cryptographic libraries and verification. Xxhash on the other hand is a clear winner otherwise without need for cryptographic protection.

The synergies for cryptographic purposes are around SHA-2 today, and Shake is the future (due to sensible security properties, and flexibility w.r.t. customized, authenticated, ordered, and parallel hashing properties). For non-cryptographic hashes, there is xxhash.

My comments were general and not scoped specifically to the FIPS context. If you look at the CA/Browser Forum, they too are not enthusiastic about adding choices.

Sometimes less is more.

Contributor Author

rchincha commented May 1, 2025

For our use case, we want - 1) content-addressability with guarantees, and 2) very fast

> Xxhash on the other hand is a clear winner otherwise without need for cryptographic protection.

Perhaps faster than murmurhash but gives us only 2)

> Shake is the future

https://en.wikipedia.org/wiki/SHA-3
Keccak family which gives us 1) but 2) needs h/w acceleration.
We already do SHA2 family so why bother.

Blake3 gives us both without anything special, which was the rationale.

From just compute pov, folks consume container images on almighty cloud and wimpy embedded devices (IoT).
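
To illustrate what "content-addressability with guarantees" means in practice, here is a minimal in-memory sketch. The `BlobStore` class is hypothetical, and SHA-256 stands in for the hash because BLAKE3 is not in Python's standard library; the idea is the same for any collision-resistant algorithm.

```python
import hashlib


class BlobStore:
    """Toy content-addressable store keyed by `algorithm:encoded` digests."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content, so identical content
        # always lands under the same digest.
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        data = self._blobs[digest]
        algorithm, _, encoded = digest.partition(":")
        # Re-hash on read: this is where collision resistance matters,
        # since a match is taken as proof the content is what was asked for.
        if hashlib.new(algorithm, data).hexdigest() != encoded:
            raise ValueError("content does not match digest")
        return data


store = BlobStore()
digest = store.put(b"layer contents")
print(digest.split(":")[0], len(digest.split(":")[1]))
print(store.get(digest) == b"layer contents")
```

Every read pays for one hash over the whole blob, which is why hash throughput shows up directly in pull and verification times.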

xnox commented May 2, 2025

> For our use case, we want - 1) content-addressability with guarantees, and 2) very fast
>
> > Xxhash on the other hand is a clear winner otherwise without need for cryptographic protection.
>
> Perhaps faster than murmurhash but gives us only 2)
>
> > Shake is the future
>
> https://en.wikipedia.org/wiki/SHA-3 Keccak family which gives us 1) but 2) needs h/w acceleration. We already do SHA2 family so why bother.
>
> Blake3 gives us both without anything special, which was the rationale.

Whilst excluding 40% of corporate users in USA and Canada and other places, as Blake3 is not FIPS approved. So switching away from SHA2 is always a loss of some market-share and users today.

There is no silver bullet yet. Obviously, if one is sure to never use something inside FedRAMP or within Common Criteria or within FIPS certification or within Australian/UK cyber security requirements... then yes, blake3 is awesome, as yes, whilst slower than xxhash, it has collision protection guarantees.

> From just compute pov, folks consume container images on almighty cloud and wimpy embedded devices (IoT).

Contributor Author

rchincha commented May 2, 2025

FIPS requirements pop up when dealing with govt entities. If so, by all means stay with SHA2.
For the rest of us, we need content-addressability and speed.

[image omitted]
https://github.com/BLAKE3-team/BLAKE3

git (content-addressability) still defaults to sha1.

> `--object-format=<format>`
> Specify the given object `<format>` (hash algorithm) for the repository. The valid values are sha1 and (if enabled) sha256. sha1 is the default.
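
As a small illustration of git's content addressing, the sketch below computes a blob object ID the way `git hash-object` does for blobs: hash the header `blob <size>\0` followed by the content. The function name `git_blob_id` is hypothetical; the `object_format` parameter mirrors the option quoted above.

```python
import hashlib


def git_blob_id(content: bytes, object_format: str = "sha1") -> str:
    """Object ID of a git blob: hash of b"blob <size>\\0" + content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.new(object_format, header + content).hexdigest()


# The well-known ID of the empty blob in a sha1 repository.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
# The same content addressed under the sha256 object format.
print(len(git_blob_id(b"", "sha256")))  # 64
```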

In the FIPS world, sha1 gets a pass until 2030, or maybe longer if it is not used for a security use case.
https://www.nist.gov/news-events/news/2022/12/nist-retires-sha-1-cryptographic-algorithm
https://docs.gitlab.com/development/fips_gitlab/#development-guidelines

Contributor Author

rchincha commented May 6, 2025
