Add IPFS as a remote cache #3510

Draft · felipecruz91 wants to merge 1 commit into master
Conversation

@felipecruz91 commented Jan 16, 2023

Hi,

I'd like to open this PR to introduce IPFS as an additional remote cache backend for BuildKit. It contains a first implementation that exports and imports the build cache to/from IPFS and replicates it among all the peers.

Motivation

The main motivation is to explore new ways to achieve faster build times by leveraging IPFS as a remote, distributed cache of image blobs shared among multiple peers of a cluster.

Use-case

In a company, software developers working on the same project often build an application that has likely already been built by a teammate. Exporting the BuildKit cache to a remote registry is a convenient solution; however, downloading the cache can be time-consuming, especially when building large images.

By distributing the cache with IPFS across many cluster peers, the blocks that compose the blobs can be downloaded in parallel from multiple peers instead of from a single place (a remote registry).

Try me

  1. Set up an IPFS cluster if you don't have one. See this example.

  2. Create a builder from felipecruz/buildkit:ipfs-cluster:

     docker buildx create --name buildkitd-builder --driver docker-container --driver-opt image=felipecruz/buildkit:ipfs-cluster --use

  3. On one host, build an image and export its cache to IPFS:

     docker buildx build \
       --cache-to type=ipfs,cluster_api=192.168.65.2:9094,daemon_api=192.168.65.2:5001,mode=max \
       -t my-image .

  4. On another host, import the cache:

     docker buildx build \
       --cache-from type=ipfs,cluster_api=192.168.65.2:9094,daemon_api=192.168.65.2:5001,mode=max \
       -t my-image .

[image attachment]

/cc @tonistiigi @crazy-max @AkihiroSuda

@AkihiroSuda requested a review from @ktock · January 16, 2023 21:33
@AkihiroSuda (Member):
Can we just exec the Kubo binary to reduce the Go dependencies?
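
For reference, a minimal sketch of what an exec-based approach could look like, assuming the ipfs (Kubo) CLI is available in PATH; the ipfsAdd/ipfsCat helpers are hypothetical names, not code from this PR:

package ipfsutil

import (
	"bytes"
	"context"
	"io"
	"os/exec"
	"strings"
)

// ipfsAdd pipes the blob from r into `ipfs add -Q` and returns the resulting CID.
// -Q (quieter) makes the CLI print only the final CID.
func ipfsAdd(ctx context.Context, r io.Reader) (string, error) {
	var out bytes.Buffer
	cmd := exec.CommandContext(ctx, "ipfs", "add", "-Q")
	cmd.Stdin = r
	cmd.Stdout = &out
	if err := cmd.Run(); err != nil {
		return "", err
	}
	return strings.TrimSpace(out.String()), nil
}

// ipfsCat streams the blob identified by cid into w via `ipfs cat`.
func ipfsCat(ctx context.Context, cid string, w io.Writer) error {
	cmd := exec.CommandContext(ctx, "ipfs", "cat", cid)
	cmd.Stdout = w
	return cmd.Run()
}

This keeps the vendored dependency surface at zero, at the cost of shipping (or requiring) the Kubo binary alongside buildkitd.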

@tonistiigi (Member) commented Jan 16, 2023

> Can we just exec the Kubo binary to reduce the Go dependencies?

Could we check what the difference in binary and BuildKit image size would be for both cases? Also, maybe there is a smaller tool or a minimal build that we could include in the image.

@AkihiroSuda (Member):

> Can we just exec the Kubo binary to reduce the Go dependencies?

> Could we check what the difference in binary and BuildKit image size would be for both cases?

Not really for binary footprint, rather for avoiding the vendor hell.

@tonistiigi (Member):

> Not really for binary footprint, rather for avoiding the vendor hell.

Yes, but if including Kubo increases the binary (image) size too much compared to vendoring, then we might still prefer vendoring. We need to know the numbers (and whether there are alternatives).

@AkihiroSuda (Member):

> Not really for binary footprint, rather for avoiding the vendor hell.

> Yes, but if including Kubo increases the binary (image) size too much compared to vendoring, then we might still prefer vendoring. We need to know the numbers (and whether there are alternatives).

Another alternative would be to reimplement the IPFS API client with the stdlib net/http.
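
As a rough illustration of that alternative (a sketch, not part of this PR), fetching a blob from the Kubo HTTP RPC API needs nothing beyond net/http; daemonAPI stands for the daemon_api value such as 192.168.65.2:5001:

package ipfsutil

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/url"
)

// catBlob streams the content behind cid to w by calling POST /api/v0/cat
// on the IPFS daemon API (the same endpoint the Go client libraries wrap).
func catBlob(ctx context.Context, daemonAPI, cid string, w io.Writer) error {
	u := fmt.Sprintf("http://%s/api/v0/cat?arg=%s", daemonAPI, url.QueryEscape(cid))
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, u, nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("ipfs cat %s: unexpected status %s", cid, resp.Status)
	}
	_, err = io.Copy(w, resp.Body)
	return err
}

Uploading via /api/v0/add would additionally need a multipart/form-data body, but still only the standard library.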


Review thread on the following diff:

if !exists {
	layerDone := progress.OneOff(ctx, fmt.Sprintf("writing layer %s", l.Blob))
	dt, err := content.ReadBlob(ctx, dgstPair.Provider, dgstPair.Descriptor)

Collaborator:

Can we use io.Copy here instead of fully reading the layer to slice?
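
For illustration, a sketch of what the streaming variant could look like, using containerd's content.NewReader over a ReaderAt; w stands for whatever writer the layer is being exported to and is not part of the original snippet (io and content imports assumed):

// Stream the layer instead of buffering it fully in memory.
ra, err := dgstPair.Provider.ReaderAt(ctx, dgstPair.Descriptor)
if err != nil {
	return layerDone(err)
}
defer ra.Close()
// content.NewReader wraps the ReaderAt as a sequential io.Reader.
if _, err := io.Copy(w, content.NewReader(ra)); err != nil {
	return layerDone(err)
}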

Comment on lines +298 to +310
go func() {
	for {
		j, more := <-out
		if more {
			logrus.Debugf("added item: %+v", j)
			cid = j.Cid
		} else {
			logrus.Debugf("added all items")
			done <- true
			return
		}
	}
}()

Collaborator:

This should be cancellable via ctx?
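
One way to wire that up (a sketch only): select on ctx.Done() next to the channel receive so the goroutine exits when the build is cancelled; done would also need to be buffered (or guarded the same way) so the final send cannot block.

go func() {
	for {
		select {
		case <-ctx.Done():
			// Build was cancelled; stop collecting results.
			return
		case j, more := <-out:
			if !more {
				logrus.Debugf("added all items")
				done <- true
				return
			}
			logrus.Debugf("added item: %+v", j)
			cid = j.Cid
		}
	}
}()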

Review thread on the following diff:

}

logrus.Debugf("unpinning previous pin: %s\n", prevPinCID.Cid)
_, err = clusterClient.Unpin(ctx, *prevPinCID)

Collaborator:

Is it enough to unpin only one CID? Isn't it possible that multiple CIDs were previously associated with that pin name?
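
A hypothetical sketch of unpinning every previously associated CID; listPinsByName and pinName are made-up placeholders for whatever lookup the cluster API provides, and prevPins is assumed to have the same element type as *prevPinCID above:

// Unpin all CIDs previously associated with the pin name, not just the latest one.
prevPins, err := listPinsByName(ctx, clusterClient, pinName) // hypothetical helper
if err != nil {
	return err
}
for _, prev := range prevPins {
	logrus.Debugf("unpinning previous pin: %s", prev.Cid)
	if _, err := clusterClient.Unpin(ctx, prev); err != nil {
		return err
	}
}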

@felipecruz91 (Author) commented Jan 18, 2023

Find below the size comparison:

TL;DR

  • With vendoring, the buildkitd binary size increases from 50M to 66M (+32%).
  • If we were to ship both the ipfs (62.8M) and ipfs-cluster-ctl (32.5M) binaries as part of the moby/buildkit:latest image, the image size would grow from 168MB to 263.3MB (+56.7%).

See below for details of the binaries used in the comparison:


Details

Binary size without the IPFS implementation.

Branch: master
Commit: 983480b80ad82f98959b49b89bb2af0f84df72d9

Original:

make binaries
...

ls -lh ./bin/
...
-rwxr-xr-x@ 1 felipecruz  staff    50M 18 Jan 16:14 buildkitd

Binary size using IPFS Go libraries (vendoring)

Branch: feature/ipfs-cache
Commit: 492fc6d8b0485d98791c8d605b4cc8ea5a9181b7

make binaries
...

ls -lh ./bin/
...
-rwxr-xr-x@ 1 felipecruz  staff    66M 18 Jan 16:01 buildkitd

@AkihiroSuda (Member):

> If we were to ship both the ipfs (62.8M) and ipfs-cluster-ctl (32.5M) binaries as part of the moby/buildkit:latest image, the image size would grow from 168MB to 263.3MB (+56.7%).

Maybe these binaries should only be present in a separate image like moby/buildkit:vX.Y.Z-ipfs?

That might also be helpful for some enterprise companies that have "no P2P" policies.

@tonistiigi (Member):

> Maybe these binaries should only be present in a separate image like moby/buildkit:vX.Y.Z-ipfs?

If we do that, then it probably makes sense to also move the s3/azure backends to that image (cc @bpaquet), and to include things like Nydus, which is somewhat supported today but not in the release image, if it is ready (cc @hsiangkao).

@hsiangkao commented Jan 20, 2023

> Maybe these binaries should only be present in a separate image like moby/buildkit:vX.Y.Z-ipfs?

> If we do that, then it probably makes sense to also move the s3/azure backends to that image (cc @bpaquet), and to include things like Nydus, which is somewhat supported today but not in the release image, if it is ready (cc @hsiangkao).

Hi! Actually, I'm not responsible for the main Nydus implementation (part of my main job is in-kernel EROFS), so I'm CCing the proper Nydus people here. cc @imeoer @jiangliu @changweige

@imeoer (Contributor) commented Jan 20, 2023

@tonistiigi @AkihiroSuda If image size is a concern, Nydus is also ready to have its related binaries built into a separate image like moby/buildkit:vX.Y.Z-nydus.
