Add more strict verifications when user provides a platform #336

Open
wants to merge 1 commit into
base: main
from

wagoodman (Contributor) commented on Jan 4, 2025

Today we allow users to specify a platform for the image being resolved. The expectation is that if the fetched/resolved image does not match the requested platform, we error out and do not continue. Bugs were found in both the Docker daemon provider (and thus Podman as well, since the code is shared) and the OCI registry provider. Though the bugs were slightly different, at a high level they were the same: a user could request a platform and, in some cases, the provider would ignore it and fetch the image for a different platform.

For the Docker daemon provider, the Error field was not being considered when monitoring the pull event status JSONL stream, which is where this kind of error is raised.

For the OCI registry provider, passing the remote.WithPlatform option only sets the platform field on the v1.Descriptor object; it has no effect on what is fetched when there is no manifest list or index (that is, there is a single architecture in the registry). When there is a manifest list or index, the correct platform was already being resolved. The change here is to explicitly pull the container config and validate the requested platform against its os and architecture fields (which are required by the OCI spec).
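
To illustrate the validation step, here is a minimal standalone sketch that compares the `os` and `architecture` keys of a raw config blob against the requested platform. The `platform` and `imageConfig` types and the `validatePlatform` helper are hypothetical stand-ins, not the types used in this PR, and variant handling is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// platform is a hypothetical stand-in for the platform the user requested.
type platform struct {
	OS           string
	Architecture string
}

// imageConfig models only the two config fields this check relies on.
type imageConfig struct {
	OS           string `json:"os"`
	Architecture string `json:"architecture"`
}

// validatePlatform returns an error when the config describes a different
// platform than the one requested. A nil request means "no constraint".
func validatePlatform(rawConfig []byte, want *platform) error {
	if want == nil {
		return nil
	}
	var cfg imageConfig
	if err := json.Unmarshal(rawConfig, &cfg); err != nil {
		return fmt.Errorf("parsing image config: %w", err)
	}
	if cfg.OS != want.OS || cfg.Architecture != want.Architecture {
		return fmt.Errorf("image platform %s/%s does not match requested platform %s/%s",
			cfg.OS, cfg.Architecture, want.OS, want.Architecture)
	}
	return nil
}

func main() {
	raw := []byte(`{"os":"linux","architecture":"arm64"}`)
	fmt.Println(validatePlatform(raw, &platform{OS: "linux", Architecture: "amd64"}))
}
```

Because the config is small compared with the layer blobs, a check like this can run before any layer content is requested.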

An additional step within the OCI registry provider is checking whether the manifest is a list/index or a single manifest. If the user does NOT give a platform, the current default behavior for this provider is to set one based on linux/<GOARCH>. The intended behavior is to honor what the single-architecture manifest specifies instead of overriding it with a default value. This PR enforces the intended behavior by clearing the platform options after fetching the manifest and checking its MediaType for single- vs. multi-arch support.
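
The decision described above can be sketched roughly as follows. The `platform` type and the `isIndex` and `effectivePlatform` helpers are hypothetical names for illustration; the two media type strings are the standard OCI index and Docker manifest list identifiers.

```go
package main

import (
	"fmt"
	"runtime"
)

const (
	ociIndexMediaType           = "application/vnd.oci.image.index.v1+json"
	dockerManifestListMediaType = "application/vnd.docker.distribution.manifest.list.v2+json"
)

// platform is a hypothetical stand-in for the provider's platform option.
type platform struct{ OS, Architecture string }

// isIndex reports whether the manifest media type describes a multi-arch
// index or manifest list rather than a single image manifest.
func isIndex(mediaType string) bool {
	return mediaType == ociIndexMediaType || mediaType == dockerManifestListMediaType
}

// effectivePlatform returns the platform to select with. A nil result means
// "no platform option": use whatever the single manifest describes.
func effectivePlatform(mediaType string, requested *platform) *platform {
	if requested != nil {
		return requested // an explicit request is kept and validated later
	}
	if isIndex(mediaType) {
		// multi-arch: fall back to the linux/<GOARCH> default
		return &platform{OS: "linux", Architecture: runtime.GOARCH}
	}
	return nil // single-arch: do not override with a default
}

func main() {
	single := "application/vnd.oci.image.manifest.v1+json"
	fmt.Println(effectivePlatform(single, nil) == nil)
	fmt.Println(*effectivePlatform(ociIndexMediaType, nil))
}
```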

The last change allows providers to express errors indicating that the image was fully resolved by the provider but there was a transient or fundamental error. This new error is intended to signal to the caller that no further providers in a set of providers should be attempted, and that this single error should be raised instead. For instance, if we tried 7 providers and got 7 "not found" errors, and then reach the Docker provider and it raises a platform mismatch error, we don't want to attempt the Registry provider, since we expect a similar outcome. This fosters a fail-fast approach for cases we know the user should pay attention to (so the error does not get lost in a pile of errors from multiple providers).

wagoodman added the bug label on Jan 4, 2025
wagoodman requested a review from a team on January 4, 2025 03:22
wagoodman self-assigned this on Jan 4, 2025

github-actions bot commented Jan 4, 2025

Benchmark Test Results

Benchmark results from the latest changes vs base branch
make .tool/task
make[1]: Entering directory '/home/runner/work/stereoscope/stereoscope'
make[1]: Leaving directory '/home/runner/work/stereoscope/stereoscope'
.tool/task show-benchstat
?   	github.com/anchore/stereoscope	[no test files]
?   	github.com/anchore/stereoscope/examples	[no test files]
PASS
ok  	github.com/anchore/stereoscope/internal	0.003s
?   	github.com/anchore/stereoscope/internal/bus	[no test files]
PASS
ok  	github.com/anchore/stereoscope/internal/containerd	0.007s
PASS
ok  	github.com/anchore/stereoscope/internal/docker	0.004s
?   	github.com/anchore/stereoscope/internal/log	[no test files]
PASS
ok  	github.com/anchore/stereoscope/internal/podman	0.005s
?   	github.com/anchore/stereoscope/pkg/event	[no test files]
?   	github.com/anchore/stereoscope/pkg/event/parsers	[no test files]
goos: linux
goarch: amd64
pkg: github.com/anchore/stereoscope/pkg/file
cpu: AMD EPYC 7763 64-Core Processor                
BenchmarkTarIndex-4   	   33794	     35693 ns/op	    5699 B/op	      93 allocs/op
BenchmarkTarIndex-4   	   33688	     35977 ns/op	    5700 B/op	      93 allocs/op
BenchmarkTarIndex-4   	   33681	     35609 ns/op	    5699 B/op	      93 allocs/op
BenchmarkTarIndex-4   	   33562	     35754 ns/op	    5700 B/op	      93 allocs/op
BenchmarkTarIndex-4   	   33475	     35619 ns/op	    5701 B/op	      93 allocs/op
BenchmarkTarIndex-4   	   33616	     35618 ns/op	    5700 B/op	      93 allocs/op
BenchmarkTarIndex-4   	   33546	     35719 ns/op	    5701 B/op	      93 allocs/op
PASS
ok  	github.com/anchore/stereoscope/pkg/file	10.949s
PASS
ok  	github.com/anchore/stereoscope/pkg/filetree	0.005s
?   	github.com/anchore/stereoscope/pkg/filetree/filenode	[no test files]
PASS
ok  	github.com/anchore/stereoscope/pkg/image	0.005s
PASS
ok  	github.com/anchore/stereoscope/pkg/image/containerd	0.008s
PASS
ok  	github.com/anchore/stereoscope/pkg/image/docker	0.005s
PASS
ok  	github.com/anchore/stereoscope/pkg/image/oci	0.006s
PASS
ok  	github.com/anchore/stereoscope/pkg/image/oci/credhelpers	0.005s
?   	github.com/anchore/stereoscope/pkg/image/podman	[no test files]
PASS
ok  	github.com/anchore/stereoscope/pkg/image/sif	0.004s
?   	github.com/anchore/stereoscope/pkg/imagetest	[no test files]
PASS
ok  	github.com/anchore/stereoscope/pkg/tree	0.003s
PASS
ok  	github.com/anchore/stereoscope/pkg/tree/node	0.003s
goos: linux
goarch: amd64
pkg: github.com/anchore/stereoscope/test/integration
cpu: AMD EPYC 7763 64-Core Processor                
BenchmarkSimpleImage_GetImage/docker-archive-4 	    1099	   1128208 ns/op	  272037 B/op	    2248 allocs/op
BenchmarkSimpleImage_GetImage/docker-archive-4 	    1012	   1119166 ns/op	  271577 B/op	    2247 allocs/op
BenchmarkSimpleImage_GetImage/docker-archive-4 	    1095	   1083738 ns/op	  271529 B/op	    2247 allocs/op
BenchmarkSimpleImage_GetImage/docker-archive-4 	    1107	   1077549 ns/op	  271377 B/op	    2246 allocs/op
BenchmarkSimpleImage_GetImage/docker-archive-4 	    1107	   1080354 ns/op	  271275 B/op	    2246 allocs/op
BenchmarkSimpleImage_GetImage/docker-archive-4 	    1104	   1156067 ns/op	  271064 B/op	    2246 allocs/op
BenchmarkSimpleImage_GetImage/docker-archive-4 	    1095	   1133477 ns/op	  271139 B/op	    2246 allocs/op
BenchmarkSimpleImage_GetImage/podman-4         	      69	  17003051 ns/op	  403055 B/op	    2698 allocs/op
BenchmarkSimpleImage_GetImage/podman-4         	      67	  17114670 ns/op	  401893 B/op	    2698 allocs/op
BenchmarkSimpleImage_GetImage/podman-4         	      68	  16873266 ns/op	  402413 B/op	    2698 allocs/op
BenchmarkSimpleImage_GetImage/podman-4         	      66	  18223165 ns/op	  401220 B/op	    2695 allocs/op
BenchmarkSimpleImage_GetImage/podman-4         	      68	  16918641 ns/op	  401624 B/op	    2696 allocs/op
BenchmarkSimpleImage_GetImage/podman-4         	      69	  16828658 ns/op	  400376 B/op	    2695 allocs/op
BenchmarkSimpleImage_GetImage/podman-4         	      68	  16907945 ns/op	  400749 B/op	    2695 allocs/op
#0 building with "default" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 345B done
#1 DONE 0.0s

#2 [internal] load .dockerignore
#2 transferring context: 2B done
#2 DONE 0.0s

#3 [internal] load build context
#3 transferring context: 209B done
#3 DONE 0.0s

#4 [1/3] ADD file-1.txt /somefile-1.txt
#4 CACHED

#5 [2/3] ADD file-2.txt /somefile-2.txt
#5 CACHED

#6 [3/3] ADD target /
#6 CACHED

#7 exporting to image
#7 exporting layers done
#7 writing image sha256:b73cce0285fb928e8532c03b75a0d22be8ed03d7c52f007b19edf4b12b050d32 done
#7 naming to docker.io/library/stereoscope-fixture-image-simple:04e16e44161c8888a1a963720fd0443cbf7eef8101434c431de8725cd98cc9f7 done
#7 naming to docker.io/library/stereoscope-fixture-image-simple:latest done
#7 DONE 0.0s
ctr: failed to dial "/run/containerd/containerd.sock": connection error: desc = "transport: error while dialing: dial unix /run/containerd/containerd.sock: connect: permission denied"
--- FAIL: BenchmarkSimpleImage_GetImage
    image_fixtures.go:193: using existing image tar: 'test-fixtures/cache/stereoscope-fixture-image-simple-04e16e44161c8888a1a963720fd0443cbf7eef8101434c431de8725cd98cc9f7.tar' (size: 22528, modified: 2025-01-04 04:41:42.042960017 +0000 UTC, mode: -rw-r--r--)
    image_fixtures.go:241: Build docker image: name="stereoscope-fixture-image-simple" tag="04e16e44161c8888a1a963720fd0443cbf7eef8101434c431de8725cd98cc9f7"
    image_fixtures.go:291: saveImage running: docker image save stereoscope-fixture-image-simple:04e16e44161c8888a1a963720fd0443cbf7eef8101434c431de8725cd98cc9f7
    image_fixtures.go:286: 
        	Error Trace:	/home/runner/work/stereoscope/stereoscope/pkg/imagetest/image_fixtures.go:286
        	            				/home/runner/work/stereoscope/stereoscope/pkg/imagetest/image_fixtures.go:162
        	            				/home/runner/work/stereoscope/stereoscope/pkg/imagetest/image_fixtures.go:152
        	            				/home/runner/work/stereoscope/stereoscope/pkg/imagetest/image_fixtures.go:33
        	            				/home/runner/work/stereoscope/stereoscope/test/integration/fixture_image_simple_test.go:163
        	Error:      	Received unexpected error:
        	            	exit status 1
        	Test:       	BenchmarkSimpleImage_GetImage
        	Messages:   	could not import docker image to containerd (shell out)
BenchmarkSimpleImage_FetchSquashedContents/docker-archive-4         	   53739	     22263 ns/op	    2712 B/op	      21 allocs/op
BenchmarkSimpleImage_FetchSquashedContents/docker-archive-4         	   53659	     22274 ns/op	    2712 B/op	      21 allocs/op
BenchmarkSimpleImage_FetchSquashedContents/docker-archive-4         	   53469	     22177 ns/op	    2712 B/op	      21 allocs/op
BenchmarkSimpleImage_FetchSquashedContents/docker-archive-4         	   53535	     22108 ns/op	    2712 B/op	      21 allocs/op
BenchmarkSimpleImage_FetchSquashedContents/docker-archive-4         	   53768	     22244 ns/op	    2712 B/op	      21 allocs/op
BenchmarkSimpleImage_FetchSquashedContents/docker-archive-4         	   53662	     22196 ns/op	    2712 B/op	      21 allocs/op
BenchmarkSimpleImage_FetchSquashedContents/docker-archive-4         	   53750	     22237 ns/op	    2712 B/op	      21 allocs/op
BenchmarkSimpleImage_FetchSquashedContents/podman-4                 	   53220	     22204 ns/op	    2712 B/op	      21 allocs/op
BenchmarkSimpleImage_FetchSquashedContents/podman-4                 	   54104	     22167 ns/op	    2712 B/op	      21 allocs/op
BenchmarkSimpleImage_FetchSquashedContents/podman-4                 	   53487	     22191 ns/op	    2712 B/op	      21 allocs/op
BenchmarkSimpleImage_FetchSquashedContents/podman-4                 	   53808	     22236 ns/op	    2712 B/op	      21 allocs/op
BenchmarkSimpleImage_FetchSquashedContents/podman-4                 	   53929	     22145 ns/op	    2712 B/op	      21 allocs/op
BenchmarkSimpleImage_FetchSquashedContents/podman-4                 	   53862	     22113 ns/op	    2712 B/op	      21 allocs/op
BenchmarkSimpleImage_FetchSquashedContents/podman-4                 	   54046	     22270 ns/op	    2712 B/op	      21 allocs/op
#0 building with "default" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 345B done
#1 DONE 0.0s

#2 [internal] load .dockerignore
#2 transferring context: 2B done
#2 DONE 0.0s

#3 [internal] load build context
#3 transferring context: 209B done
#3 DONE 0.0s

#4 [1/3] ADD file-1.txt /somefile-1.txt
#4 CACHED

#5 [2/3] ADD file-2.txt /somefile-2.txt
#5 CACHED

#6 [3/3] ADD target /
#6 CACHED

#7 exporting to image
#7 exporting layers done
#7 writing image sha256:b73cce0285fb928e8532c03b75a0d22be8ed03d7c52f007b19edf4b12b050d32 done
#7 naming to docker.io/library/stereoscope-fixture-image-simple:04e16e44161c8888a1a963720fd0443cbf7eef8101434c431de8725cd98cc9f7 done
#7 naming to docker.io/library/stereoscope-fixture-image-simple:latest done
#7 DONE 0.0s
ctr: failed to dial "/run/containerd/containerd.sock": connection error: desc = "transport: error while dialing: dial unix /run/containerd/containerd.sock: connect: permission denied"
--- FAIL: BenchmarkSimpleImage_FetchSquashedContents
    image_fixtures.go:193: using existing image tar: 'test-fixtures/cache/stereoscope-fixture-image-simple-04e16e44161c8888a1a963720fd0443cbf7eef8101434c431de8725cd98cc9f7.tar' (size: 22528, modified: 2025-01-04 04:41:42.042960017 +0000 UTC, mode: -rw-r--r--)
    image_fixtures.go:241: Build docker image: name="stereoscope-fixture-image-simple" tag="04e16e44161c8888a1a963720fd0443cbf7eef8101434c431de8725cd98cc9f7"
    image_fixtures.go:291: saveImage running: docker image save stereoscope-fixture-image-simple:04e16e44161c8888a1a963720fd0443cbf7eef8101434c431de8725cd98cc9f7
    image_fixtures.go:286: 
        	Error Trace:	/home/runner/work/stereoscope/stereoscope/pkg/imagetest/image_fixtures.go:286
        	            				/home/runner/work/stereoscope/stereoscope/pkg/imagetest/image_fixtures.go:162
        	            				/home/runner/work/stereoscope/stereoscope/pkg/imagetest/image_fixtures.go:152
        	            				/home/runner/work/stereoscope/stereoscope/pkg/imagetest/image_fixtures.go:33
        	            				/home/runner/work/stereoscope/stereoscope/pkg/imagetest/image_fixtures.go:64
        	            				/home/runner/work/stereoscope/stereoscope/test/integration/fixture_image_simple_test.go:189
        	Error:      	Received unexpected error:
        	            	exit status 1
        	Test:       	BenchmarkSimpleImage_FetchSquashedContents
        	Messages:   	could not import docker image to containerd (shell out)
FAIL
exit status 1
FAIL	github.com/anchore/stereoscope/test/integration	39.067s
?   	github.com/anchore/stereoscope/test/integration/test-fixtures/registry	[no test files]
FAIL
goos: linux
goarch: amd64
pkg: github.com/anchore/stereoscope/pkg/file
cpu: AMD EPYC 7763 64-Core Processor                
ctr: 
           │ .tmp/benchmark-e3471d1.txt │
           │           sec/op           │
TarIndex-4                  35.69µ ± 1%

           │ .tmp/benchmark-e3471d1.txt │
           │            B/op            │
TarIndex-4                 5.566Ki ± 0%

           │ .tmp/benchmark-e3471d1.txt │
           │         allocs/op          │
TarIndex-4                   93.00 ± 0%

pkg: github.com/anchore/stereoscope/test/integration
                                      │ .tmp/benchmark-e3471d1.txt │
                                      │           sec/op           │
SimpleImage_GetImage/docker-archive-4                  1.119m ± 4%
SimpleImage_GetImage/podman-4                          16.92m ± 8%
geomean                                                4.351m

                                      │ .tmp/benchmark-e3471d1.txt │
                                      │            B/op            │
SimpleImage_GetImage/docker-archive-4                 265.0Ki ± 0%
SimpleImage_GetImage/podman-4                         392.2Ki ± 0%
geomean                                               322.4Ki

                                      │ .tmp/benchmark-e3471d1.txt │
                                      │         allocs/op          │
SimpleImage_GetImage/docker-archive-4                  2.246k ± 0%
SimpleImage_GetImage/podman-4                          2.696k ± 0%
geomean                                                2.461k

ctr: failed to dial "/run/containerd/containerd.sock": connection error: desc = "transport: error while dialing: dial unix /run/containerd/containerd.sock: connect: permission denied"
                                                   │ .tmp/benchmark-e3471d1.txt │
                                                   │           sec/op           │
SimpleImage_FetchSquashedContents/docker-archive-4                  22.24µ ± 1%
SimpleImage_FetchSquashedContents/podman-4                          22.19µ ± 0%
geomean                                                             22.21µ

                                                   │ .tmp/benchmark-e3471d1.txt │
                                                   │            B/op            │
SimpleImage_FetchSquashedContents/docker-archive-4                 2.648Ki ± 0%
SimpleImage_FetchSquashedContents/podman-4                         2.648Ki ± 0%
geomean                                                            2.648Ki

                                                   │ .tmp/benchmark-e3471d1.txt │
                                                   │         allocs/op          │
SimpleImage_FetchSquashedContents/docker-archive-4                   21.00 ± 0%
SimpleImage_FetchSquashedContents/podman-4                           21.00 ± 0%
geomean                                                              21.00

wagoodman force-pushed the fix-platform-options branch from a7b731e to 9c7e279 on January 4, 2025 04:40
// if the caller has a set of providers, it can try another provider) and a provider that can resolve an image but
// there is an unresolvable problem (e.g. network error, mismatched architecture, etc... thus the caller should
// not try any further providers).
type ErrFetchingImage struct {
Contributor

I'm not sure we should add behavior that could result in one provider causing other providers to get bypassed. The source search, to me, is "try whatever you can to find the right thing". Part of the reason to isolate providers to the specific thing they provide is to make it easier to add other providers that might fill in gaps these providers have and isolate the logic so they shouldn't affect other providers.

The scenario, as I understand it, is this: we attempt to resolve an image, but we can't because we can't match the correct platform. So, say the docker daemon provider determines this, as an example... what if we built a matching image locally without the correct platform? Running syft <image> would resolve our locally built image, but fail because the platform was wrong. If we return a "stop" error, the registry provider doesn't run and we potentially don't resolve a valid image that existed in a remote registry. I don't think we should assume the user wanted the locally built image -- they can easily restrict what they're searching using --from docker or --from registry to force specific providers if there's a reason to, like only checking a locally-built image.

I think the thing this is trying to prevent is downloading large files twice from two different providers, right? Is this because they don't know the platform until downloading the whole thing? This seems like a different change: the registry provider should download the container metadata to ensure it actually found the right thing, with the right platform, before downloading any layer blobs. Maybe this is a hard change, but is it really a problem if both of these download an image? To me this is eschewing performance in favor of correctness and I'd point out that I don't think it affects enterprise, since I don't think they are using the daemon provider at all, only syft users who are explicitly specifying an incorrect platform.

Contributor Author

> If we return a "stop" error, the registry provider doesn't run and we potentially don't resolve a valid image that existed in a remote registry.

That makes sense, I can back this part out -- but from a stereoscope point of view this isn't really changing behavior. It's allowing the caller to make this decision. As another option, we could leave this error in and not act on it in syft? But I agree with your overall point that within the stereoscope/syft ecosystem we're intending the providers to be independent of one another.

> the registry provider should download the container metadata to ensure it actually found the right thing, with the right platform, before downloading any layer blobs

Actually, that is how it works today -- remote.Get() only grabs the manifest, and we now additionally grab the container config, but the layer blobs are not requested until we start indexing them (only the registry provider does this today; the others grab them all at once).

Contributor

Good point: the comment/behavior here isn't actually what would be responsible for it to happen; could this instead then capture the original error rather than just the message?
