-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add more strict verifications when user provides a platform #336
base: main
Are you sure you want to change the base?
Conversation
Benchmark Test ResultsBenchmark results from the latest changes vs base branch
|
Signed-off-by: Alex Goodman <[email protected]>
a7b731e
to
9c7e279
Compare
// if the caller has a set of providers, it can try another provider) and a provider that can resolve an image but | ||
// there is an unresolvable problem (e.g. network error, mismatched architecture, etc... thus the caller should | ||
// not try any further providers). | ||
type ErrFetchingImage struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we should add behavior that could result in one provider causing other providers to get bypassed. The source search, to me, is "try whatever you can to find the right thing". Part of the reason to isolate providers to the specific thing they provide is to make it easier to add other providers that might fill in gaps these providers have and isolate the logic so they shouldn't affect other providers.
The scenario, as I understand it is this: we attempt to resolve an image, but we can't because we can't match the correct platform. So, say the docker daemon provider determines this, as an example... what if we built a matching image without the correct platform locally. So running syft <image>
would resolve our locally built image, but fails because the platform was wrong. If we return a "stop" error, the registry provider doesn't run and we potentially don't resolve a valid image that existed in a remote registry. I don't think we should assume the user wanted the locally built image -- they can easily restrict what they're searching using --from docker
or --from registry
to force specific providers if there's a reason to, like only checking a locally-built image.
I think the thing this is trying to prevent is downloading large files twice from two different providers, right? Is this because they don't know the platform until downloading the whole thing? This seems like a different change: the registry provider should download the container metadata to ensure it actually found the right thing, with the right platform, before downloading any layer blobs. Maybe this is a hard change, but is it really a problem if both of these download an image? To me this is eschewing performance in favor of correctness and I'd point out that I don't think it affects enterprise, since I don't think they are using the daemon provider at all, only syft users who are explicitly specifying an incorrect platform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we return a "stop" error, the registry provider doesn't run and we potentially don't resolve a valid image that existed in a remote registry.
That makes sense, I can back this part out -- but from a stereoscope point of view this isn't really changing behavior. It's allowing the caller to make this decision. We could leave this error in and not act on it in syft too as another option? But I agree with your overall point that within the stereoscope/syft ecosystem we're intending the providers to be independent of one another.
the registry provider should download the container metadata to ensure it actually found the right thing, with the right platform, before downloading any layer blobs
actually, that is how it works today -- remote.Get()
only grabs the manifest, then we now additionally grab the container config, but the layer blobs are not requested until we start indexing them (only the registry provider does this today, the others grab them all at once).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point: the comment/behavior here isn't actually what would be responsible for it to happen; could this instead then capture the original error rather than just the message?
Today we allow for users to specify a platform for images being resolved. The expectation is that if the fetched/resolved image does not match the requested platform then we should error out and not continue. Bugs were found in both how the Docker daemon provider (thus Podman as well since this is shared code) as well as the OCI registry provider. Though the bugs were slightly different, at a high level they were the same: a user could provide a platform to fetch and the provider in some cases would ignore this platform and fetch the image for a different platform.
For the Docker daemon provider when monitoring the pull event status JSONL stream the
Error
field was not being considered, which is where this kind of error is raised up.For the OCI registry provider, passing the
remote.WithPlatform
option only sets the platform field on thev1.Descriptor
object, but does not have any effect on what is fetched in the case where there is no manifest list or index (there is a single architecture in the registry). In the case of manifest list or index the correct platform was being resolved. The change here was to explicitly pull the container config and validate against theos
andarchitecture
fields (which are required via the OCI spec).An additional step within the OCI registry provider is checking if the manifest is a list/index or a single manifest. If the user does NOT give a platform then the current default behavior for this provider is to set one based off of
linux/<GOARCH>
. The intended behavior is to honor what the single-architecture manifest instead of overriding with a default value. This PR enforces this intended behavior by clearing the platform options after fetching the manifest and checking the MediaType for single vs multi arch support.The last change is to allow for providers to express errors that indicate that the image was fully resolved by the provider but there was a transient or fundamental error. This new error is intended to signal to the caller that no further providers in a set of providers should be attempted, and to raise up this single error. For instance, if we tried 7 providers with 7 "not found" errors and we reach the Docker provider and it raises up a platform mismatch error, we don't want to attempt the Registry provider since we expect a similar outcome. This fosters a fail-fast approach for cases we know the user should be paying attention to (and not get lost in a pile of errors from multiple providers).