-
Notifications
You must be signed in to change notification settings - Fork 395
Add functionality to search registries #406
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
mtrmac
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A very quick look; I didn’t really read SearchRegistries at all.
docker/docker_client.go
Outdated
| // detectProperties detects various properties of the registry. | ||
| // See the dockerClient documentation for members which are affected by this. | ||
| func (c *dockerClient) detectProperties(ctx context.Context) error { | ||
| func (c *dockerClient) detectProperties(ctx context.Context, isSearch bool) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use a name for the parameter which describes what the parameter means/does, not what it is used for; allowV1 or something like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
docker/docker_client.go
Outdated
| if resp.StatusCode != http.StatusOK { | ||
| return nil, nil, errors.Errorf("error getting response from endpoint") | ||
| } | ||
| if isV2 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way this is split, with the URL path hard-coded in the caller and the data format hard-coded in the callee, is weird. Maybe instead the front part of getRegistryData should become a “GET + check for StatusOK” helper?
Why do the V1 and V2 requests use a separate dockerClient anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
restructured it quite a bit. Should be fixed now.
docker/docker_client.go
Outdated
| func getRegistryData(ctx context.Context, sCtx *types.SystemContext, registry, path string, isV2 bool) (*V2Data, *V1Data, error) { | ||
| v2Data := &V2Data{} | ||
| v1Data := &V1Data{} | ||
| newLoginClient, err := newDockerClientWithDetails(sCtx, registry, "", "", "", nil, "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
“login” seems incorrect. Maybe just client, or even c?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for client
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
docker/docker_client.go
Outdated
| } | ||
| if isV1 { | ||
| // can talk to v1 registry if doing a search | ||
| if isV1 && !isSearch { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I’m being stupid but I can’t see how this works; here err != nil necessarily because we got into this branch, so this only changes what is the reported error but not that the method fails, i.e. newLoginClient.makeRequest should AFAICT fail for V1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I understand, detectPropertires() is pinging a v2 and a v1 endpoint to detect the api type of the registry. When v2 fails and it detects its a v1, it fails it as we didn't want to support v1. But with search, if it detects its a v1 it shouldn't fail.
I was failing before I added the condition to allowV1 if its the search call.
Maybe there is another way to handle it that I am missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
… Ah, I see what’s going on: The currently relevant registries like docker.io and registry.access.redhat.com also/primarily serve content using the docker/distribution /v2/ API as well, so we never get here.
If we are happy enough to limit ourselves such V2 + /v1/search registries, the allowV1 boolean does not need to exist at all.
BTW the https://docs.docker.com/v1.4/reference/api/docker-io_api/ document you found says that only Authorization: Basic is supported, making both the ping and maybe all of setupRequestAuth seemingly unnecessary. OTOH https://github.com/docker/docker-registry/blob/master/docker_registry/toolkit.pyhttps://github.com/docker/docker-registry/blob/master/docker_registry/toolkit.py#L276L276 does something entirely different, an X-Signature header or Authorization: Token — and /v1/search does not require any authentication anyway.
Similarly /v1/search requires authentication on neither index.docker.io nor r.a.rh.com.
moby/moby seems to support both v2-like and v1-like authentication for the /v1/search URL. I guess we’ll eventually find out which ones, if any, matter; that doesn't need to block this PR.
0f16d96 to
2d733a3
Compare
docker/docker_client.go
Outdated
| } | ||
|
|
||
| // Data holds the output of both the v2 and v1 endpoints | ||
| type Data struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
v1 and v2 should not share a type when the contents can/should never appear together in returned results. This is still quite confusing. Please just do the straightforward thing of implementing two independent requests in sequence; then figure out which parts can be shared / shortened.
Maybe (with functionality inline/in separate functions as appropriate for size/readability/repetitions):
type v1Results struct {…}
type v2Results struct {…}
for _, reg := range registries {
client := …
logrus.Debug("…v2…")
body, err := c.makeGETRequest("…/v2/…")
if err == nil {
/* decode into a v2Results variable */
searchResults[reg] = …
} else {
logrus.Debug("…v1…")
body, err := c.makeGETRequest("…/v1/…")
if err == nil {
/* decode into a v1Results variable */
searchResults[reg] = …
} /* else $something */
}with a hypothetical dockerClient.makeGETRequest which does makeRequest+ the StatusOK check. Maybe that will turn out not to be worth it. Or, maybe, something like
func (c *dockerClient) makeGETJSONRequest(ctx context.Context, path string, dest interface{}) errorwhich does makeRequest, the StatusOK check, and json.Unmarshal; that could be used in a few existing cases as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed using the first method you suggested.
docker/docker_client.go
Outdated
| } | ||
|
|
||
| // Results holds the information of each matching image returned by the v1 endpoint | ||
| type Results struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use a bit more specific name than docker.Results, maybe SearchResults.
And this is a single result, a member of an array of results, so it should be named in singular.
(Non-blocking: ”by the v1 endpoint“ seems incorrect as far as the public type of the return value of SearchRegistries goes (but noting that the type matches the v1 search result type exactly would be fine).)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
docker/docker_client.go
Outdated
| IsOfficial bool `json:"is_official"` | ||
| } | ||
|
|
||
| // SearchRegistries queries a list of registries for a matching image |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should document a bit more what image means. Is it a substring? regex? something else? (Surprisingly, the Python implementation in https://github.com/docker/docker-registry/blob/master/docker_registry/lib/index/db.py#L156, but not r.a.rh.com, seems to accept a fragment of SQL LIKE syntax!! I guess that wasn’t intentional 😉 ) https://docs.docker.com/engine/reference/commandline/search/#parent-command at least says “a name containing …“.
Similarly for limit; and shouldn’t that be an integer?
And what are the keys/values in the returned map? Is the order of the members in the array significant?
Non-blocking because authoritative documentation may not be available. (See also below WRT searching a single registry vs. a list of them.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Documented. Well limit needs to be a string in the query so I convert it before sending it over here. Can leave it as in integer if that is preferred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the semantics of the limit field is an integer, it would be cleaner for the caller to pass an integer. Converting an integer value into a string is for SearchRegistries to worry about, just like URL-encoding the query and asking over HTTP are things internal to SearchRegistries and not exposed to callers.
s/number of queries/number of results/ perhaps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
docker/docker_client.go
Outdated
| for _, reg := range registries { | ||
| url := reg | ||
| if strings.Contains(reg, "docker") { | ||
| url = "index.docker.io" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition would match local-docker-mirror.example.com as well; this needs to be an exact match.
Also, the registry value passed to newDockerClientWithDetails is used for looking up certificates, and with newDockerClientForRef also for looking up usernames/passwords. So the mapping to index.docker.io needs to happen after these lookups, perhaps somewhere around the dockerHostname/dockerRegistry special case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took this out of here and doing it in podman. User has to send the actual registry url for the search to happen. Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does mapping the hostname in podman work with the certificate and username/password lookup?
(I guess it doesn’t, strictly speaking, matter, because we assume the search/catalog endpoints to be completely unauthenticated… still, it seems unclean.)
docker/docker_client.go
Outdated
| url = "index.docker.io" | ||
| } | ||
|
|
||
| logrus.Debugf("pinging v2 endpoint\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Nit: This seems left over, the same message is logged inside getRegistryData.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
docker/docker_client.go
Outdated
| // SearchRegistries queries a list of registries for a matching image | ||
| func SearchRegistries(ctx context.Context, sCtx *types.SystemContext, registries []string, image, limit string) (map[string][]Results, error) { | ||
| searchResults := make(map[string][]Results) | ||
| for _, reg := range registries { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a benefit to having a registries array, and returning a map of results? It doesn’t hurt, really, but this could just as well be a SearchRegistry(… registry string …) ([]Results, error), with the loop in the caller, and it seems to me that this might be simpler for both the caller and the callee.
AFAICT the caller needs to make a loop through registries, or through the keys of the returned map, anyway, just to process the results, and:
- the single-registry-seach variant would mean that the caller wouldn’t have to worry about
result[registry]being unset - it could also return an
errorspecific to that registry, allowing the caller to freely decide whether to abort or continue searching other registries; the multi-registry variant makes it difficult to do correct error reporting if 2 out of the 5 searched registries fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, changed to SearchRegistry, where only 1 registry is being searched.
docker/docker_client.go
Outdated
| logrus.Debugf("pinging v2 endpoint\n") | ||
| data, err := getRegistryData(ctx, sCtx, url, image, limit) | ||
| if err != nil { | ||
| fmt.Printf("couldn't search registry %q: %v", reg, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use logrus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed anymore. Fixed
docker/docker_client.go
Outdated
| } | ||
|
|
||
| // getREgistryData talks to either the v2 or v1 endpoint to get the results of the search query | ||
| func getRegistryData(ctx context.Context, sCtx *types.SystemContext, registry, image, limit string) (*Data, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(If this function continues to exist, the name should be somehow related to searching; getRegistryData could be pretty much anything.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this function. Fixed
docker/docker_client.go
Outdated
| logrus.Debugf("pinging v1 endpoint\n") | ||
| resp, err = client.makeRequest(ctx, "GET", "/v1/search?q="+image+"&n="+limit, nil, nil, true) | ||
| if err != nil || resp.StatusCode != http.StatusOK { | ||
| return nil, errors.Errorf("error getting response from endpoint") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should preserve the available data about the failure, i.e. either err or res.StatusCode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT the error, and non-200 status, is still thrown away. Sure, the error message will have to be ugly if both v1 and v2 fail, but for debugging it would be nice to return something a bit useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed. Logging the errors and response code now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, logging the two results is much clearer than trying to cram the data into an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it from logrus.Errorf to logrus.Debugf because the error was being logged always and I want it to only show if someone enables debugging. Hope that is acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The principal case which I’m thinking about here is the user mistyping a hostname, or the network being down (guessing with absolutely no evidence that these are the most likely kinds of failures); in that case it would be nice to tell the user something useful about the failure by default.
Still, a Debugf is acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, so using a Debugf to log any errors occurring, but if both the v1 and v2 endpoint fail an error is returned. The user can then enable debugging for further information.
docker/docker_client.go
Outdated
| } | ||
| if isV1 { | ||
| // can talk to v1 registry if doing a search | ||
| if isV1 && !isSearch { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
… Ah, I see what’s going on: The currently relevant registries like docker.io and registry.access.redhat.com also/primarily serve content using the docker/distribution /v2/ API as well, so we never get here.
If we are happy enough to limit ourselves such V2 + /v1/search registries, the allowV1 boolean does not need to exist at all.
BTW the https://docs.docker.com/v1.4/reference/api/docker-io_api/ document you found says that only Authorization: Basic is supported, making both the ping and maybe all of setupRequestAuth seemingly unnecessary. OTOH https://github.com/docker/docker-registry/blob/master/docker_registry/toolkit.pyhttps://github.com/docker/docker-registry/blob/master/docker_registry/toolkit.py#L276L276 does something entirely different, an X-Signature header or Authorization: Token — and /v1/search does not require any authentication anyway.
Similarly /v1/search requires authentication on neither index.docker.io nor r.a.rh.com.
moby/moby seems to support both v2-like and v1-like authentication for the /v1/search URL. I guess we’ll eventually find out which ones, if any, matter; that doesn't need to block this PR.
docker/docker_client.go
Outdated
|
|
||
| results := []Results{} | ||
| for _, repo := range data.Repositories { | ||
| if strings.Contains(repo, image) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this respect limit? (I don’t have a strong opinion.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The limit is only for the v1 endpoint. I am changing the output in podman based on the number of results I get. Could change the variable to v1Limit, however the limit query doesn't always work. For example it doesn't work for the redhat registry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, we can let the caller deal with excess results. (Or we could truncate them in this function, for both v1 and v2; both would be consistent. But throwing away received data seems wasteful.)
65775e5 to
af5c4b6
Compare
docker/docker_client.go
Outdated
| return v1Res.Results, nil | ||
| } | ||
| defer resp.Body.Close() | ||
| logrus.Errorf("error getting search results from v2 endpoint %q, status code %q: %v", registry, resp.StatusCode, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/v2/v1/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
docker/docker_client.go
Outdated
| return nil, errors.Errorf("error getting response from endpoint") | ||
| } | ||
| } | ||
| defer resp.Body.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don’t think it’s correct. Now, with
resp, err := client.makeRequest(…)
if err == nil && resp.StatusCode == http.StatusOK {
…
return …
}
defer resp.Body.Close()resp.Bodyis not closed on theStatusOKpathdefer resp.Body.Closeis still called iferr != nil, and will crash iferr != nil && resp == nil.
This needs to be something like
resp, err := client.makeRequest(…)
if err != nil {
// Failed, log err
} else {
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
// Failed, log resp
} else {
// OK, process resp
}
}, however clumsy that looks; which is also why a makeGETRequest which hides the error handling might be worthwhile (not sure).
|
|
||
| if registry == "docker.io" { | ||
| registry = "index.docker.io" | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Non-blocking: Consider using the existing dockerHostname constant, and adding a new one for the index.docker.io host name.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
|
|
||
| if registry == "docker.io" { | ||
| registry = "index.docker.io" | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This still does not give the original host name to the dockerCertDir lookup newDockerClientWithDetails. But it seems we don’t need to authenticate that way for docker.io, and reworking the code to do the mapping later seems pointlessly difficult right now, so I guess let’s leave it as it is. The API caller has a simple interface, and that’s the important thing.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This also causes us to use the V2 API against the V1 host name. Again, it works, and doing it right is more difficult. Might be worth at least a comment noting that this is not strictly following the API, and that this is for simplicity of the implementation [and not a workaround for some obscure bug which needs to be left unmodified in the future].)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comment.
docker/docker_client.go
Outdated
| return v1Res.Results, nil | ||
| } | ||
| defer resp.Body.Close() | ||
| logrus.Debugf("error getting search results from v2 endpoint %q, status code %q: %v", registry, resp.StatusCode, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/v2/v1/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
docker/docker_client.go
Outdated
| if err != nil { | ||
| return nil, errors.Wrapf(err, "error making request") | ||
| } | ||
| defer resp.Body.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work? The defer should execute when makeGETRequest returns, preventing further reads from resp.Body.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup it didn't work, my bad. I decided to do it the way you gave an example of with a bunch of if-else statements. It looks cluttered but that seems the best and easiest way of doing it for now.
docker/docker_client.go
Outdated
| } | ||
| defer resp.Body.Close() | ||
| if resp.StatusCode != http.StatusOK { | ||
| return nil, errors.Errorf("not OK, status code %q", resp.StatusCode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it would make a difference.
| } else { | ||
| defer resp.Body.Close() | ||
| if resp.StatusCode != http.StatusOK { | ||
| logrus.Debugf("error getting search results from v1 endpoint %q, status code %q", registry, resp.StatusCode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going back/forth, should this be an Error rather than Debugf? Just wondering what the end user would see if this is hit and if they'd know what was up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I know they get the error below, just wondering if it would help to get the Status code there too.... Like I said, I'm on the fence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have it as a debug instead because the error was being printed out everytime this failed. And since it is possible for us to fail with the v2 endpoint and then move on to try the v1 endpoint, we don't want to fill the output with a bunch of error statements.
If both the v2 and v1 endpoints fail, an error is returned and then the user can enable debugging for further information on what is actually going wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the status code will be gibberish if the error is not nil, not sure though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the age old too little vs too much info problem. Thanks for the 411 and you've tipped me to the debug side. At some point in time it might be nice to get a blog together showing how you can use --debug to get more info if things are going south.
TomSweeneyRedHat
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, one question for possible consideration and adjustment later if necessary.
|
@runcom PTAL |
podman search searches a registry for a matching image this adds the functionality to support that some registries respond to the v2 endpoint while others only respond to the v1 endpoint. This checks both endpoints for a result, and if none is given the user is informed. Signed-off-by: umohnani8 <[email protected]>
podman search searches a registry for a matching image
this adds the functionality to support that
some registries respond to the v2 endpoint while others
only respond to the v1 endpoint.
This checks both endpoints for a result, and if none is given
the user is informed.
Signed-off-by: umohnani8 [email protected]