Skip to content

Conversation

@umohnani8
Copy link
Member

podman search searches a registry for a matching image
this adds the functionality to support that
some registries respond to the v2 endpoint while others
only respond to the v1 endpoint.
This checks both endpoints for a result, and if none is given
the user is informed.

Signed-off-by: umohnani8 [email protected]

@umohnani8
Copy link
Member Author

@mtrmac @runcom PTAL

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A very quick look; I didn’t really read SearchRegistries at all.

// detectProperties detects various properties of the registry.
// See the dockerClient documentation for members which are affected by this.
func (c *dockerClient) detectProperties(ctx context.Context) error {
func (c *dockerClient) detectProperties(ctx context.Context, isSearch bool) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a name for the parameter which describes what the parameter means/does, not what it is used for; allowV1 or something like that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

if resp.StatusCode != http.StatusOK {
return nil, nil, errors.Errorf("error getting response from endpoint")
}
if isV2 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this is split, with the URL path hard-coded in the caller and the data format hard-coded in the callee, is weird. Maybe instead the front part of getRegistryData should become a “GET + check for StatusOK” helper?

Why do the V1 and V2 requests use a separate dockerClient anyway?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

restructured it quite a bit. Should be fixed now.

func getRegistryData(ctx context.Context, sCtx *types.SystemContext, registry, path string, isV2 bool) (*V2Data, *V1Data, error) {
v2Data := &V2Data{}
v1Data := &V1Data{}
newLoginClient, err := newDockerClientWithDetails(sCtx, registry, "", "", "", nil, "")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

“login” seems incorrect. Maybe just client, or even c?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for client

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

}
if isV1 {
// can talk to v1 registry if doing a search
if isV1 && !isSearch {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I’m being stupid but I can’t see how this works; here err != nil necessarily because we got into this branch, so this only changes what is the reported error but not that the method fails, i.e. newLoginClient.makeRequest should AFAICT fail for V1.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I understand, detectPropertires() is pinging a v2 and a v1 endpoint to detect the api type of the registry. When v2 fails and it detects its a v1, it fails it as we didn't want to support v1. But with search, if it detects its a v1 it shouldn't fail.
I was failing before I added the condition to allowV1 if its the search call.
Maybe there is another way to handle it that I am missing.

Copy link
Collaborator

@mtrmac mtrmac Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

… Ah, I see what’s going on: The currently relevant registries like docker.io and registry.access.redhat.com also/primarily serve content using the docker/distribution /v2/ API as well, so we never get here.

If we are happy enough to limit ourselves such V2 + /v1/search registries, the allowV1 boolean does not need to exist at all.


BTW the https://docs.docker.com/v1.4/reference/api/docker-io_api/ document you found says that only Authorization: Basic is supported, making both the ping and maybe all of setupRequestAuth seemingly unnecessary. OTOH https://github.com/docker/docker-registry/blob/master/docker_registry/toolkit.pyhttps://github.com/docker/docker-registry/blob/master/docker_registry/toolkit.py#L276L276 does something entirely different, an X-Signature header or Authorization: Token — and /v1/search does not require any authentication anyway.

Similarly /v1/search requires authentication on neither index.docker.io nor r.a.rh.com.

moby/moby seems to support both v2-like and v1-like authentication for the /v1/search URL. I guess we’ll eventually find out which ones, if any, matter; that doesn't need to block this PR.

@umohnani8 umohnani8 force-pushed the search branch 2 times, most recently from 0f16d96 to 2d733a3 Compare January 19, 2018 18:09
}

// Data holds the output of both the v2 and v1 endpoints
type Data struct {
Copy link
Collaborator

@mtrmac mtrmac Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v1 and v2 should not share a type when the contents can/should never appear together in returned results. This is still quite confusing. Please just do the straightforward thing of implementing two independent requests in sequence; then figure out which parts can be shared / shortened.

Maybe (with functionality inline/in separate functions as appropriate for size/readability/repetitions):

type v1Results struct {…}
type v2Results struct {…}


for _, reg := range registries {
    client :=logrus.Debug("…v2…")
    body, err := c.makeGETRequest("…/v2/…")
    if err == nil {
       /* decode into a v2Results variable */
       searchResults[reg] = …
    } else {
        logrus.Debug("…v1…")
        body, err := c.makeGETRequest("…/v1/…")
        if err == nil {
          /* decode into a v1Results variable */
          searchResults[reg] = …
        } /* else $something */
    }

with a hypothetical dockerClient.makeGETRequest which does makeRequest+ the StatusOK check. Maybe that will turn out not to be worth it. Or, maybe, something like

func (c *dockerClient) makeGETJSONRequest(ctx context.Context, path string, dest interface{}) error

which does makeRequest, the StatusOK check, and json.Unmarshal; that could be used in a few existing cases as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed using the first method you suggested.

}

// Results holds the information of each matching image returned by the v1 endpoint
type Results struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a bit more specific name than docker.Results, maybe SearchResults.

And this is a single result, a member of an array of results, so it should be named in singular.

(Non-blocking: ”by the v1 endpoint“ seems incorrect as far as the public type of the return value of SearchRegistries goes (but noting that the type matches the v1 search result type exactly would be fine).)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

IsOfficial bool `json:"is_official"`
}

// SearchRegistries queries a list of registries for a matching image
Copy link
Collaborator

@mtrmac mtrmac Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should document a bit more what image means. Is it a substring? regex? something else? (Surprisingly, the Python implementation in https://github.com/docker/docker-registry/blob/master/docker_registry/lib/index/db.py#L156, but not r.a.rh.com, seems to accept a fragment of SQL LIKE syntax!! I guess that wasn’t intentional 😉 ) https://docs.docker.com/engine/reference/commandline/search/#parent-command at least says “a name containing …“.

Similarly for limit; and shouldn’t that be an integer?

And what are the keys/values in the returned map? Is the order of the members in the array significant?

Non-blocking because authoritative documentation may not be available. (See also below WRT searching a single registry vs. a list of them.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documented. Well limit needs to be a string in the query so I convert it before sending it over here. Can leave it as in integer if that is preferred.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the semantics of the limit field is an integer, it would be cleaner for the caller to pass an integer. Converting an integer value into a string is for SearchRegistries to worry about, just like URL-encoding the query and asking over HTTP are things internal to SearchRegistries and not exposed to callers.

s/number of queries/number of results/ perhaps.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

for _, reg := range registries {
url := reg
if strings.Contains(reg, "docker") {
url = "index.docker.io"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition would match local-docker-mirror.example.com as well; this needs to be an exact match.

Also, the registry value passed to newDockerClientWithDetails is used for looking up certificates, and with newDockerClientForRef also for looking up usernames/passwords. So the mapping to index.docker.io needs to happen after these lookups, perhaps somewhere around the dockerHostname/dockerRegistry special case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took this out of here and doing it in podman. User has to send the actual registry url for the search to happen. Fixed.

Copy link
Collaborator

@mtrmac mtrmac Jan 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does mapping the hostname in podman work with the certificate and username/password lookup?

(I guess it doesn’t, strictly speaking, matter, because we assume the search/catalog endpoints to be completely unauthenticated… still, it seems unclean.)

url = "index.docker.io"
}

logrus.Debugf("pinging v2 endpoint\n")
Copy link
Collaborator

@mtrmac mtrmac Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Nit: This seems left over, the same message is logged inside getRegistryData.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

// SearchRegistries queries a list of registries for a matching image
func SearchRegistries(ctx context.Context, sCtx *types.SystemContext, registries []string, image, limit string) (map[string][]Results, error) {
searchResults := make(map[string][]Results)
for _, reg := range registries {
Copy link
Collaborator

@mtrmac mtrmac Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a benefit to having a registries array, and returning a map of results? It doesn’t hurt, really, but this could just as well be a SearchRegistry(… registry string …) ([]Results, error), with the loop in the caller, and it seems to me that this might be simpler for both the caller and the callee.

AFAICT the caller needs to make a loop through registries, or through the keys of the returned map, anyway, just to process the results, and:

  • the single-registry-seach variant would mean that the caller wouldn’t have to worry about result[registry] being unset
  • it could also return an error specific to that registry, allowing the caller to freely decide whether to abort or continue searching other registries; the multi-registry variant makes it difficult to do correct error reporting if 2 out of the 5 searched registries fail.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, changed to SearchRegistry, where only 1 registry is being searched.

logrus.Debugf("pinging v2 endpoint\n")
data, err := getRegistryData(ctx, sCtx, url, image, limit)
if err != nil {
fmt.Printf("couldn't search registry %q: %v", reg, err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use logrus.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed anymore. Fixed

}

// getREgistryData talks to either the v2 or v1 endpoint to get the results of the search query
func getRegistryData(ctx context.Context, sCtx *types.SystemContext, registry, image, limit string) (*Data, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(If this function continues to exist, the name should be somehow related to searching; getRegistryData could be pretty much anything.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this function. Fixed

logrus.Debugf("pinging v1 endpoint\n")
resp, err = client.makeRequest(ctx, "GET", "/v1/search?q="+image+"&n="+limit, nil, nil, true)
if err != nil || resp.StatusCode != http.StatusOK {
return nil, errors.Errorf("error getting response from endpoint")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should preserve the available data about the failure, i.e. either err or res.StatusCode.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT the error, and non-200 status, is still thrown away. Sure, the error message will have to be ugly if both v1 and v2 fail, but for debugging it would be nice to return something a bit useful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed. Logging the errors and response code now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, logging the two results is much clearer than trying to cram the data into an error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it from logrus.Errorf to logrus.Debugf because the error was being logged always and I want it to only show if someone enables debugging. Hope that is acceptable.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The principal case which I’m thinking about here is the user mistyping a hostname, or the network being down (guessing with absolutely no evidence that these are the most likely kinds of failures); in that case it would be nice to tell the user something useful about the failure by default.

Still, a Debugf is acceptable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so using a Debugf to log any errors occurring, but if both the v1 and v2 endpoint fail an error is returned. The user can then enable debugging for further information.

}
if isV1 {
// can talk to v1 registry if doing a search
if isV1 && !isSearch {
Copy link
Collaborator

@mtrmac mtrmac Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

… Ah, I see what’s going on: The currently relevant registries like docker.io and registry.access.redhat.com also/primarily serve content using the docker/distribution /v2/ API as well, so we never get here.

If we are happy enough to limit ourselves such V2 + /v1/search registries, the allowV1 boolean does not need to exist at all.


BTW the https://docs.docker.com/v1.4/reference/api/docker-io_api/ document you found says that only Authorization: Basic is supported, making both the ping and maybe all of setupRequestAuth seemingly unnecessary. OTOH https://github.com/docker/docker-registry/blob/master/docker_registry/toolkit.pyhttps://github.com/docker/docker-registry/blob/master/docker_registry/toolkit.py#L276L276 does something entirely different, an X-Signature header or Authorization: Token — and /v1/search does not require any authentication anyway.

Similarly /v1/search requires authentication on neither index.docker.io nor r.a.rh.com.

moby/moby seems to support both v2-like and v1-like authentication for the /v1/search URL. I guess we’ll eventually find out which ones, if any, matter; that doesn't need to block this PR.


results := []Results{}
for _, repo := range data.Repositories {
if strings.Contains(repo, image) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this respect limit? (I don’t have a strong opinion.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The limit is only for the v1 endpoint. I am changing the output in podman based on the number of results I get. Could change the variable to v1Limit, however the limit query doesn't always work. For example it doesn't work for the redhat registry.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, we can let the caller deal with excess results. (Or we could truncate them in this function, for both v1 and v2; both would be consistent. But throwing away received data seems wasteful.)

@umohnani8 umohnani8 force-pushed the search branch 4 times, most recently from 65775e5 to af5c4b6 Compare January 24, 2018 16:50
return v1Res.Results, nil
}
defer resp.Body.Close()
logrus.Errorf("error getting search results from v2 endpoint %q, status code %q: %v", registry, resp.StatusCode, err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/v2/v1/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

return nil, errors.Errorf("error getting response from endpoint")
}
}
defer resp.Body.Close()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don’t think it’s correct. Now, with

	resp, err := client.makeRequest(…)
	if err == nil && resp.StatusCode == http.StatusOK {
                …
		return …
	}
	defer resp.Body.Close()
  1. resp.Body is not closed on the StatusOK path
  2. defer resp.Body.Close is still called if err != nil, and will crash if err != nil && resp == nil.

This needs to be something like

resp, err := client.makeRequest(…)
if err != nil {
    // Failed, log err
} else {
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        // Failed, log resp
    } else {
        // OK, process resp
    }
}

, however clumsy that looks; which is also why a makeGETRequest which hides the error handling might be worthwhile (not sure).


if registry == "docker.io" {
registry = "index.docker.io"
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Non-blocking: Consider using the existing dockerHostname constant, and adding a new one for the index.docker.io host name.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


if registry == "docker.io" {
registry = "index.docker.io"
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This still does not give the original host name to the dockerCertDir lookup newDockerClientWithDetails. But it seems we don’t need to authenticate that way for docker.io, and reworking the code to do the mapping later seems pointlessly difficult right now, so I guess let’s leave it as it is. The API caller has a simple interface, and that’s the important thing.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This also causes us to use the V2 API against the V1 host name. Again, it works, and doing it right is more difficult. Might be worth at least a comment noting that this is not strictly following the API, and that this is for simplicity of the implementation [and not a workaround for some obscure bug which needs to be left unmodified in the future].)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comment.

return v1Res.Results, nil
}
defer resp.Body.Close()
logrus.Debugf("error getting search results from v2 endpoint %q, status code %q: %v", registry, resp.StatusCode, err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/v2/v1/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

if err != nil {
return nil, errors.Wrapf(err, "error making request")
}
defer resp.Body.Close()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work? The defer should execute when makeGETRequest returns, preventing further reads from resp.Body.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup it didn't work, my bad. I decided to do it the way you gave an example of with a bunch of if-else statements. It looks cluttered but that seems the best and easiest way of doing it for now.

}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
return nil, errors.Errorf("not OK, status code %q", resp.StatusCode)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Would this be better with client.HandleErrorResponse? Maybe it doesn’t matter, but see #390 / #409 for other users who care.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it would make a difference.

@mtrmac
Copy link
Collaborator

mtrmac commented Jan 26, 2018

👍

Thanks!

Approved with PullApprove

} else {
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
logrus.Debugf("error getting search results from v1 endpoint %q, status code %q", registry, resp.StatusCode)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going back/forth, should this be an Error rather than Debugf? Just wondering what the end user would see if this is hit and if they'd know what was up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I know they get the error below, just wondering if it would help to get the Status code there too.... Like I said, I'm on the fence.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have it as a debug instead because the error was being printed out everytime this failed. And since it is possible for us to fail with the v2 endpoint and then move on to try the v1 endpoint, we don't want to fill the output with a bunch of error statements.
If both the v2 and v1 endpoints fail, an error is returned and then the user can enable debugging for further information on what is actually going wrong.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the status code will be gibberish if the error is not nil, not sure though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the age old too little vs too much info problem. Thanks for the 411 and you've tipped me to the debug side. At some point in time it might be nice to get a blog together showing how you can use --debug to get more info if things are going south.

Copy link
Member

@TomSweeneyRedHat TomSweeneyRedHat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one question for possible consideration and adjustment later if necessary.

@umohnani8
Copy link
Member Author

@runcom PTAL

podman search searches a registry for a matching image
this adds the functionality to support that
some registries respond to the v2 endpoint while others
only respond to the v1 endpoint.
This checks both endpoints for a result, and if none is given
the user is informed.

Signed-off-by: umohnani8 <[email protected]>
@runcom
Copy link
Member

runcom commented Feb 5, 2018

LGTM

Approved with PullApprove

@runcom runcom merged commit 2524e50 into containers:master Feb 5, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants