Add functionality to search registries #406

umohnani8 · 2018-01-18T17:46:55Z

podman search searches a registry for a matching image;
this adds the functionality to support that.
Some registries respond to the v2 endpoint while others
only respond to the v1 endpoint.
This checks both endpoints for a result, and if neither returns one,
the user is informed.

Signed-off-by: umohnani8 [email protected]

umohnani8 · 2018-01-18T17:59:41Z

@mtrmac @runcom PTAL

mtrmac

A very quick look; I didn’t really read SearchRegistries at all.

mtrmac · 2018-01-18T18:03:40Z

docker/docker_client.go

 // detectProperties detects various properties of the registry.
 // See the dockerClient documentation for members which are affected by this.
-func (c *dockerClient) detectProperties(ctx context.Context) error {
+func (c *dockerClient) detectProperties(ctx context.Context, isSearch bool) error {


Please use a name for the parameter which describes what the parameter means/does, not what it is used for; allowV1 or something like that.

mtrmac · 2018-01-18T18:08:21Z

docker/docker_client.go

+	if resp.StatusCode != http.StatusOK {
+		return nil, nil, errors.Errorf("error getting response from endpoint")
+	}
+	if isV2 {


The way this is split, with the URL path hard-coded in the caller and the data format hard-coded in the callee, is weird. Maybe instead the front part of getRegistryData should become a “GET + check for StatusOK” helper?

Why do the V1 and V2 requests use a separate dockerClient anyway?

restructured it quite a bit. Should be fixed now.

mtrmac · 2018-01-18T18:08:48Z

docker/docker_client.go

+func getRegistryData(ctx context.Context, sCtx *types.SystemContext, registry, path string, isV2 bool) (*V2Data, *V1Data, error) {
+	v2Data := &V2Data{}
+	v1Data := &V1Data{}
+	newLoginClient, err := newDockerClientWithDetails(sCtx, registry, "", "", "", nil, "")


“login” seems incorrect. Maybe just client, or even c?

+1 for client

mtrmac · 2018-01-18T18:12:40Z

docker/docker_client.go

 		}
-		if isV1 {
+		// can talk to v1 registry if doing a search
+		if isV1 && !isSearch {


Maybe I’m being stupid but I can’t see how this works; here err != nil necessarily because we got into this branch, so this only changes what is the reported error but not that the method fails, i.e. newLoginClient.makeRequest should AFAICT fail for V1.

From what I understand, detectProperties() is pinging a v2 and a v1 endpoint to detect the API type of the registry. When v2 fails and it detects it's a v1, it fails, as we didn't want to support v1. But with search, if it detects it's a v1 it shouldn't fail.
It was failing before I added the condition to allow V1 if it's the search call.
Maybe there is another way to handle it that I am missing.

… Ah, I see what’s going on: The currently relevant registries like docker.io and registry.access.redhat.com also/primarily serve content using the docker/distribution /v2/ API as well, so we never get here.

If we are happy enough to limit ourselves to such V2 + /v1/search registries, the allowV1 boolean does not need to exist at all.

BTW the https://docs.docker.com/v1.4/reference/api/docker-io_api/ document you found says that only Authorization: Basic is supported, making both the ping and maybe all of setupRequestAuth seemingly unnecessary. OTOH https://github.com/docker/docker-registry/blob/master/docker_registry/toolkit.py#L276 does something entirely different, an X-Signature header or Authorization: Token, and /v1/search does not require any authentication anyway.

Similarly /v1/search requires authentication on neither index.docker.io nor r.a.rh.com.

moby/moby seems to support both v2-like and v1-like authentication for the /v1/search URL. I guess we’ll eventually find out which ones, if any, matter; that doesn't need to block this PR.

mtrmac · 2018-01-20T01:11:41Z

docker/docker_client.go

 }

+// Data holds the output of both the v2 and v1 endpoints
+type Data struct {


v1 and v2 should not share a type when the contents can/should never appear together in returned results. This is still quite confusing. Please just do the straightforward thing of implementing two independent requests in sequence; then figure out which parts can be shared / shortened.

Maybe (with functionality inline/in separate functions as appropriate for size/readability/repetitions):

    type v1Results struct {…}
    type v2Results struct {…}
    for _, reg := range registries {
        client := …
        logrus.Debug("…v2…")
        body, err := c.makeGETRequest("…/v2/…")
        if err == nil {
            /* decode into a v2Results variable */
            searchResults[reg] = …
        } else {
            logrus.Debug("…v1…")
            body, err := c.makeGETRequest("…/v1/…")
            if err == nil {
                /* decode into a v1Results variable */
                searchResults[reg] = …
            } /* else $something */
        }
    }

with a hypothetical dockerClient.makeGETRequest which does makeRequest + the StatusOK check. Maybe that will turn out not to be worth it. Or, maybe, something like

func (c *dockerClient) makeGETJSONRequest(ctx context.Context, path string, dest interface{}) error

which does makeRequest, the StatusOK check, and json.Unmarshal; that could be used in a few existing cases as well.

Fixed using the first method you suggested.
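
For reference, the second suggestion (a makeGETJSONRequest-style helper) might look roughly like the sketch below. It uses plain net/http rather than the PR's dockerClient.makeRequest, so getJSON, v2Results, and catalogFromTestServer are hypothetical names, and the /v2/_catalog response shape is an assumption made for illustration.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// getJSON is a hypothetical stand-in for the proposed makeGETJSONRequest:
// it issues a GET, checks for 200 OK, and decodes the JSON body into dest.
func getJSON(ctx context.Context, client *http.Client, url string, dest interface{}) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	// Deferring here is safe: resp is non-nil, and the body is fully
	// decoded before this function returns.
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("GET %s: unexpected status %d", url, resp.StatusCode)
	}
	return json.NewDecoder(resp.Body).Decode(dest)
}

// v2Results is an assumed shape for a catalog-style v2 response.
type v2Results struct {
	Repositories []string `json:"repositories"`
}

// catalogFromTestServer exercises getJSON against an in-process test server.
func catalogFromTestServer() (v2Results, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"repositories":["fedora","busybox"]}`)
	}))
	defer srv.Close()

	var res v2Results
	err := getJSON(context.Background(), srv.Client(), srv.URL+"/v2/_catalog", &res)
	return res, err
}

func main() {
	res, err := catalogFromTestServer()
	fmt.Println(res.Repositories, err)
}
```

Keeping the status check and the decoding in one place means each caller only has to supply a destination type, which is what would let the v1 and v2 paths stay independent.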

mtrmac · 2018-01-20T01:14:31Z

docker/docker_client.go

+}
+
+// Results holds the information of each matching image returned by the v1 endpoint
+type Results struct {


Please use a bit more specific name than docker.Results, maybe SearchResults.

And this is a single result, a member of an array of results, so it should be named in singular.

(Non-blocking: “by the v1 endpoint” seems incorrect as far as the public type of the return value of SearchRegistries goes (but noting that the type matches the v1 search result type exactly would be fine).)

mtrmac · 2018-01-20T01:16:05Z

docker/docker_client.go

+	IsOfficial  bool   `json:"is_official"`
+}
+
+// SearchRegistries queries a list of registries for a matching image


This should document a bit more what image means. Is it a substring? regex? something else? (Surprisingly, the Python implementation in https://github.com/docker/docker-registry/blob/master/docker_registry/lib/index/db.py#L156, but not r.a.rh.com, seems to accept a fragment of SQL LIKE syntax!! I guess that wasn’t intentional 😉 ) https://docs.docker.com/engine/reference/commandline/search/#parent-command at least says “a name containing …”.

Similarly for limit; and shouldn’t that be an integer?

And what are the keys/values in the returned map? Is the order of the members in the array significant?

Non-blocking because authoritative documentation may not be available. (See also below WRT searching a single registry vs. a list of them.)

Documented. Well, limit needs to be a string in the query, so I convert it before sending it over here. Can leave it as an integer if that is preferred.

If the semantics of the limit field is an integer, it would be cleaner for the caller to pass an integer. Converting an integer value into a string is for SearchRegistries to worry about, just like URL-encoding the query and asking over HTTP are things internal to SearchRegistries and not exposed to callers.

s/number of queries/number of results/ perhaps.
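
A minimal sketch of the point about keeping the integer-to-string conversion and URL encoding internal: the caller passes an int, and the query string is built inside the library. v1SearchPath is a hypothetical helper name; the q and n parameter names are taken from the /v1/search request in the diff.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// v1SearchPath is a hypothetical helper that builds the /v1/search path.
// The caller supplies limit as an int; converting it to a string and
// URL-encoding the search term are implementation details hidden here.
func v1SearchPath(term string, limit int) string {
	q := url.Values{}
	q.Set("q", term)
	q.Set("n", strconv.Itoa(limit))
	// Encode escapes the values and sorts the parameters by key.
	return "/v1/search?" + q.Encode()
}

func main() {
	fmt.Println(v1SearchPath("fedora image", 25)) // prints /v1/search?n=25&q=fedora+image
}
```

Building the query with url.Values also avoids the problem of a search term containing `&` or spaces, which plain string concatenation would pass through unescaped.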

mtrmac · 2018-01-20T01:21:30Z

docker/docker_client.go

+	for _, reg := range registries {
+		url := reg
+		if strings.Contains(reg, "docker") {
+			url = "index.docker.io"


The condition would match local-docker-mirror.example.com as well; this needs to be an exact match.

Also, the registry value passed to newDockerClientWithDetails is used for looking up certificates, and with newDockerClientForRef also for looking up usernames/passwords. So the mapping to index.docker.io needs to happen after these lookups, perhaps somewhere around the dockerHostname/dockerRegistry special case.

Took this out of here and doing it in podman. User has to send the actual registry url for the search to happen. Fixed.

How does mapping the hostname in podman work with the certificate and username/password lookup?

(I guess it doesn’t, strictly speaking, matter, because we assume the search/catalog endpoints to be completely unauthenticated… still, it seems unclean.)
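
To illustrate the exact-match concern, here is a small sketch under the assumption that only the literal host docker.io should be redirected to index.docker.io for search. The constant and function names are hypothetical.

```go
package main

import "fmt"

const (
	dockerHub      = "docker.io"       // hypothetical constant name
	dockerHubIndex = "index.docker.io" // hypothetical constant name
)

// searchHost maps the registry name to the host used for search requests.
// It compares for equality, so a name that merely contains "docker"
// (for example a local mirror) is left untouched.
func searchHost(registry string) string {
	if registry == dockerHub {
		return dockerHubIndex
	}
	return registry
}

func main() {
	fmt.Println(searchHost("docker.io"))
	fmt.Println(searchHost("local-docker-mirror.example.com"))
}
```

This only covers the hostname mapping itself; where the mapping happens relative to the certificate and credential lookups is the separate question raised above.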

mtrmac · 2018-01-20T01:22:09Z

docker/docker_client.go

+			url = "index.docker.io"
+		}
+
+		logrus.Debugf("pinging v2 endpoint\n")


(Nit: This seems left over, the same message is logged inside getRegistryData.)

mtrmac · 2018-01-20T01:57:08Z

docker/docker_client.go

+// SearchRegistries queries a list of registries for a matching image
+func SearchRegistries(ctx context.Context, sCtx *types.SystemContext, registries []string, image, limit string) (map[string][]Results, error) {
+	searchResults := make(map[string][]Results)
+	for _, reg := range registries {


Is there a benefit to having a registries array, and returning a map of results? It doesn’t hurt, really, but this could just as well be a SearchRegistry(… registry string …) ([]Results, error), with the loop in the caller, and it seems to me that this might be simpler for both the caller and the callee.

AFAICT the caller needs to make a loop through registries, or through the keys of the returned map, anyway, just to process the results, and:

the single-registry-search variant would mean that the caller wouldn’t have to worry about result[registry] being unset

it could also return an error specific to that registry, allowing the caller to freely decide whether to abort or continue searching other registries; the multi-registry variant makes it difficult to do correct error reporting if 2 out of the 5 searched registries fail.

Fixed, changed to SearchRegistry, where only 1 registry is being searched.
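
With a single-registry search function, the caller-side loop could be sketched as below. searchFunc, searchAll, and fakeSearch are hypothetical and stand in for the real SearchRegistry call; the point is only that each registry yields its own error, so the caller decides whether to continue.

```go
package main

import (
	"errors"
	"fmt"
)

// searchFunc is a hypothetical signature for a single-registry search.
type searchFunc func(registry, term string) ([]string, error)

// searchAll loops over registries in the caller, keeping per-registry
// results and per-registry errors separate instead of aborting on the
// first failure.
func searchAll(search searchFunc, registries []string, term string) (map[string][]string, map[string]error) {
	results := make(map[string][]string)
	failures := make(map[string]error)
	for _, reg := range registries {
		res, err := search(reg, term)
		if err != nil {
			failures[reg] = err
			continue
		}
		results[reg] = res
	}
	return results, failures
}

// fakeSearch simulates one unreachable registry for demonstration.
func fakeSearch(registry, term string) ([]string, error) {
	if registry == "down.example.com" {
		return nil, errors.New("connection refused")
	}
	return []string{registry + "/" + term}, nil
}

func main() {
	results, failures := searchAll(fakeSearch, []string{"a.example.com", "down.example.com"}, "fedora")
	fmt.Println(results, failures)
}
```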

mtrmac · 2018-01-20T02:15:52Z

docker/docker_client.go

+		logrus.Debugf("pinging v2 endpoint\n")
+		data, err := getRegistryData(ctx, sCtx, url, image, limit)
+		if err != nil {
+			fmt.Printf("couldn't search registry %q: %v", reg, err)


Please use logrus.

Not needed anymore. Fixed

mtrmac · 2018-01-20T02:17:59Z

docker/docker_client.go

+}
+
+// getREgistryData talks to either the v2 or v1 endpoint to get the results of the search query
+func getRegistryData(ctx context.Context, sCtx *types.SystemContext, registry, image, limit string) (*Data, error) {


(If this function continues to exist, the name should be somehow related to searching; getRegistryData could be pretty much anything.)

Removed this function. Fixed

mtrmac · 2018-01-20T02:19:03Z

docker/docker_client.go

+		logrus.Debugf("pinging v1 endpoint\n")
+		resp, err = client.makeRequest(ctx, "GET", "/v1/search?q="+image+"&n="+limit, nil, nil, true)
+		if err != nil || resp.StatusCode != http.StatusOK {
+			return nil, errors.Errorf("error getting response from endpoint")


This should preserve the available data about the failure, i.e. either err or resp.StatusCode.

AFAICT the error, and non-200 status, is still thrown away. Sure, the error message will have to be ugly if both v1 and v2 fail, but for debugging it would be nice to return something a bit useful.

fixed. Logging the errors and response code now.

Good idea, logging the two results is much clearer than trying to cram the data into an error.

I changed it from logrus.Errorf to logrus.Debugf because the error was being logged always and I want it to only show if someone enables debugging. Hope that is acceptable.

The principal case which I’m thinking about here is the user mistyping a hostname, or the network being down (guessing with absolutely no evidence that these are the most likely kinds of failures); in that case it would be nice to tell the user something useful about the failure by default.

Still, a Debugf is acceptable.

Right, so using a Debugf to log any errors occurring, but if both the v1 and v2 endpoint fail an error is returned. The user can then enable debugging for further information.
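
The behaviour agreed on here (log each endpoint failure at debug level, return an error only if every endpoint fails) can be sketched roughly as follows. The standard log package stands in for logrus.Debugf, and endpoint, searchWithFallback, and the stub functions are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// endpoint pairs a label ("v2", "v1") with a search function for it.
type endpoint struct {
	name   string
	search func(term string) ([]string, error)
}

// searchWithFallback tries each endpoint in order. Individual failures are
// only logged; an error is returned once all endpoints have failed.
func searchWithFallback(term string, endpoints ...endpoint) ([]string, error) {
	for _, ep := range endpoints {
		res, err := ep.search(term)
		if err == nil {
			return res, nil
		}
		// Stand-in for logrus.Debugf: visible only when debugging is enabled.
		log.Printf("debug: %s search for %q failed: %v", ep.name, term, err)
	}
	return nil, fmt.Errorf("couldn't search for %q: all endpoints failed", term)
}

// failing and ok are stubs simulating an unsupported and a working endpoint.
func failing(string) ([]string, error) { return nil, errors.New("status 404") }
func ok(term string) ([]string, error) { return []string{term}, nil }

func main() {
	res, err := searchWithFallback("fedora", endpoint{"v2", failing}, endpoint{"v1", ok})
	fmt.Println(res, err)
}
```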

mtrmac · 2018-01-20T02:41:47Z

docker/docker_client.go

 		}
-		if isV1 {
+		// can talk to v1 registry if doing a search
+		if isV1 && !isSearch {


mtrmac · 2018-01-20T02:58:04Z

docker/docker_client.go

+
+		results := []Results{}
+		for _, repo := range data.Repositories {
+			if strings.Contains(repo, image) {


Should this respect limit? (I don’t have a strong opinion.)

The limit is only for the v1 endpoint. I am changing the output in podman based on the number of results I get. Could change the variable to v1Limit, however the limit query doesn't always work. For example it doesn't work for the redhat registry.

Fair enough, we can let the caller deal with excess results. (Or we could truncate them in this function, for both v1 and v2; both would be consistent. But throwing away received data seems wasteful.)
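
If truncation inside the function were ever wanted, the substring filter from the hunk above could honour the limit along these lines. This is only a sketch: matchRepos is a hypothetical name, and treating a non-positive limit as "no limit" is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// matchRepos returns the repository names containing term, stopping once
// limit matches have been collected. A limit <= 0 means "no limit"
// (an assumption made for this sketch).
func matchRepos(repos []string, term string, limit int) []string {
	matches := []string{}
	for _, repo := range repos {
		if limit > 0 && len(matches) >= limit {
			break
		}
		if strings.Contains(repo, term) {
			matches = append(matches, repo)
		}
	}
	return matches
}

func main() {
	repos := []string{"fedora", "fedora-minimal", "busybox", "fedora-toolbox"}
	fmt.Println(matchRepos(repos, "fedora", 2))
}
```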

mtrmac · 2018-01-24T16:44:28Z

docker/docker_client.go

+		return v1Res.Results, nil
+	}
+	defer resp.Body.Close()
+	logrus.Errorf("error getting search results from v2 endpoint %q, status code %q: %v", registry, resp.StatusCode, err)


mtrmac · 2018-01-24T16:51:32Z

docker/docker_client.go

+			return nil, errors.Errorf("error getting response from endpoint")
+		}
+	}
+	defer resp.Body.Close()


I still don’t think it’s correct. Now, with

    resp, err := client.makeRequest(…)
    if err == nil && resp.StatusCode == http.StatusOK {
        … return …
    }
    defer resp.Body.Close()

resp.Body is not closed on the StatusOK path

defer resp.Body.Close is still called if err != nil, and will crash if err != nil && resp == nil.

This needs to be something like

    resp, err := client.makeRequest(…)
    if err != nil {
        // Failed, log err
    } else {
        defer resp.Body.Close()
        if resp.StatusCode != http.StatusOK {
            // Failed, log resp
        } else {
            // OK, process resp
        }
    }

, however clumsy that looks; which is also why a makeGETRequest which hides the error handling might be worthwhile (not sure).

mtrmac · 2018-01-24T16:59:37Z

docker/docker_client.go

+
+	if registry == "docker.io" {
+		registry = "index.docker.io"
+	}


(Non-blocking: Consider using the existing dockerHostname constant, and adding a new one for the index.docker.io host name.)

mtrmac · 2018-01-24T17:02:00Z

docker/docker_client.go

+
+	if registry == "docker.io" {
+		registry = "index.docker.io"
+	}


(This still does not give the original host name to the dockerCertDir lookup in newDockerClientWithDetails. But it seems we don’t need to authenticate that way for docker.io, and reworking the code to do the mapping later seems pointlessly difficult right now, so I guess let’s leave it as it is. The API caller has a simple interface, and that’s the important thing.)

(This also causes us to use the V2 API against the V1 host name. Again, it works, and doing it right is more difficult. Might be worth at least a comment noting that this is not strictly following the API, and that this is for simplicity of the implementation [and not a workaround for some obscure bug which needs to be left unmodified in the future].)

Added comment.

mtrmac · 2018-01-24T17:03:15Z

docker/docker_client.go

+		return v1Res.Results, nil
+	}
+	defer resp.Body.Close()
+	logrus.Debugf("error getting search results from v2 endpoint %q, status code %q: %v", registry, resp.StatusCode, err)


mtrmac · 2018-01-24T17:46:04Z

docker/docker_client.go

+	if err != nil {
+		return nil, errors.Wrapf(err, "error making request")
+	}
+	defer resp.Body.Close()


Does this work? The defer should execute when makeGETRequest returns, preventing further reads from resp.Body.

yup it didn't work, my bad. I decided to do it the way you gave an example of with a bunch of if-else statements. It looks cluttered but that seems the best and easiest way of doing it for now.
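
One way a helper can keep the deferred Close without breaking later reads is to consume the body before returning, as in the sketch below. It uses plain net/http instead of the PR's makeRequest; getBody and fetchFromTestServer are hypothetical names and the response payload is made up. The status code is formatted with %d, since %q applied to an integer prints a quoted character rather than the number.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// getBody is a hypothetical makeGETRequest-style helper. The body is read
// in full before the function returns, so the deferred Close cannot cut
// off a read that the caller still wants to do.
func getBody(client *http.Client, url string) ([]byte, error) {
	resp, err := client.Get(url)
	if err != nil {
		// resp must not be touched here; it may be nil.
		return nil, fmt.Errorf("error making request: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("not OK, status code %d", resp.StatusCode)
	}
	return io.ReadAll(resp.Body)
}

// fetchFromTestServer runs getBody against a test server that answers
// with the given status code.
func fetchFromTestServer(status int) ([]byte, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(status)
		fmt.Fprint(w, `{"results":[]}`)
	}))
	defer srv.Close()
	return getBody(srv.Client(), srv.URL+"/v1/search?q=fedora")
}

func main() {
	body, err := fetchFromTestServer(http.StatusOK)
	fmt.Println(string(body), err)
}
```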

mtrmac · 2018-01-24T17:58:37Z

docker/docker_client.go

+	}
+	defer resp.Body.Close()
+	if resp.StatusCode != http.StatusOK {
+		return nil, errors.Errorf("not OK, status code %q", resp.StatusCode)


(Would this be better with client.HandleErrorResponse? Maybe it doesn’t matter, but see #390 / #409 for other users who care.)

I don't think it would make a difference.

mtrmac · 2018-01-26T10:17:34Z

👍

Thanks!

TomSweeneyRedHat · 2018-01-26T15:30:44Z

docker/docker_client.go

+	} else {
+		defer resp.Body.Close()
+		if resp.StatusCode != http.StatusOK {
+			logrus.Debugf("error getting search results from v1 endpoint %q, status code %q", registry, resp.StatusCode)


I'm going back/forth, should this be an Error rather than Debugf? Just wondering what the end user would see if this is hit and if they'd know what was up.

FWIW, I know they get the error below, just wondering if it would help to get the Status code there too.... Like I said, I'm on the fence.

I have it as a debug instead because the error was being printed out every time this failed. And since it is possible for us to fail with the v2 endpoint and then move on to try the v1 endpoint, we don't want to fill the output with a bunch of error statements.
If both the v2 and v1 endpoints fail, an error is returned and then the user can enable debugging for further information on what is actually going wrong.

I think the status code will be gibberish if the error is not nil, not sure though.

Ah, the age old too little vs too much info problem. Thanks for the 411 and you've tipped me to the debug side. At some point in time it might be nice to get a blog together showing how you can use --debug to get more info if things are going south.

TomSweeneyRedHat

LGTM, one question for possible consideration and adjustment later if necessary.

umohnani8 · 2018-01-26T18:01:50Z

@runcom PTAL

runcom · 2018-02-05T17:12:16Z

LGTM

umohnani8 mentioned this pull request Jan 18, 2018

Add podman search command containers/podman#241

Closed

mtrmac reviewed Jan 18, 2018

umohnani8 force-pushed the search branch 2 times, most recently from 0f16d96 to 2d733a3 Compare January 19, 2018 18:09

mtrmac reviewed Jan 20, 2018

umohnani8 force-pushed the search branch 4 times, most recently from 65775e5 to af5c4b6 Compare January 24, 2018 16:50

mtrmac reviewed Jan 24, 2018

umohnani8 force-pushed the search branch from af5c4b6 to 66af018 Compare January 24, 2018 17:33

mtrmac reviewed Jan 24, 2018

umohnani8 force-pushed the search branch from 66af018 to a70a438 Compare January 24, 2018 20:14

TomSweeneyRedHat reviewed Jan 26, 2018

umohnani8 force-pushed the search branch from a70a438 to abca35a Compare February 1, 2018 15:19

runcom merged commit 2524e50 into containers:master Feb 5, 2018
