golang:1.12-alpine does not have layers for linux/arm64/v8 #269

Closed
moikot opened this issue Mar 14, 2019 · 24 comments

@moikot

moikot commented Mar 14, 2019

The result of running

docker run --rm weshigbee/manifest-tool inspect golang:1.12-alpine
...
4    Mfst Type: application/vnd.docker.distribution.manifest.v1+json
4       Digest: sha256:7cf1f7ccf392bd834eb91f02892f48992d3c2ba292c2198315a4637bb9454c30
4  Mfst Length: 6981
4     Platform:
4           -      OS: linux
4           -    Arch: arm64
4           - Variant: v8
4           - Feature:
4     # Layers: 0
...

Please note that the number of layers is zero. The same is true for golang:1.11-alpine.
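
For reference, here's one way to list each entry's media type in the manifest list (a sketch assuming crane and jq are installed; neither was used in the report above):

$ crane manifest golang:1.12-alpine \
    | jq -r '.manifests[] | [.platform.os, .platform.architecture, .platform.variant // "-", .mediaType] | @tsv'

Any entry with application/vnd.docker.distribution.manifest.v1+json is a schema 1 manifest, which is what manifest-tool reports above as zero layers.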

@wglambert

Looks like an instance of docker-library/official-images#3835

The manifest is updated now:

$ docker pull arm64v8/golang:1.12-alpine
1.12-alpine: Pulling from arm64v8/golang
3b00a3925ee4: Already exists 
7809c1a4c8e2: Pull complete 
8c00b1d46f44: Pull complete 
955cc90a48f7: Pull complete 
72f16051d572: Pull complete 
Digest: sha256:05f1d1242721f0042550fcc84f35cbe87f39ef5e6f75852d0608b92f4a2d1878
Status: Downloaded newer image for arm64v8/golang:1.12-alpine

@moikot
Author

moikot commented Mar 14, 2019

The interesting thing is that I could pull 1.12 and 1.11 by specifying the arm64v8 prefix without a problem, but I can't run my builds with BuildKit because it relies entirely on the published manifests.
I've just checked again: no layers for 1.12 or 1.11 on arm64 😞 (1.10 is OK)

Here is the buildkit's error:

error: failed to solve: rpc error: code = Unknown desc = failed to copy: httpReaderSeeker: failed open: could not fetch content descriptor sha256:7cf1f7ccf392bd834eb91f02892f48992d3c2ba292c2198315a4637bb9454c30 (application/vnd.docker.distribution.manifest.v1+json) from remote: not found

@yosifkit
Member

yosifkit commented Mar 14, 2019

After much digging: the manifest is technically fine. The entry in question is a v1 manifest:

4 Mfst Type: application/vnd.docker.distribution.manifest.v1+json

That entry works fine for docker pull, though. We have no control over what the Hub gives us back when we push an image. The Hub gave us "500 Internal Server Error" twice while trying to push the Alpine images of golang on the 7th of March. Inspecting the pushed artifacts in arm64v8/golang shows that some were pushed as v2 and some as v1. This makes me think that Docker Hub was having issues and our docker client "helpfully" downgraded us to a v1 push by assuming it was an old registry.

cc @tianon, perhaps we need to somehow disable v1 pushing on our build jobs (this won't fix the current images, but it will prevent this in the future). edit: this would probably require custom patches to docker itself 😞
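
As a quick per-tag check (again a sketch, assuming crane and jq; a schema 1 manifest body carries "schemaVersion": 1 and no top-level mediaType, while schema 2 carries 2 plus a mediaType):

$ crane manifest arm64v8/golang:1.12-alpine | jq '{schemaVersion, mediaType}'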

@tianon
Member

tianon commented Mar 14, 2019

This is partly a bug in manifest-tool, so I've filed that: estesp/manifest-tool#74

I'm really not very keen on patching Docker for our builders; that's kind of heinous. 😞 😱

@tianon
Member

tianon commented Mar 14, 2019

(Especially given that this was a blip in the Hub that caused these to be pushed in the first place, and docker pull can handle them just fine.)

@tonistiigi

Could someone fix the current version of the golang:1.12-alpine image, please? No official image created in the last 3 years should ever be schema1.

cc @dmcgowan For possible docker push side validation for this.

@tianon
Member

tianon commented Mar 16, 2019

Is it still? There was a bump since then that should've pushed a new image that isn't schema1.

@tonistiigi

@tianon Yes, golang:1.12-alpine is OK now; golang:1.12.0-alpine still produces the issue. So it's not as critical anymore (not sure whether you would usually bump the previous versions as well in these cases). We should still find a solution where schema1 variants can never happen, though, even if some error occurs in the process.

@tianon
Member

tianon commented Mar 16, 2019

I'm kind of surprised the Hub still allowed a schema1 push 😅

@tonistiigi

FWIW, it seems likely to me that a manifest list pointing to a v1 manifest will never be supported by containerd. Tracked in containerd/containerd#3100

@thaJeztah
Contributor

I'm kind of surprised the Hub still allowed a schema1 push 😅

@cowsrule any ideas? ^^

@cowsrule

Also surprised; we should be blocking all v1 pushes, and I will follow up internally. We also published this today: https://engineering.docker.com/2019/03/registry-v1-api-deprecation/

@tonistiigi

@cowsrule The issue here is with a v2/schema1 image, not the actual v1 registry API

@wglambert

Closing with estesp/manifest-tool#75

@jonjohnsonjr

I have seen this failure for several other images, so I don't think it's quite fixed? @thaJeztah @tianon

For example, nginx:1.19.0-alpine:

$ crane manifest nginx:1.19.0-alpine | jq . | grep v1 -C 5
      },
      "size": 1360
    },
    {
      "digest": "sha256:eff196a3849ad6541fd3afe676113896be214753740e567575bb562986bd2cd4",
      "mediaType": "application/vnd.docker.distribution.manifest.v1+json",
      "platform": {
        "architecture": "arm64",
        "os": "linux",
        "variant": "v8"
      },

This looks like a down-converted schema 1 image that was built on 2020-06-02:

$ crane manifest nginx@sha256:eff196a3849ad6541fd3afe676113896be214753740e567575bb562986bd2cd4 | jq .history[0].v1Compatibility -r | jq .created
"2020-06-02T16:44:38.301452064Z"

I was thinking about just scanning the official images to look for all manifest lists that reference schema 1 images, but I'm pretty sure I would hit rate limits almost immediately 😄 -- I'm sure this would be an easier query for a Docker Hub maintainer than something to discover via the registry API, if someone is interested in enumerating and fixing these (can we backfill?).
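
A sketch of such a scan over a hand-picked (purely illustrative) set of references, assuming crane and jq; anything exhaustive over the official images would still run into rate limits:

$ for ref in golang:1.12.0-alpine nginx:1.19.0-alpine rabbitmq:3.7.12-alpine; do
    crane manifest "$ref" | jq -r --arg ref "$ref" \
      '.manifests[]? | select(.mediaType == "application/vnd.docker.distribution.manifest.v1+json")
         | "\($ref) \(.platform.os)/\(.platform.architecture) \(.digest)"'
  done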

I'm kind of surprised Docker Hub accepts this, but I understand this is a valid manifest list per the spec. I'm more surprised Docker Hub is still accepting schema 1 image uploads.

This is tangentially related to opencontainers/distribution-spec#212 (comment) @justincormack

@thaJeztah
Contributor

I believe the official NGINX image is now maintained by NGINX, Inc. (@tianon probably knows). Not at my computer right now, but I'm wondering whether they use some different build system to build the images that produces these 🤔

@tianon
Member

tianon commented Nov 18, 2020

No official-images maintainer builds/pushes their own images -- all official images are built with very stock docker build + docker push, so this has to be the result of a hiccup similar to what happened previously in this thread (either the Hub downgrading the manifest to schema1, or more likely, the Hub having a hiccup, docker push deciding to push schema1, and the Hub accepting it 😞).

@jonjohnsonjr

jonjohnsonjr commented Nov 18, 2020

the official NGINX image ...

Here's another non-nginx image:

$ crane manifest rabbitmq:3.7.12-alpine | jq . | grep v1 -C 6
      "platform": {
        "architecture": "386",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v1+json",
      "size": 20767,
      "digest": "sha256:d28eecb03b78d2335f7c276d806a77b878684aef8ba2af10d116d7fc860c2ced",
      "platform": {
        "architecture": "ppc64le",
        "os": "linux"
      }
$ crane manifest rabbitmq:3.7.12-alpine@sha256:d28eecb03b78d2335f7c276d806a77b878684aef8ba2af10d116d7fc860c2ced | jq .history[0].v1Compatibility -r | jq .created
"2019-03-08T07:32:22.327078372Z"

Edit: I see this was from the timeframe when this was fixed. What's the policy on updating old images?

It would be great if docker or manifest-tool had a flag to disable pulling/pushing of schema 1 images (just check the Content-Type header and abort if it's schema 1), so that this would fail loudly instead of silently producing manifest lists like this.
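
Until such a flag exists, a fail-loudly check can be bolted on client-side; a minimal sketch assuming crane and jq (the wrapper itself is hypothetical, not an existing flag of either tool):

# abort before doing anything with schema 1 content
sv="$(crane manifest "$IMAGE" | jq -r '.schemaVersion')"
if [ "$sv" = "1" ]; then
  echo "refusing: $IMAGE is a schema 1 manifest" >&2
  exit 1
fi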

@thaJeztah
Contributor

thaJeztah commented Nov 18, 2020

What's the policy on updating old images?

I don't think old images are updated; they're mostly kept as an archive of older versions (and I'm not sure it's worth the effort to update them).

I think overall the problem being discussed in opencontainers/distribution-spec#212 (comment) is the automatic conversion of content based on Accept headers; if that were disabled, it would "simply" mean serving all content as-is.

(thinking out loud) The only "problem" with that could be that an image that could previously be pulled as a v2 schema 1 image (through automatic conversion) would no longer be available in that format, and would only be available as a v2 schema 2 image; however, any current client should be able to use v2 schema 2 images. Of course, a pre-announcement would be needed, and some time window for users to make sure they're running "current" versions (between big quotes, as "current" means upgraded in the last 4-5 years)

edit: 4-5 years; "off by one"
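
For illustration, the Accept-based conversion can be observed directly against the registry API; a sketch assuming curl and jq, with golang:1.12-alpine as an arbitrary example:

$ TOKEN="$(curl -s 'https://auth.docker.io/token?service=registry.docker.io&scope=repository:library/golang:pull' | jq -r .token)"
$ curl -sI -H "Authorization: Bearer $TOKEN" \
    -H 'Accept: application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.docker.distribution.manifest.v2+json' \
    https://registry-1.docker.io/v2/library/golang/manifests/1.12-alpine | grep -i '^content-type'

Sending no Accept header (or only the schema 1 media type) is what makes the registry fall back to serving a down-converted schema 1 manifest.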

@tianon
Member

tianon commented Nov 18, 2020

So I guess more concretely, should Hub be rejecting pushes of v2/schema1 images now?
(given it's deprecated in the client, and only used as a fallback, and appears to happen sometimes accidentally when there's an issue with pushing?)
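
For comparison, the open-source distribution registry (not necessarily what Hub runs internally) already gates this behind configuration; a sketch of the relevant config.yml fragment, where leaving schema 1 support disabled causes such pushes to be rejected:

compatibility:
  schema1:
    enabled: false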

@jonjohnsonjr

Of course, a pre-announcement would be needed, and some time-window for users to make sure they're running "current" versions (between big quotes, as "current" means; upgraded in the last 4-5 years)

Absolutely. It would be great to get the ball rolling on this soon-ish if we all agree it needs to happen eventually.

So I guess more concretely, should Hub be rejecting pushes of v2/schema1 images now?

Rejecting schema 1 pushes is a good start. We see a very small number of schema 1 pushes, and I expect they might be unintentional, due to bugs like this. Docker put out a deprecation a while ago, and I'm curious whether there's already a process in place, or a plan, or what the next steps are.

Target For Removal In Release: v20.10

So I'm guessing newer clients will stop being able to push and pull schema 1 images, but will Docker Hub ever stop doing the down-conversion?

I'm trying to decide whether it's worth the effort to fix a client that doesn't support manifest lists that reference schema 1 images, or whether I should just wait, since the problem will eventually go away.

@thaJeztah
Contributor

@jonjohnsonjr to help get the ball rolling, could you open a ticket in our roadmap? https://github.com/docker/roadmap/issues

I think there's a review session for the roadmap tomorrow, and I can bring it up there.

@jonjohnsonjr

Sure! Done: docker/roadmap#173

@thaJeztah
Contributor

Thank you!
