Explore translating latest to a digest and possibly then other tags to improve reproducibility #3707

Closed
consideRatio opened this issue Feb 7, 2024 · 4 comments
@consideRatio
Contributor

consideRatio commented Feb 7, 2024

latest tags are practical, but using them comes with a key compromise: you won't know what image version you ended up using. Maybe the latest tag, at the time the image was pulled, pointed to the same image as the tag 2024-02-07, but you wouldn't know.

If we could let a started user server know exactly which image it was started with, via an image digest, we may be able to improve reproducibility.

Tech reading

@consideRatio
Contributor Author

Using kubectl to get digest of a container

When a container is started, it's possible to see from the k8s Pod's containerStatuses both which image was referenced and which image was actually used, the latter given as an image digest:

  # ...
  containerStatuses:
  - containerID: containerd://8c82914618e37233be782768182cb322877ea61a5f21c440467d1ff18ea3005e
    image: quay.io/jupyterhub/configurable-http-proxy:4.6.1
    imageID: quay.io/jupyterhub/configurable-http-proxy@sha256:fd916f75415f1e7e813c5a18b34a6042a601604938ff8777b044447efb3bd819
  # ...
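As a minimal sketch of reading this programmatically, the function below pulls the referenced image and the resolved digest out of a Pod object that has already been parsed from JSON (for example from `kubectl get pod <name> -o json`). The function name `resolved_digests` and the container name `chp` are assumptions made for illustration; the image values are the ones from the output above.

```python
def resolved_digests(pod: dict) -> dict:
    """Map container name -> referenced image and resolved digest.

    Expects a Pod object as a dict. Containers that have not started yet
    may have an empty imageID, in which case the digest is None.
    """
    result = {}
    for status in pod.get("status", {}).get("containerStatuses", []):
        image_id = status.get("imageID", "")
        # imageID looks like "<repository>@sha256:<hex>"; keep the part after "@".
        digest = image_id.split("@", 1)[1] if "@" in image_id else None
        result[status["name"]] = {"image": status["image"], "digest": digest}
    return result


# Example Pod fragment. The container name "chp" is an assumption.
pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "chp",
                "image": "quay.io/jupyterhub/configurable-http-proxy:4.6.1",
                "imageID": "quay.io/jupyterhub/configurable-http-proxy@sha256:"
                "fd916f75415f1e7e813c5a18b34a6042a601604938ff8777b044447efb3bd819",
            }
        ]
    }
}

print(resolved_digests(pod)["chp"]["digest"])
```

This only works from outside the container, with access to the k8s API, which is the limitation discussed in the next comment.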

@consideRatio
Contributor Author

consideRatio commented Feb 7, 2024

Exposing imageID to the running container - no native fix available yet

It is not possible to use the "Downward API" to get the container's imageID, because the Downward API doesn't support providing such information. Support for this is tracked in kubernetes/kubernetes#80346, but since it's a problem for any container orchestration tool, an even more foundational improvement idea is tracked via opencontainers/runtime-spec#1105.

While it's possible to grant the container permissions to use kubectl to self-inspect etc., it's not worth pursuing, as it would put constraints on the image and add notable complexity.

@consideRatio
Contributor Author

Using k8s mutating webhooks

It's possible to register a "mutating webhook" that changes the specification of, for example, a k8s Pod before it is fully registered by the k8s api-server and thereafter scheduled to a node and started.

Such a modification could look for containers with image tags like lookup-latest and try to convert them into a digest to reference instead. This is a hacky workaround that adds significant complexity as well, making it too complicated to flourish as a well-documented solution adopted by ourselves and others, so I'd say it's not worth pursuing.
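To make the idea concrete, here is a rough sketch of the mutation step only, not a deployable webhook. It takes an `admission.k8s.io/v1` AdmissionReview request as a dict and returns a response carrying a base64-encoded JSONPatch that swaps container images. The names `build_admission_response` and `example_resolve` are hypothetical, and the tag-to-digest mapping is hard-coded for illustration where a real webhook would need to query a registry.

```python
import base64
import json


def build_admission_response(review: dict, resolve) -> dict:
    """Build an AdmissionReview response that pins container images.

    `resolve` is a caller-supplied function (hypothetical) that returns a
    digest-pinned image reference, or None to leave the image untouched.
    Only spec.containers is handled here; initContainers are ignored.
    """
    request = review["request"]
    patch = []
    for i, container in enumerate(request["object"]["spec"]["containers"]):
        pinned = resolve(container["image"])
        if pinned is not None and pinned != container["image"]:
            patch.append(
                {"op": "replace", "path": f"/spec/containers/{i}/image", "value": pinned}
            )

    response = {"uid": request["uid"], "allowed": True}
    if patch:
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()
    return {"apiVersion": "admission.k8s.io/v1", "kind": "AdmissionReview", "response": response}


def example_resolve(image: str):
    """Illustrative stand-in for a registry lookup (mapping is an assumption)."""
    known = {
        "quay.io/jupyterhub/configurable-http-proxy:lookup-latest": (
            "quay.io/jupyterhub/configurable-http-proxy@sha256:"
            "fd916f75415f1e7e813c5a18b34a6042a601604938ff8777b044447efb3bd819"
        )
    }
    return known.get(image)


review = {
    "apiVersion": "admission.k8s.io/v1",
    "kind": "AdmissionReview",
    "request": {
        "uid": "example-uid",
        "object": {
            "spec": {
                "containers": [
                    {"name": "chp", "image": "quay.io/jupyterhub/configurable-http-proxy:lookup-latest"}
                ]
            }
        },
    },
}

print(json.dumps(build_admission_response(review, example_resolve), indent=2))
```

Even with this part written, the webhook would still need to be served over TLS and registered with the api-server.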

Complexities are:

  • An entire application needs to be developed, deployed, and registered as a mutating webhook against k8s api-server
  • Documentation specific to integrating with the webhook needs to be provided to our communities, but it's conditional on having this webhook deployed and functional
  • If you look up a tag, it can have multiple digests depending on the CPU architecture it will run on, but you may not know the CPU architecture before the pod is scheduled.
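The last point can be illustrated with the shape of an OCI image index (a multi-architecture manifest list), where one tag maps to one manifest digest per platform. The sketch below picks the digest for a given architecture; the function name `digest_for_platform` is an assumption and the digests in the example are obvious placeholders, not real images.

```python
def digest_for_platform(index: dict, architecture: str, os: str = "linux"):
    """Return the manifest digest for a platform from an OCI image index.

    Returns None when the index has no entry for the requested platform.
    """
    for manifest in index.get("manifests", []):
        platform = manifest.get("platform", {})
        if platform.get("architecture") == architecture and platform.get("os") == os:
            return manifest["digest"]
    return None


# Placeholder index: the digests are made up for illustration.
index = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {"digest": "sha256:" + "a" * 64, "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:" + "b" * 64, "platform": {"architecture": "arm64", "os": "linux"}},
    ],
}

print(digest_for_platform(index, "arm64"))
```

A webhook running before scheduling would have to guess the `architecture` argument, or pin to the digest of the index itself where the registry and runtime support that.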

@consideRatio
Contributor Author

consideRatio commented Feb 7, 2024

Conclusion from exploration

I don't think it's feasible at this time to have a user server started with the latest tag and then get k8s etc. to figure out what image digest, or tag besides latest, ended up being used.

Unless kubernetes/kubernetes#80346 is resolved in a way that supports this, I think there is no good path forward. One could consider providing automation/helpers that let JupyterHub/KubeSpawner never reference latest, and instead translate it to a specific image tag up front, so that k8s doesn't need to do this. I suggest this isn't pursued either, due to the complexity/value trade-off.
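For reference, the string-handling half of such a helper might look like the sketch below, which rewrites an image reference to point at a known digest. The function name `pin_to_digest` is hypothetical, and how the digest is obtained (a registry lookup) is deliberately left out, since that is where most of the complexity would sit.

```python
def pin_to_digest(image: str, digest: str) -> str:
    """Replace the tag (or existing digest) of an image reference with `digest`."""
    # Drop an existing digest, if any.
    name = image.split("@", 1)[0]
    # A tag is a ":" that comes after the last "/", so a registry port
    # such as "localhost:5000/img" is not mistaken for a tag.
    if name.rfind(":") > name.rfind("/"):
        name = name[: name.rfind(":")]
    return f"{name}@{digest}"


print(
    pin_to_digest(
        "quay.io/jupyterhub/configurable-http-proxy:latest",
        "sha256:fd916f75415f1e7e813c5a18b34a6042a601604938ff8777b044447efb3bd819",
    )
)
```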

@github-project-automation github-project-automation bot moved this from Needs Shaping / Refinement to Complete in DEPRECATED Engineering and Product Backlog Feb 7, 2024
@consideRatio consideRatio changed the title Explore translating latest to a digest, and possibly then other tags, to improve reproducibility Explore translating latest to a digest and possibly then other tags to improve reproducibility Feb 7, 2024
@consideRatio consideRatio self-assigned this Feb 7, 2024