Summary
As someone who wants to add traceability of the Docker image used to build an artifact
I want a standard way of modeling Docker images available under many different names (e.g. debian:unstable and debian:unstable-20241016).
So that I can track which exact image was used in the build but also which image:tag was requested.
Context
There are cases when artifacts are available under multiple names or aliases. The main case I want to solve is Docker (OCI) images, but there might be similar use cases with other kinds of artifacts. Such images have an identity based on the SHA-256 digest of part of the image's contents (the manifest.json file, I believe). They can also have zero, one, or more names for convenience, and those names make it possible to locate and download an image.
Image names have the general form [repository/]name[:tag], e.g. debian:unstable or registry.example.com/debian:unstable. These two example names could at some point refer to the same image, but tags are mutable so the meaning may change over time. It's also common for images to have multiple tags to suit different use cases. For example, debian:latest points to the latest Debian image no matter what, debian:unstable points to the latest in the unstable series, and debian:unstable-20241016 points to a specific release in the unstable series (and is probably immutable).
An ArtC can have an ENVIRONMENT link to an ED, which in turn can have a RUNTIME_ENVIRONMENT link to another ArtC. The current limitation is that it isn't clear how to describe the image used to produce the artifact. You could link to an ArtC with data.identity containing the SHA-256 identity of the image, but then you can't express the intention of the environment, i.e. the requested image. In other words, you can trace the exact image but not how we got there. It could matter whether the artifact was produced in an environment based on debian:unstable or debian:unstable-20241016.
Exemplification
There are a couple of problems we can solve:
What do images with mutable names (like debian:unstable) resolve to over time?
What exact Docker image was used in the build that produced an artifact?
What image did the build ask for?
Drawbacks
No response
Out of Scope
While there might be other cases of artifacts known under different names, no concrete examples have been identified and we're not trying to solve that problem preemptively. However, the solutions that use ArtP to express the names might be useful for other types of artifacts too.
Further links
No response
Acceptance Criteria
No response
Implementation Ideas
We discussed this matter at a community meeting on 2024-11-07 with two initial suggestions and a third one devised at the very end of the meeting.
New event type
The first proposal requires adding a new event type that expresses that an existing ArtC is also available under another identity. This wasn't dismissed, but the notion of artifacts having multiple identities didn't sit right with everyone. It was expressed that a Docker image can only have one identity, one that includes the SHA-256.
Use ArtP to express image names
Instead of the new event type, it was suggested that we use ArtP as a way to convey the location of the manifest.json file. That URI can trivially be transformed from e.g. https://registry.example.com/v2/debian/manifests/unstable to the corresponding string accepted by docker pull (registry.example.com/debian:unstable). To make this meaning of the URI clearer, a new OCI_MANIFEST value for the data.locations.type enum seems like a good idea.
The drawback of this suggestion is that we don't get to know the requested image, only the concrete one in the ArtC.
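The URI transformation mentioned above can be sketched in a few lines of Python. This is only an illustration, assuming the manifest URL follows the /v2/<name>/manifests/<reference> layout used in the example; the function name manifest_url_to_image_name is hypothetical and not part of any existing tool.

```python
from urllib.parse import urlsplit


def manifest_url_to_image_name(url: str) -> str:
    """Turn a manifest URL into the image name accepted by `docker pull`.

    Assumes the path has the form /v2/<name>/manifests/<reference>.
    """
    parts = urlsplit(url)
    prefix = "/v2/"
    marker = "/manifests/"
    if not parts.path.startswith(prefix) or marker not in parts.path:
        raise ValueError(f"not a manifest URL: {url}")

    # Image names may contain slashes, so split on the last "/manifests/".
    name, reference = parts.path[len(prefix):].rsplit(marker, 1)
    if not name or not reference:
        raise ValueError(f"not a manifest URL: {url}")

    # Tags cannot contain ":", so a colon means the reference is a digest,
    # which is joined to the name with "@" instead of ":".
    separator = "@" if ":" in reference else ":"
    return f"{parts.netloc}/{name}{separator}{reference}"


print(manifest_url_to_image_name(
    "https://registry.example.com/v2/debian/manifests/unstable"
))
```

The reverse direction is not as straightforward, since an image name without an explicit registry or tag relies on client-side defaults, which is one reason to store the full manifest URL in the event.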
Use ArtP to express image names and link to them from the ED
A variation of the previous suggestion that alleviates its main drawback is to allow the target of the RUNTIME_ENVIRONMENT link to be an ArtP, which isn't allowed right now.
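To make the resulting event graph concrete, here is a rough Python sketch of how the events could relate under this suggestion. It is not valid protocol output: the events are stripped down to a few fields, the OCI_MANIFEST location type and the RUNTIME_ENVIRONMENT link from an ED to an ArtP are the proposed (not yet existing) extensions, and the digest, the purl strings, the environment name, and the helpers new_event and follow are placeholders made up for illustration.

```python
import uuid


def new_event(event_type, data, links=()):
    """Build a stripped-down event; real events carry more meta fields."""
    return {
        "meta": {"id": str(uuid.uuid4()), "type": event_type},
        "data": data,
        "links": [{"type": t, "target": e["meta"]["id"]} for t, e in links],
    }


def follow(events, event, link_type):
    """Return the event that `event` links to with the given link type."""
    target = next(l["target"] for l in event["links"] if l["type"] == link_type)
    return events[target]


DIGEST = "sha256:" + "0" * 64  # placeholder, not a real image digest

# The image itself: identified by its digest only.
image = new_event(
    "EiffelArtifactCreatedEvent",
    {"identity": f"pkg:docker/debian@{DIGEST}"},
)
# One name for the image, expressed as a manifest URL (proposed location type).
name = new_event(
    "EiffelArtifactPublishedEvent",
    {"locations": [{
        "type": "OCI_MANIFEST",
        "uri": "https://registry.example.com/v2/debian/manifests/unstable",
    }]},
    links=[("ARTIFACT", image)],
)
# The environment points at the requested name (proposed: ED -> ArtP).
env = new_event(
    "EiffelEnvironmentDefinedEvent",
    {"name": "build-container"},
    links=[("RUNTIME_ENVIRONMENT", name)],
)
# The artifact that was built in that environment.
built = new_event(
    "EiffelArtifactCreatedEvent",
    {"identity": "pkg:generic/my-app@1.0.0"},
    links=[("ENVIRONMENT", env)],
)

events = {e["meta"]["id"]: e for e in (image, name, env, built)}

# What image did the build ask for, and what exact image was used?
requested = follow(events, follow(events, built, "ENVIRONMENT"), "RUNTIME_ENVIRONMENT")
exact = follow(events, requested, "ARTIFACT")
print(requested["data"]["locations"][0]["uri"])
print(exact["data"]["identity"])
```

With this shape, the requested image is one hop from the ED and the exact image is one more hop via the ARTIFACT link of the ArtP. If a mutable name is later published for a different image, that would be a new ArtP pointing at a different ArtC.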
This was discussed (again) at the 2024-12-05 community meeting with the following conclusions:
The third option, where the image name and registry location are expressed with the ArtP (implicitly, using the manifest URL) and which relies on extending ED to allow links to ArtP, was deemed the best option.
The ArtC for a container image shouldn't contain the registry URI since it goes against the spirit of the package URL. And if the registry URI is included in the purl, how would that be reconciled with the location given in the ArtP? What does it mean if they're different?
As a corollary to the previous item, there should be exactly one ArtC for a given container image.
There are clear parallels to how Git works. A commit is identified by its contents, not how it's accessed (typically via a branch or a tag).
While this all constitutes a minor protocol change, the best practices need to be thoroughly documented.