Provide a portable mechanism for processes within a container to obtain their image and container IDs #1105
Comments
OTOH this goes against the idea of isolating containers from the system. A common unprivileged container has no business knowing where on the host its files are located (or even if they are located on the host), and the easiest definitions of image/container IDs would probably point to specific host paths for the image store and per-container mounts. My impression of the logging use cases is that they both can, and for better security should, be handled by a node-level log forwarder/collector, not by each container annotating its own output. Doing this in a collector is:
If this needs to exist at all, I’d prefer standardizing the format and semantic guarantees (e.g. is the ID required to be stable / required to be different across reboots? across hosts in what kind of domain?), but only making them available if the user deploying the container explicitly opts in. |
"A common unprivileged container has no business knowing where on the host its files are located (or even if they are located on the host), and the easiest definitions of image/container IDs would probably point to specific host paths for the image store and per-container mounts." I completely agree with your remark regardind container isolation, however I am not proposing exposing any location information whatsoever, I am only proposing exposing the image digest value and the container instance id, and not any host (or other) path information. Where the container filesystem is located on a host is not useful for any of the use cases, the identity of the image, and an instance of it are, and while an external logger could potentially inject these values, this makes the task of gathering all |
At least in one implementation an image (config) digest directly points to a node path. Of course, actually exploiting that would require a sandbox breakout, but it's a piece of the puzzle. |
Just to be clear, I am proposing exposing the value of:
|
Another option, instead of environment variables, would be mounting a file inside the container. That file could be built in tmpfs and mounted read-only to a well-known location. And I think we all agree that anything we do to implement this should only be on the container side (and associated image); no host-level details like paths, hostname, or IP address of the host running the container should be visible from inside the container. |
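For illustration only, here is a minimal sketch of how a process might consume such an engine-mounted, read-only file. The path `/run/container-info` and the JSON field names are assumptions invented for this example; no specification defines them today.

```go
// Sketch of a consumer for a hypothetical engine-mounted metadata file.
// The path /run/container-info and the field names are assumptions for
// illustration only; no specification defines them today.
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

type containerInfo struct {
	ImageDigest string `json:"imageDigest"` // e.g. "sha256:..."
	ContainerID string `json:"containerID"`
}

func readContainerInfo(path string) (*containerInfo, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err // file absent: the engine did not opt in
	}
	var info containerInfo
	if err := json.Unmarshal(data, &info); err != nil {
		return nil, err
	}
	return &info, nil
}

func main() {
	info, err := readContainerInfo("/run/container-info")
	if err != nil {
		fmt.Fprintln(os.Stderr, "no container metadata available:", err)
		return
	}
	fmt.Printf("image=%s container=%s\n", info.ImageDigest, info.ContainerID)
}
```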
I really like this proposal (I think it's also "compatible" with the mechanism employed by the k8s downward API). I would also +1 your suggestion regarding adding the image tags to this metadata, since those are also known to, and significant for, the components and tools that form the container ecosystem; I had actually wondered about those as well. |
Related is this old issue from Docker: moby/moby#8427 |
Compare GHSA-c3xm-pvg7-gh7r where an attacker benefits from knowing the pod ID in Kubernetes. |
I like this one. I am working on observability. Sometimes it is hard to solve a problem at the container-infrastructure level: Kubernetes has the downward API that can expose Pod metadata as container env variables, but it cannot expose a container ID; it can only expose something like the pod name. However, the pod name provides only a little help in identifying the container, since the pod name is not unique and may be reused on k8s, like the pods created by a StatefulSet. And to find a mapping from the container name to the container ID, we need to search the pod scheduling logs or watch the k8s API. There are other ways, like deploying a sidecar that can access the k8s API or the container daemon to fetch the container ID, but that wastes resources (we just want a container ID) and brings security concerns. Therefore, these approaches are not perfect and only work on k8s. What about containers running on bare metal? I did find a way to fetch the container ID from within a container by reading the cgroup file. I would like to help settle on a standard way to fetch the container ID from within the container, but I am not sure whether anyone from the community is working on this. Is there anything I can do to make it happen? |
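For reference, a minimal sketch of the cgroup-file hack mentioned above, assuming the cgroup v1 layout used by some runtimes (a line such as `12:memory:/docker/<64-hex-id>` in `/proc/self/cgroup`). This is exactly the fragile, runtime-specific heuristic the thread is trying to replace.

```go
// Best-effort extraction of a container ID from /proc/self/cgroup.
// Relies on runtime-specific path layouts (e.g. ".../docker/<64-hex-id>"),
// so it works for some runtimes and silently fails for others.
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
)

var idPattern = regexp.MustCompile(`[0-9a-f]{64}`)

func containerIDFromCgroup(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		// cgroup v1 lines look like "12:memory:/docker/<id>"; look for a 64-char hex ID.
		if id := idPattern.FindString(scanner.Text()); id != "" {
			return id, nil
		}
	}
	return "", fmt.Errorf("no container ID found in %s", path)
}

func main() {
	id, err := containerIDFromCgroup("/proc/self/cgroup")
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not determine container ID:", err)
		os.Exit(1)
	}
	fmt.Println(id)
}
```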
We are facing the same problem: we want to collect core dumps in containers, as well as some JVM logs and metrics, and we need to distinguish the different lifecycles of a single container. We need to know the container ID before starting our business process; then we can write JVM logs and metrics to a path with a container ID suffix. |
@opencontainers/runtime-spec-maintainers PTAL |
As there is no standard for the "image ID" and the "container ID", I'd suggest using the Kubernetes downward API: https://kubernetes.io/docs/concepts/workloads/pods/downward-api/ The downward API doesn't seem to support the image ID, though, but I guess it is open to negotiation if somebody needs it. Non-Kubernetes engines may also opt in to follow a similar convention. |
Related issue on Kubernetes side: kubernetes/kubernetes#80346 |
There is also a patch to define the "container ID" on the kernel side for auditing |
To clarify a little bit the hesitation from the runtime spec maintainers here (hopefully, if I'm off-base, the maintainers who disagree with me will pipe up!): the spec today does not currently have any concept of an image ID, or even an image at all (it's closer to a very souped-up …). To put this in more practical terms, the runtime spec implementation is … |
@sudo-bmitch thanks for your guidance during the call :-) @AkihiroSuda, thanks for your fast response. Using the Kubernetes downward API is another option I am pursuing right now; @mitar raised a KEP a while back for the imageID (although I am more interested in the containerID), and I want to get this conversation going again. However, this would only fix the issue for Kubernetes and not for all the other setups out there. To add a little bit of context: we are currently trying to get container.id detection available across OpenTelemetry SDKs to enable correlation between infra and application telemetry; some PRs/issues on that:
Overall there are 2 "hacks" right now, depending on the cgroup version:
This works for some, but of course not all, container runtimes, and @XSAM figured out that in k8s those IDs might be "incorrect", coming from a prior pause container. So it's a hack, and I was hoping for a reliable way. To be honest, I was not aware that there is no concept of an image ID / container ID; I just assumed this was a "given". Thanks for the clarification @tianon & @AkihiroSuda, it helps me understand why this is a complicated issue. Again, the end goal I have in mind is a reliable way to eventually connect application telemetry and container (+ other infrastructure) telemetry. |
@tianon thanks, that makes sense, and I would agree that expanding the spec's scope is not to be undertaken lightly, if at all. Having said that, this is an issue that affects both managed runtimes (in containers) such as the JVM (and others) and simple applications as well. Longer term, I believe this needs to be solved in a portable manner such that orchestrators are able to communicate this to containers without the containers having to determine which orchestrator is orchestrating them. |
Without knowing all the inner workings of the runtime spec, I am wondering if something like the following would work:
E.g. there MAY be an environment variable OCI_ID that may hold details to identify the container uniquely. Or there MAY be a file /etc/container-id (just making that up for Linux, well aware that this is probably not the right place) holding details that allow identification of the runtime. Again, apologies, I am not the expert (yet) on choosing the right words, but I hope I can bring my idea across. Edit: This would provide, as a minimum, a canonical place where this data may be dropped off by the container engine and where a monitoring/observability solution can find it. Much better than the 2 places we have right now as a hack. |
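A minimal sketch of what a consumer of such an optional feature might look like. OCI_ID and /etc/container-id are just the illustrative names from the comment above, not part of any existing specification, and the lookup degrades gracefully when neither is present.

```go
// Best-effort lookup of the optional identifiers proposed above.
// OCI_ID and /etc/container-id are illustrative names from the comment,
// not defined by any specification; a consumer must work without them.
package main

import (
	"fmt"
	"os"
	"strings"
)

func lookupContainerID() (string, bool) {
	// Preferred: an environment variable set by the engine, if it opted in.
	if id, ok := os.LookupEnv("OCI_ID"); ok && id != "" {
		return id, true
	}
	// Fallback: a well-known file mounted read-only by the engine, if present.
	if data, err := os.ReadFile("/etc/container-id"); err == nil {
		if id := strings.TrimSpace(string(data)); id != "" {
			return id, true
		}
	}
	return "", false
}

func main() {
	if id, ok := lookupContainerID(); ok {
		fmt.Println("container id:", id)
	} else {
		fmt.Println("no container id exposed; telemetry continues without it")
	}
}
```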
@AkihiroSuda @tianon @sudo-bmitch any thoughts on my last comment? Would there be a way to have such an optional feature in the runtime spec, so implementations are not forced to have it, but those who do have a fixed place to put it, and consumers (= observability/monitoring) can look for it but will still work without it? |
I think it should be addressed at a higher level. It shouldn't end up in the runtime-spec, which deals only with the lower-level stuff. From the runtime-spec PoV it is either an ENV variable or a bind mount. |
thanks. When you say "higher level", what are the potential candidates from your point of view? |
I think this kind of logic should go into the container engine, or really anything that calls the OCI runtime (Podman, Docker, containerd, CRI-O...) |
I agree (and agree with most of what's said); I don't think this should be defined as part of the runtime spec. I can see the use case(s) for having more information available from within the container, but care should be taken; |
That said, I can think of (and know of) various use cases where (some amount of) information is useful, and currently there's no formalized / portable way (there are options, such as the aforementioned Kubernetes downward API, but they depend on the ecosystem). From that perspective, I could see value in some specification for introspection if there's common ground (orchestrated/non-orchestrated), and if such a spec would be flexible / modular / portable enough to be useful for different ecosystems. I should probably also mention that various discussions have led to "responsibility of higher level runtimes", which currently isn't a formal concept; at times this feels like a gap in the existing specifications, which makes me wonder if there would be room for a specification around that (it may be hard to find common ground on that, but perhaps?). |
This is exactly the reason why I was hoping to make it part of the runtime spec: changing it here would be a one-time thing, whereas changing it at higher levels would require me to chase down each and every engine or orchestrator and propose that they implement something like that (sure, it will get easier the moment the "big" ones are on board). Monitoring/observability (and everything adjacent to it, like auditing, verifiability, etc.) always comes with breaking the design principle of separation of concerns (the OpenTelemetry spec states exactly that), because what I want to know is which of my applications running in which container on which cluster on which hardware in which datacenter is the troublemaker. So, yes, ideally a process should not be aware that it is containerised, but if I want to do monitoring I have no other choice than to have this information present, somehow. Of course there are also ways to enrich this information later, but they come with their own (security) issues. I get that this is expanding the scope of the runtime spec, and I understand that it is a slippery slope, so as said initially, I was hoping to make it possible here, but I am aware that the answer is probably a "No" and this needs to be solved somewhere else. |
I'd like this expanded to provide everything for the OpenTelemetry container spec. Each OTel vendor does data correlation differently, so it's best to have all the data to be certain it works. |
+1 |
@thaJeztah @giuseppe @tianon @AkihiroSuda any final call on this? It looks like the answer is "out of scope", so I will try to find a way to get this accomplished somewhere at a higher level. |
That issue is talking about the image ID; for the container ID I think you meant: |
You're right, I quoted the one for imageID because my initial assumption was that both could be treated equally, but I might be wrong here, since imageID is available before the container is created. |
The image ID is also far less interesting, because you can just inject this value yourself - it's part of the pod's static metadata (e.g. kubernetes/kubernetes#80346 (comment)). The container ID, on the other hand, is completely dynamic and changes every time the container restarts - which can happen multiple times during the lifetime of the pod (OOM, liveness failure, etc.). |
That is not true. If you use anything except a digest-based image ID in your pod specification, then you cannot really know which version of the image was fetched to run your container. A label/tag can point to different images at different times. For debugging it is critical to know exactly which version was running. |
Fair point, but this is a very good reason to do as we do and always use digest-based image IDs :) |
If it works for your workflow. But sometimes it is OK to pick the latest image, whichever it is; you just want to know which one was picked. |
FYI, I followed the suggestion to raise this with container engines and opened containerd/containerd#8185 with the containerd project |
There are a number of use cases that require the ability to portably obtain a container's image identity and its instance identity.
The most obvious is logging: in larger systems with a significant number of container instances and image versions, it is useful for the processes therein to be able to "tag" their log output with both identities, which as a tuple uniquely distinguish a particular instance from all of its peers.
Other use cases involve platform runtime serviceability components (e.g. in the Java platform), where this information can also be used by the platform to correlate multiple forensic artifacts that may be generated over the lifetime of particular container versions and instances.
This could simply be provided by requiring that the runtime expose those identifiers to the processes therein by establishing a pair of standard environment variables such as OCI_IMAGE_ID and OCI_INSTANCE_ID, or similar.
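A minimal sketch of the logging use case, assuming a runtime that sets the proposed OCI_IMAGE_ID and OCI_INSTANCE_ID variables; these names come from this issue and are not part of any standard today.

```go
// Sketch: tag every log line with the (image, instance) tuple so a collector
// can attribute output to one specific container instance among its peers.
// OCI_IMAGE_ID and OCI_INSTANCE_ID are the names proposed in this issue only.
package main

import (
	"log"
	"os"
)

func main() {
	imageID := os.Getenv("OCI_IMAGE_ID")       // e.g. an image config digest
	instanceID := os.Getenv("OCI_INSTANCE_ID") // e.g. the container/instance ID

	logger := log.New(os.Stdout, "image="+imageID+" instance="+instanceID+" ", log.LstdFlags)
	logger.Println("application started")
}
```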