Example crictl run/runp fail on a machine with a running k8s CP #1696

Open
RonBarkan opened this issue Nov 22, 2024 · 6 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@RonBarkan

RonBarkan commented Nov 22, 2024

What happened:

On a Linux system with a successfully running single node Kubernetes control plane, with containerd, I am using the example run/runp commands here and here, and I am getting the following errors:

$ sudo crictl -r unix:///run/containerd/containerd.sock runp /tmp/nginx-pod.json 
E1122 19:08:31.584796 3158795 remote_runtime.go:193] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: 
runc create failed: expected cgroupsPath to be of format \"slice:prefix:name\" for systemd cgroups, got \"/k8s.io/e5a83c8255cf21db9fa18c1999cb571db2139e87ed0c592324e851117eefc9f6\" instead: unknown"

and

$ sudo crictl -r unix:///run/containerd/containerd.sock run /tmp/container.json /tmp/nginx-pod.json
E1122 19:12:17.887097 3159492 remote_runtime.go:193] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: 
runc create failed: expected cgroupsPath to be of format \"slice:prefix:name\" for systemd cgroups, got \"/k8s.io/7f31c4319bc73ca556da493fee2f7c28abef514e0103e7277f766556da9c0d8f\" instead: unknown"

Content of the files (copied from above links):

$ cat /tmp/container.json 
{
  "metadata": {
      "name": "busybox"
  },
  "image":{
      "image": "busybox"
  },
  "command": [
      "top"
  ],
  "log_path":"busybox.0.log",
  "linux": {
  }
}
$ cat /tmp/nginx-pod.json 
{
    "metadata": {
        "name": "nginx-sandbox",
        "namespace": "default",
        "attempt": 1,
        "uid": "hdishd83djaidwnduwk28bcsb"
    },
    "log_directory": "/tmp",
    "linux": {
    }
}

What you expected to happen:

The examples to work.

How to reproduce it (as minimally and precisely as possible):

Installed containerd version 1.6.12 through apt. crictl is v1.31.1 and v1.28.0.

The config.toml was generated using:

containerd config default | sed "s/SystemdCgroup *= *false/SystemdCgroup = true/" | sudo tee /etc/containerd/config.toml

This means containerd is configured with SystemdCgroup = true.
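To confirm that the `sed` substitution took effect, you can look for the setting in the generated file. The sketch below uses a hypothetical helper name, `systemd_cgroup_enabled`, and assumes the config path from this report. It only looks for an uncommented `SystemdCgroup = true` line and does not parse TOML.

```shell
#!/bin/sh
# Hypothetical helper (not part of containerd or crictl): succeeds if the
# given file contains an uncommented "SystemdCgroup = true" line.
systemd_cgroup_enabled() {
    grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$1"
}

# Example use against the path from this report; skipped if the file is not readable.
CONFIG="${CONFIG:-/etc/containerd/config.toml}"
if [ -r "$CONFIG" ]; then
    if systemd_cgroup_enabled "$CONFIG"; then
        echo "SystemdCgroup is enabled in $CONFIG"
    else
        echo "SystemdCgroup is not enabled in $CONFIG"
    fi
fi
```

containerd only reads the file at startup, so a restart is needed after changing it.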

Anything else we need to know?:

Cilium with kube-proxy is installed on the healthy Kubernetes control plane.

In case this is important:

sudo cat /var/lib/kubelet/config.yaml | grep cgroup
cgroupDriver: systemd

Environment:

  • Container runtime or hardware configuration:
    • containerd 1.6.12
    • crictl v1.31.1 and v1.28.0
    • kubelet (presumably not relevant): v1.29.6
  • OS (e.g: cat /etc/os-release): Debian GNU/Linux rodete
  • Kernel (e.g. uname -a): 6.9.10-1rodete5-amd64
  • Others:
@RonBarkan RonBarkan added kind/bug Categorizes issue or PR as related to a bug. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Nov 22, 2024
@kannon92
Contributor

Reading this, I don’t think this is a bug in crictl but rather in containerd. Your containerd version is pretty old, so I’d suggest asking the containerd project about this one.

@akhilerm
Contributor

Can you also provide the contents of the nginx-pod.json file? Are you setting the "cgroup_parent" field? The change to get the cgroup driver from the container runtime was added in crictl 1.29.0 (ref: #1302). In crictl 1.28.0, you have to pass the cgroup_parent value explicitly; otherwise it defaults to cgroupfs-style syntax.
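To make the two syntaxes concrete: the error in the report shows runc rejecting a cgroupfs-style path (`/k8s.io/<id>`) because, with the systemd driver, it expects `slice:prefix:name`. The sketch below is a rough shape check in shell. The function name is hypothetical, the sample values are illustrative, and the pattern is a simplification rather than runc's actual validation logic.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the argument consists of exactly
# three colon-separated fields, i.e. the "slice:prefix:name" shape.
is_systemd_cgroups_path() {
    case "$1" in
        *:*:*:*) return 1 ;;   # four or more fields
        *:*:*)   return 0 ;;   # exactly three fields
        *)       return 1 ;;   # fewer than three, e.g. a cgroupfs-style path
    esac
}

# Illustrative values only: one systemd-style path and one cgroupfs-style path.
for p in "system.slice:cri-containerd:abc123" "/k8s.io/abc123"; do
    if is_systemd_cgroups_path "$p"; then
        echo "$p: systemd-style"
    else
        echo "$p: not systemd-style"
    fi
done
```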

@RonBarkan
Author

RonBarkan commented Dec 2, 2024

@akhilerm @kannon92

I have updated the description to show the content of the json files. I've also corrected the 1st link to the correct runp example.

I have downloaded crictl version 1.31.1, which results in an identical error message. Looks like the doc shows the same examples at the time #1302 was merged (see here).

I was not setting cgroup_parent and could not find any information about how to set it. If you think it is needed for version 1.31.1, please let me know how to configure it.

@youwalther65

youwalther65 commented Jan 14, 2025

@akhilerm @kannon92 I got the same error using the runp example from the crictl GitHub repository here.
It would be helpful to update it with a working example for containerd using SystemdCgroup = true and kubelet using "cgroupDriver": "systemd".
I am running:

# containerd --version
containerd github.com/containerd/containerd 1.7.23 57f17b0a6295a39009d861b89e3b3b87b005ca27

@youwalther65

youwalther65 commented Jan 14, 2025

I got one step further by using the following pod json

# cat pod-config.json
{
    "metadata": {
        "name": "nginx-sandbox",
        "namespace": "default",
        "attempt": 1,
        "uid": "hdishd83djaidwnduwk28bcsb"
    },
    "log_directory": "/tmp",
    "linux": {
        "cgroup_parent": "system.slice"
    }
}

# crictl  runp pod-config.json
469d6af600fe7991432a7067b5f28db77bd7c15f45befd571b0273beb7df38c6

But this sandbox ID wasn't visible and couldn't be used for create:

# crictl pods --id 469d6af600fe7991432a7067b5f28db77bd7c15f45befd571b0273beb7df38c6
POD ID              CREATED             STATE               NAME                NAMESPACE           ATTEMPT             RUNTIME

# crictl create 69d6af600fe7991432a7067b5f28db77bd7c15f45befd571b0273beb7df38c6 ctr.json pod-config.json
E0114 09:50:18.822738 3887719 remote_runtime.go:319] "CreateContainer in sandbox from runtime service failed" err="rpc error: code = NotFound desc = failed to find sandbox id \"69d6af600fe7991432a7067b5f28db77bd7c15f45befd571b0273beb7df38c6\": not found" podSandboxID="69d6af600fe7991432a7067b5f28db77bd7c15f45befd571b0273beb7df38c6"
FATA[0000] creating container: rpc error: code = NotFound desc = failed to find sandbox id "69d6af600fe7991432a7067b5f28db77bd7c15f45befd571b0273beb7df38c6": not found
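Note that the ID passed to `crictl create` above begins with `69d6…`, while `runp` printed `469d6…`. It is not clear from the thread whether the leading character was dropped in the actual command or only in the paste, and it would not explain the empty `crictl pods` output. Capturing the ID in a variable rules out that class of mistake. The following is a minimal sketch; the wrapper name `run_pod_and_create` and the `CRICTL` override variable are hypothetical, and the file names are the ones used above.

```shell
#!/bin/sh
# CRICTL can be overridden; by default the crictl binary on PATH is used.
CRICTL="${CRICTL:-crictl}"

# Hypothetical wrapper: start a sandbox, then create a container in it using
# the exact ID that runp printed, so the ID is never retyped by hand.
run_pod_and_create() {
    pod_config="$1"
    ctr_config="$2"
    pod_id="$("$CRICTL" runp "$pod_config")" || return 1
    echo "sandbox: $pod_id" >&2
    "$CRICTL" create "$pod_id" "$ctr_config" "$pod_config"
}
```

For example, `run_pod_and_create pod-config.json ctr.json` prints the sandbox ID on stderr and passes through whatever `crictl create` writes to stdout.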

Sometimes I see the pod in the NotReady state for a few seconds, then it disappears.

This is independent of running OS Amazon Linux 2 (cgroupv1 based) or AL2023 (cgroupv2 based).
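When comparing hosts like these, it can help to record which cgroup version each one actually uses. A common check is the filesystem type mounted at `/sys/fs/cgroup`: `cgroup2fs` indicates the unified v2 hierarchy, while `tmpfs` usually indicates v1. The helper name below is hypothetical, and the sketch assumes GNU `stat`.

```shell
#!/bin/sh
# Hypothetical helper: map the filesystem type of /sys/fs/cgroup to a
# cgroup version label.
cgroup_version_from_fstype() {
    case "$1" in
        cgroup2fs) echo "v2" ;;
        tmpfs)     echo "v1" ;;
        *)         echo "unknown" ;;
    esac
}

# With GNU coreutils, "stat -fc %T" prints the filesystem type of a path.
if [ -d /sys/fs/cgroup ]; then
    cgroup_version_from_fstype "$(stat -fc %T /sys/fs/cgroup 2>/dev/null)"
fi
```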

@youwalther65

Using "cgroup_parent": "kubepods.slice" showed the same result.
