[Release-1.29] - CNI bin dir changes with K3s version #10930

Closed
brandond opened this issue Sep 23, 2024 · 1 comment

@brandond

Backport fix for CNI bin dir changes with K3s version

@aganesh-suse

Validated on release-1.29 branch with commit 1aa204b

Environment Details

Infrastructure

  • Cloud
  • Hosted

Node(s) CPU architecture, OS, and Version:

$ cat /etc/os-release
PRETTY_NAME="Ubuntu 24.04 LTS"

$ uname -m
x86_64

Cluster Configuration:

HA: 3 servers / 1 agent

Config.yaml:

token: xxxx
cluster-init: true
write-kubeconfig-mode: "0644"
node-external-ip: 1.1.1.1
node-label:
- k3s-upgrade=server

Manifest for reproducing the issue (multus_whereabouts_repro.yaml):

apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: multus
  namespace: kube-system
spec:
  repo: https://rke2-charts.rancher.io
  chart: rke2-multus
  targetNamespace: kube-system
  valuesContent: |-
    manifests:
      configMap:
        true
    config:
      fullnameOverride: multus
      cni_conf:
        confDir: /var/lib/rancher/k3s/agent/etc/cni/net.d
        binDir: /var/lib/rancher/k3s/data/current/bin
        kubeconfig: /var/lib/rancher/k3s/agent/etc/cni/net.d/multus.d/multus.kubeconfig
    rke2-whereabouts:
      fullnameOverride: whereabouts
      enabled: true
      cniConf:
        confDir: /var/lib/rancher/k3s/agent/etc/cni/net.d
        binDir: /var/lib/rancher/k3s/data/current/bin

Manifest for validating the fix (multus_whereabouts_verify.yaml):

apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: multus
  namespace: kube-system
spec:
  repo: https://rke2-charts.rancher.io
  chart: rke2-multus
  targetNamespace: kube-system
  valuesContent: |-
    manifests:
      configMap:
        true
    config:
      fullnameOverride: multus
      cni_conf:
        confDir: /var/lib/rancher/k3s/agent/etc/cni/net.d
        binDir: /var/lib/rancher/k3s/data/cni
        kubeconfig: /var/lib/rancher/k3s/agent/etc/cni/net.d/multus.d/multus.kubeconfig
    rke2-whereabouts:
      fullnameOverride: whereabouts
      enabled: true
      cniConf:
        confDir: /var/lib/rancher/k3s/agent/etc/cni/net.d
        binDir: /var/lib/rancher/k3s/data/cni

Testing Steps for issue reproduction:

  1. Copy config.yaml:
$ sudo mkdir -p /etc/rancher/k3s && sudo cp config.yaml /etc/rancher/k3s
  2. Install k3s:
$ curl -sfL https://get.k3s.io | sudo INSTALL_K3S_VERSION='v1.29.8+k3s1' sh -s - server
  3. Apply the manifest, then verify that the multus and whereabouts pods come up and that the binaries are in the locations configured in the YAML:
$ kubectl apply -f multus_whereabouts_repro.yaml
  4. Upgrade to the latest version:
$ curl -sfL https://get.k3s.io | sudo INSTALL_K3S_VERSION='v1.29.9+k3s1' sh -s - server
  5. Check whether the multus and whereabouts binaries still work:
$ /var/lib/rancher/k3s/data/current/bin/multus
$ /var/lib/rancher/k3s/data/current/bin/whereabouts

Testing Steps for validation of fix:

  1. Copy config.yaml:
$ sudo mkdir -p /etc/rancher/k3s && sudo cp config.yaml /etc/rancher/k3s
  2. Install k3s from the earlier commit:
$ curl -sfL https://get.k3s.io | sudo INSTALL_K3S_COMMIT='9510ac25fefb82703790c4c4a645eac8af62a551' sh -s - server
  3. Apply the manifest, then verify that the multus and whereabouts pods come up and that the binaries are in the locations configured in the YAML:
$ kubectl apply -f multus_whereabouts_verify.yaml
  4. Upgrade to the latest commit:
$ curl -sfL https://get.k3s.io | sudo INSTALL_K3S_COMMIT='1aa204be5bd1c5a205aa7f369664a3d99b3b6beb' sh -s - server
  5. Check whether the multus and whereabouts binaries still work:
$ /var/lib/rancher/k3s/data/cni/multus
$ /var/lib/rancher/k3s/data/cni/whereabouts
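
The final check in both procedures could be scripted roughly as follows. This is only a sketch: `check_cni_plugins` is a hypothetical helper name, not part of k3s, and it only tests that each plugin is an executable file in the given directory, not that the plugin runs correctly.

```shell
#!/bin/sh
# Hypothetical helper (not part of k3s): report whether each named CNI
# plugin is an executable regular file in the given bin dir.
# Returns non-zero if any plugin is missing.
check_cni_plugins() {
    bin_dir=$1
    shift
    missing=0
    for plugin in "$@"; do
        if [ -x "$bin_dir/$plugin" ] && [ ! -d "$bin_dir/$plugin" ]; then
            echo "ok: $bin_dir/$plugin"
        else
            echo "missing: $bin_dir/$plugin"
            missing=1
        fi
    done
    return "$missing"
}

# Example, using the paths from the steps above:
#   check_cni_plugins /var/lib/rancher/k3s/data/cni multus whereabouts
#   check_cni_plugins /var/lib/rancher/k3s/data/current/bin multus whereabouts
```

Running it before and after the upgrade against the same directory gives a quick pass/fail signal for each setup.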

Replication Results:

  • k3s version used for replication:
$ k3s -v
k3s version v1.29.9+k3s1 (e92d3b3b)
go version go1.22.6

Pre-upgrade:

$ /var/lib/rancher/k3s/data/current/bin/multus 
meta-plugin that delegates to other CNI plugins
CNI protocol versions supported: 0.1.0, 0.2.0, 0.3.0, 0.3.1, 0.4.0, 1.0.0, 1.1.0

$ /var/lib/rancher/k3s/data/current/bin/whereabouts
whereabouts v0.8.0-8c381170 linux/amd64
CNI protocol versions supported: 0.1.0, 0.2.0, 0.3.0, 0.3.1, 0.4.0, 1.0.0, 1.1.0

Post-upgrade:

$ /var/lib/rancher/k3s/data/current/bin/multus 
bash: line 1: /var/lib/rancher/k3s/data/current/bin/multus: No such file or directory

$ /var/lib/rancher/k3s/data/current/bin/whereabouts
bash: line 1: /var/lib/rancher/k3s/data/current/bin/whereabouts: No such file or directory
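
A likely reading of this result, consistent with the listing of /var/lib/rancher/k3s/data further down, is that `current` is a symlink that the upgrade re-points to a new per-version directory, so binaries the charts placed under the old `current/bin` are no longer reachable through that path. A minimal sketch for checking where a bin dir physically resolves (the helper name `resolve_dir` is an assumption for illustration):

```shell
#!/bin/sh
# Hypothetical helper: print the physical directory that a path resolves to,
# following any symlinks along the way. Prints nothing and fails if the
# directory does not exist.
resolve_dir() {
    (cd -P "$1" 2>/dev/null && pwd -P)
}

# Example: run before and after an upgrade and compare the output.
#   resolve_dir /var/lib/rancher/k3s/data/current/bin
```

If the output differs between the two runs, anything installed into the earlier target by a third-party chart would need to be reinstalled, which is what the fixed `data/cni` directory avoids.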

Validation Results:

  • k3s version used for validation:
$ k3s -v
k3s version v1.29.9+k3s-1aa204be (1aa204be)
go version go1.22.6

Pre-upgrade:

$ /var/lib/rancher/k3s/data/cni/multus 
meta-plugin that delegates to other CNI plugins
CNI protocol versions supported: 0.1.0, 0.2.0, 0.3.0, 0.3.1, 0.4.0, 1.0.0, 1.1.0

$ /var/lib/rancher/k3s/data/cni/whereabouts
whereabouts v0.8.0-8c381170 linux/amd64
CNI protocol versions supported: 0.1.0, 0.2.0, 0.3.0, 0.3.1, 0.4.0, 1.0.0, 1.1.0

Post-upgrade:

$ /var/lib/rancher/k3s/data/cni/multus 
meta-plugin that delegates to other CNI plugins
CNI protocol versions supported: 0.1.0, 0.2.0, 0.3.0, 0.3.1, 0.4.0, 1.0.0, 1.1.0

$ /var/lib/rancher/k3s/data/cni/whereabouts
whereabouts v0.8.0-8c381170 linux/amd64
CNI protocol versions supported: 0.1.0, 0.2.0, 0.3.0, 0.3.1, 0.4.0, 1.0.0, 1.1.0

Other verifications:
The containerd config.toml contains the CNI conf and bin directory settings:

$ sudo cat /var/lib/rancher/k3s/agent/etc/containerd/config.toml | grep cni
[plugins."io.containerd.grpc.v1.cri".cni]
  bin_dir = "/var/lib/rancher/k3s/data/cni"
  conf_dir = "/var/lib/rancher/k3s/agent/etc/cni/net.d"
$ sudo ls -la /var/lib/rancher/k3s/data/ 
total 28
drwxr-xr-x 5 root root 4096 Oct 22 22:10 .
drwxr-xr-x 5 root root 4096 Oct 22 22:00 ..
-rw------- 1 root root    0 Oct 22 22:00 .lock
drwxr-xr-x 4 root root 4096 Oct 22 22:00 30d1829837750862f1fa6ab8ef1706f795c84a43feeb01c7f45fa567b265f848
drwxr-xr-x 4 root root 4096 Oct 22 22:10 4183604786a8622835b3cb6acf31a8d840a4178e5914b37c80882195ff8efedc
drwxr-xr-x 2 root root 4096 Oct 22 22:10 cni
lrwxrwxrwx 1 root root   90 Oct 22 22:10 current -> /var/lib/rancher/k3s/data/4183604786a8622835b3cb6acf31a8d840a4178e5914b37c80882195ff8efedc
lrwxrwxrwx 1 root root   90 Oct 22 22:00 previous -> /var/lib/rancher/k3s/data/30d1829837750862f1fa6ab8ef1706f795c84a43feeb01c7f45fa567b265f848
$ sudo ls -la /var/lib/rancher/k3s/data/cni 
total 159908
drwxr-xr-x 2 root root     4096 Oct 22 22:10 .
drwxr-xr-x 5 root root     4096 Oct 22 22:10 ..
-rwxr-xr-x 1 root root  5044864 Oct 22 22:08 bandwidth
-rwxr-xr-x 1 root root  5480992 Oct 22 22:08 bridge
lrwxrwxrwx 1 root root       98 Oct 22 22:10 cni -> /var/lib/rancher/k3s/data/4183604786a8622835b3cb6acf31a8d840a4178e5914b37c80882195ff8efedc/bin/cni
-rwxr-xr-x 1 root root 10813312 Oct 22 22:08 dhcp
-rwxr-xr-x 1 root root  5177248 Oct 22 22:08 dummy
-rwxr-xr-x 1 root root  5509312 Oct 22 22:08 firewall
lrwxrwxrwx 1 root root       98 Oct 22 22:10 flannel -> /var/lib/rancher/k3s/data/4183604786a8622835b3cb6acf31a8d840a4178e5914b37c80882195ff8efedc/bin/cni
-rwxr-xr-x 1 root root  5120000 Oct 22 22:08 host-device
-rwxr-xr-x 1 root root  4614752 Oct 22 22:08 host-local
-rwxr-xr-x 1 root root  5185440 Oct 22 22:08 ipvlan
-rwxr-xr-x 1 root root  2736912 Oct 22 22:08 loopback
-rwxr-xr-x 1 root root  5210048 Oct 22 22:08 macvlan
-rwxr-xr-x 1 root root 39183128 Oct 22 22:08 multus
-rwxr-xr-x 1 root root  5078848 Oct 22 22:08 portmap
-rwxr-xr-x 1 root root  5329472 Oct 22 22:08 ptp
-rwxr-xr-x 1 root root  2893264 Oct 22 22:08 sbr
-rwxr-xr-x 1 root root  2428240 Oct 22 22:08 static
-rwxr-xr-x 1 root root  5232128 Oct 22 22:08 tap
-rwxr-xr-x 1 root root  2798512 Oct 22 22:08 tuning
-rwxr-xr-x 1 root root  5185440 Oct 22 22:08 vlan
-rwxr-xr-x 1 root root  3001104 Oct 22 22:08 vrf
-rwxr-xr-x 1 root root 37671392 Oct 22 22:08 whereabouts
$ sudo ls -lrt /var/lib/rancher/k3s/agent/etc/cni/net.d 
total 16
drwxr-xr-x 2 root root 4096 Oct 22 22:08 whereabouts.d
drwxr-xr-x 2 root root 4096 Oct 22 22:08 multus.d
-rw------- 1 root root  623 Oct 22 22:08 00-multus.conflist
-rw-r--r-- 1 root root  406 Oct 22 22:10 10-flannel.conflist
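
The config.toml check above could be turned into a small function that extracts the configured CNI bin dir, so it can be compared with the `binDir` value in the HelmChart. This is a sketch under the assumption that the file defines `bin_dir` only once, in the CNI section shown above; `cni_bin_dir` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first bin_dir key in a
# containerd config file. Assumes the value is a double-quoted string on
# a single line, as in the grep output above.
cni_bin_dir() {
    sed -n 's/^[[:space:]]*bin_dir[[:space:]]*=[[:space:]]*"\(.*\)"[[:space:]]*$/\1/p' "$1" | head -n 1
}

# Example (reading the file needs root on the test nodes):
#   sudo sh -c '. ./cni_bin_dir.sh; cni_bin_dir /var/lib/rancher/k3s/agent/etc/containerd/config.toml'
```

The example assumes the function is saved in a file named cni_bin_dir.sh, which is also an illustrative name.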

Verification of the fatal "symlink ... file exists" error for the CNI symlink:

To reproduce: Upgrade from commit 74ce150 to commit 9510ac2
To validate: Upgrade from commit 74ce150 to commit 1aa204b

Test steps for this: #10869 (comment)

On a repro setup, you will see logs:

$ journalctl -xeu k3s | grep 'fatal' 
Oct 23 16:23:53 ip-172-31-11-9 k3s[86376]: time="2024-10-23T16:23:53Z" level=fatal msg="extracting data: symlink /var/lib/rancher/k3s/data/30d1829837750862f1fa6ab8ef1706f795c84a43feeb01c7f45fa567b265f848/bin/cni /var/lib/rancher/k3s/data/cni/cni: file exists"

On the validated setup, there are no fatal errors on any server or agent:

$ journalctl -xeu k3s | grep 'fatal' 
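
When repeating this check across several servers and agents, counting matches can be easier to compare than reading empty grep output. A minimal sketch, assuming the `level=fatal` field format seen in the repro log above (`count_fatal` is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: count log lines on stdin that carry level=fatal.
# grep -c prints 0 but exits with status 1 when nothing matches, so the
# exit status is discarded and only the printed count is kept.
count_fatal() {
    grep -c 'level=fatal' || true
}

# Example, using the same journalctl invocation as above:
#   journalctl -xeu k3s | count_fatal
```

A count of 0 on every node corresponds to the validated result; any non-zero count points back at the node to inspect.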

@github-project-automation github-project-automation bot moved this from To Test to Done Issue in K3s Development Oct 23, 2024