
Jobs that run privileged containers started failing #3673

Closed
1 of 7 tasks
orlangure opened this issue Jun 30, 2021 · 7 comments
Assignees: al-cheb
Labels: Area: Containers, investigate (collect additional information, like space on disk, other tool incompatibilities etc.), OS: Ubuntu

Comments

@orlangure

Description

In gnomock there is an automated test that runs a lightweight Kubernetes distribution (k3s) inside a Docker container. This test passed successfully 5 days ago, and started failing consistently after the latest GitHub virtual environments upgrade.

The error that occurs inside the container is:

151 conntrack.go:103] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
2021-06-30T11:22:53.9605151Z F0630 11:22:53.387138     151 server.go:495] open /proc/sys/net/netfilter/nf_conntrack_max: permission denied

but I'm not sure this is the root cause.

I have the same test running on CircleCI, and it continues to pass.

I have a few other jobs that set up Docker containers, and they still work. The difference is that the k3s job starts a privileged container.
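For anyone debugging this locally, a quick host-side check can tell whether the runner already satisfies the value kube-proxy tries to set. This is a sketch; the 131072 threshold is taken from the log above, and the procfs path is the standard location for the conntrack limit:

```shell
# Sketch: does the host's conntrack limit already meet the value
# kube-proxy tries to write (131072, from the error log above)?
want=131072
have=$(cat /proc/sys/net/netfilter/nf_conntrack_max 2>/dev/null || echo 0)
if [ "$have" -ge "$want" ]; then
  echo "ok: nf_conntrack_max=$have"
else
  echo "too low or unavailable: nf_conntrack_max=$have (< $want)"
fi
```

If the host value already meets the requirement, kube-proxy skips the write and the privileged container never hits the permission error.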

Virtual environments affected

- [ ] Ubuntu 16.04
- [ ] Ubuntu 18.04
- [x] Ubuntu 20.04
- [ ] macOS 10.15
- [ ] macOS 11
- [ ] Windows Server 2016
- [ ] Windows Server 2019

Image version and build link

Environment: ubuntu-20.04
Version: 20210628.1
Included Software: https://github.com/actions/virtual-environments/blob/ubuntu20/20210628.1/images/linux/Ubuntu2004-README.md
Image Release: https://github.com/actions/virtual-environments/releases/tag/ubuntu20%2F20210628.1

Failed build: https://github.com/orlangure/gnomock/runs/2951632183?check_suite_focus=true
Successful build (5 days ago): https://github.com/orlangure/gnomock/runs/2916092414?check_suite_focus=true

Is it a regression?

Yes, it worked on image version 20210614.1.

Expected behavior

No response

Actual behavior

No response

Repro steps

Run the [preset] k3s job from the Gnomock repository.

@dibir-magomedsaygitov dibir-magomedsaygitov added OS: Ubuntu Area: Containers investigate Collect additional information, like space on disk, other tool incompatibilities etc. and removed needs triage labels Jun 30, 2021
@dibir-magomedsaygitov
Contributor

Hello @orlangure. Thank you for your report. We will take a look.

@lukaszo

lukaszo commented Jun 30, 2021

We have the same error when using kind (Kubernetes in Docker).

@al-cheb
Contributor

al-cheb commented Jun 30, 2021

@orlangure, looks like the issue is not with the image (rancher/rancher#33300). Manually setting nf_conntrack_max=131072 does the trick:

```yaml
steps:
  - uses: actions/checkout@v2
    with:
      repository: 'orlangure/gnomock'
  - run: sudo sysctl -w net/netfilter/nf_conntrack_max=131072
  - name: Set up Go 1.16
    uses: actions/setup-go@v1
    with:
      go-version: 1.16
  - name: Get dependencies
    run: go get -v -t -d ./...
  - name: Test preset
    run: go test -race -cover -coverprofile=preset-cover.txt -coverpkg=./... -v ./preset/k3s/...
  - name: Test server
    run: go test -race -cover -coverprofile=server-cover.txt -coverpkg=./... -v ./internal/gnomockd -run TestK3s
```


@lukaszo, for kind see rancher/rancher#33360.

@orlangure
Author

> Manually setting nf_conntrack_max=131072 does the trick:

@al-cheb, interesting, but the image that I use for tests hasn't changed in a while (last updated 8 months ago), and the tests passed until now. The only change I noticed in the past few days was the GitHub Actions virtual environment upgrade.

From the linked issues it appears that the problem happens not only in GitHub Actions, so I assume it could be related to a kernel upgrade or some package that changed recently?

@al-cheb
Contributor

al-cheb commented Jun 30, 2021

> @al-cheb, interesting, but the image that I use for tests hasn't changed in a while (last updated 8 months ago), and the tests passed until now. The only change I noticed in the past few days was the GitHub Actions virtual environment upgrade.
>
> From the linked issues it appears that the problem happens not only in GitHub Actions, so I assume it could be related to a kernel upgrade or some package that changed recently?

Yep, that's right, the kernel was updated: https://github.com/actions/virtual-environments/releases/tag/ubuntu20%2F20210628.1
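Since the kernel bump looks like the trigger, one way to correlate passing and failing runs is to log the kernel on each runner. A minimal sketch; no specific kernel versions are assumed beyond what the release notes above state:

```shell
# Sketch: print the running kernel so passing and failing runs
# can be compared against the image release notes linked above.
kernel=$(uname -r)
echo "runner kernel: $kernel"
```

Adding this as an early `run:` step in the workflow makes the kernel version visible in the build logs of both the old and new images.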

@al-cheb
Contributor

al-cheb commented Jun 30, 2021

@orlangure, could you please update the image to the latest version, as described in https://k3d.io/faq/faq/#solved-nodes-fail-to-start-or-get-stuck-in-notready-state-with-log-nf_conntrack_max-permission-denied, to test the workaround?

@al-cheb al-cheb self-assigned this Jul 1, 2021
lukaszo added a commit to capactio/capact that referenced this issue Jul 1, 2021
It contains a fix for kubernetes-sigs/kind#2240.

We hit it when running GitHub Actions: actions/runner-images#3673
@orlangure
Author

Thanks @al-cheb, and sorry for the late response.
The issue appears to be gone with 1.19.12 (the only version I tried so far).

I'll prepare an update for my users to let them know that older k3s versions won't work in Gnomock.
