This repository has been archived by the owner on Jun 28, 2024. It is now read-only.

clh: Failing on new k8s test "OOM events for pods" for cloud-hypervisor in kata 2.0 #2864

Closed
likebreath opened this issue Sep 19, 2020 · 3 comments · Fixed by #3574
Labels
area/clh bug Incorrect behaviour

Comments

@likebreath
Contributor

likebreath commented Sep 19, 2020

While fixing the clh CI regression for kata 2.0 (PR #2862), the new test "OOM events for pods" was observed to fail (as reported here):

22:21:16 1..1
22:21:48 not ok 1 Test OOM events for pods
22:21:48 # (in test file k8s-oom.bats, line 30)
22:21:48 #   `kubectl wait --for=condition=Ready pod "$pod_name"' failed
22:21:48 # INFO: k8s configured to use runtimeclass
22:21:48 # pod/pod-oom created
22:21:48 # error: timed out waiting for the condition on pods/pod-oom
22:21:48 # pod "pod-oom" deleted
22:21:48 Failed at 80: bats "${K8S_TEST_ENTRY}"
22:21:48 [reset] Reading configuration from the cluster...
22:21:48 [reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'

As this is a new test (added today in #2854), it is advisable to debug it in a separate PR, after bringing the CLH CI back online.

@likebreath likebreath added bug Incorrect behaviour needs-review Needs to be assessed by the team. labels Sep 19, 2020
likebreath added a commit to likebreath/kata-tests that referenced this issue Sep 19, 2020
While fixing the regression on CLH CI, a failure of the
'pod OOM' test was reported (kata-containers#2864). As this is a new k8s test, it is
advisable to debug it in a separate PR, after bringing CLH CI back
online.

Signed-off-by: Bo Chen <chen.bo@intel.com>
likebreath added a commit to likebreath/kata-tests that referenced this issue Sep 19, 2020
While fixing the regression on CLH CI, a failure of the
'pod OOM' test was reported (kata-containers#2864). As this is a new k8s test, it is
advisable to debug it in a separate PR, after bringing CLH CI back
online.

Depends-on: github.com/kata-containers/kata-containers#762

Fixes: kata-containers#2863

Signed-off-by: Bo Chen <chen.bo@intel.com>
@ariel-adam ariel-adam added area/clh and removed needs-review Needs to be assessed by the team. labels Jan 12, 2021
@fidencio
Member

fidencio commented May 6, 2021

I've seen this happening on Debian and CentOS as well.

fidencio added a commit to fidencio/kata-tests that referenced this issue May 6, 2021
The default timeout has been increased recently and adjusted for some
tests that were failing.

This is a trial-and-error kind of process and we'll keep adjusting such
timeouts according to the errors we see coming from the tests.

Fixes: kata-containers#2864

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
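
The kind of timeout adjustment this commit message describes can be sketched roughly as follows. This is only an illustration, not the actual change from the commit: the helper name `wait_for_pod_ready`, the variable `WAIT_TIMEOUT`, and the `120s` default are all hypothetical. Only `kubectl wait --for=condition=Ready` (seen in the failing log above) and its `--timeout` flag come from kubectl itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait for a pod to become Ready with a configurable timeout.
# WAIT_TIMEOUT and its 120s default are assumptions made for this sketch.
wait_for_pod_ready() {
    local pod_name="$1"
    local timeout="${WAIT_TIMEOUT:-120s}"

    # kubectl wait gives up after 30s by default, so pass a longer limit explicitly.
    kubectl wait --for=condition=Ready --timeout="${timeout}" pod "${pod_name}"
}
```

A test could then call `wait_for_pod_ready "$pod_name"`, and a slow CI job could raise the limit with something like `WAIT_TIMEOUT=300s` without touching the test file.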
@GabyCT
Contributor

GabyCT commented May 26, 2021

Closing this issue as it has been solved by https://github.com/kata-containers/tests/pull/3499/files

@GabyCT GabyCT closed this as completed May 26, 2021
fidencio added a commit to fidencio/kata-tests that referenced this issue May 26, 2021
Fixes: kata-containers#2864

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
@fidencio
Member

@GabyCT, sorry for re-opening this one, but I think we should only close this after re-enabling the test on clh:

    if [ "${KATA_HYPERVISOR:-}" == "cloud-hypervisor" ]; then
        sysctl_issue="https://github.com/kata-containers/tests/issues/2324"
        info "$KATA_HYPERVISOR sysctl is failing:"
        info "sysctls: ${sysctl_issue}"
        oom_issue="https://github.com/kata-containers/tests/issues/2864"
        info "$KATA_HYPERVISOR is failing on:"
        info "pod oom: ${oom_issue}"
    else

I've opened #3574 to check that.
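
For readers unfamiliar with this gating pattern, the excerpt amounts to skipping a known-failing test on one hypervisor and pointing at the tracking issue. A minimal sketch of that idea follows. The function `oom_test_enabled` is hypothetical and not part of the kata-containers tests repository; only `KATA_HYPERVISOR` and the issue URL are taken from the excerpt.

```shell
#!/usr/bin/env bash
oom_issue="https://github.com/kata-containers/tests/issues/2864"

# Hypothetical helper: returns 0 when the OOM test should run,
# and 1 (after printing the tracking issue) when it should be skipped.
oom_test_enabled() {
    if [ "${KATA_HYPERVISOR:-}" = "cloud-hypervisor" ]; then
        echo "skipping pod oom on ${KATA_HYPERVISOR}: ${oom_issue}"
        return 1
    fi
    return 0
}
```

Re-enabling the test on clh would then mean removing the `cloud-hypervisor` branch, so the function always returns 0.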

@fidencio fidencio reopened this May 26, 2021
fidencio added a commit to fidencio/kata-tests that referenced this issue May 26, 2021
Fixes: kata-containers#2864

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>