Investigate sporadic CI failures #126

Closed
Furisto opened this issue Jul 8, 2021 · 3 comments

@Furisto
Collaborator

Furisto commented Jul 8, 2021

CI fails sporadically in the kill command test: the call to the start command fails with " could not be started because it was Stopped". I have not yet been able to reproduce this locally.

This is the output of youki (with additional traces) when running the test successfully.

[INFO src/state.rs:16] 2021-07-08T12:12:31.874083870+02:00 CALLED STATE
[INFO src/delete.rs:26] 2021-07-08T12:12:31.876735623+02:00 CALLED DELETE
[INFO src/create.rs:34] 2021-07-08T12:12:31.880261860+02:00 CALLED CREATE
[INFO src/cgroups/common.rs:155] 2021-07-08T12:12:31.891048269+02:00 cgroup manager V1 will be used
[WARN src/capabilities.rs:27] 2021-07-08T12:12:31.916619716+02:00 CAP_CHECKPOINT_RESTORE doesn't support.
[WARN src/capabilities.rs:27] 2021-07-08T12:12:31.916699714+02:00 CAP_PERFMON doesn't support.
[WARN src/capabilities.rs:27] 2021-07-08T12:12:31.916721314+02:00 CAP_BPF doesn't support.
[INFO src/state.rs:16] 2021-07-08T12:12:31.984294516+02:00 CALLED STATE
[INFO src/kill.rs:20] 2021-07-08T12:12:31.994463136+02:00 CALLED KILL
[INFO src/state.rs:16] 2021-07-08T12:12:32.003495876+02:00 CALLED STATE
[INFO src/delete.rs:26] 2021-07-08T12:12:32.007922697+02:00 CALLED DELETE
[INFO src/cgroups/common.rs:155] 2021-07-08T12:12:32.020250779+02:00 cgroup manager V1 will be used
[INFO src/state.rs:16] 2021-07-08T12:12:32.051025233+02:00 CALLED STATE
[INFO src/create.rs:34] 2021-07-08T12:12:32.054227576+02:00 CALLED CREATE
[INFO src/cgroups/common.rs:155] 2021-07-08T12:12:32.063796607+02:00 cgroup manager V1 will be used
[WARN src/capabilities.rs:27] 2021-07-08T12:12:32.116028281+02:00 CAP_PERFMON doesn't support.
[WARN src/capabilities.rs:27] 2021-07-08T12:12:32.116231877+02:00 CAP_BPF doesn't support.
[WARN src/capabilities.rs:27] 2021-07-08T12:12:32.116306276+02:00 CAP_CHECKPOINT_RESTORE doesn't support.
[INFO src/start.rs:19] 2021-07-08T12:12:32.145331061+02:00 CALLED START
[INFO src/state.rs:16] 2021-07-08T12:12:32.148588804+02:00 CALLED STATE
[INFO src/kill.rs:20] 2021-07-08T12:12:32.151353455+02:00 CALLED KILL
[INFO src/state.rs:16] 2021-07-08T12:12:32.153879310+02:00 CALLED STATE
[INFO src/delete.rs:26] 2021-07-08T12:12:32.156782258+02:00 CALLED DELETE
[INFO src/cgroups/common.rs:155] 2021-07-08T12:12:32.165717500+02:00 cgroup manager V1 will be used
[INFO src/state.rs:16] 2021-07-08T12:12:32.190686057+02:00 CALLED STATE
[INFO src/create.rs:34] 2021-07-08T12:12:32.193853601+02:00 CALLED CREATE
[INFO src/cgroups/common.rs:155] 2021-07-08T12:12:32.203325533+02:00 cgroup manager V1 will be used
[WARN src/capabilities.rs:27] 2021-07-08T12:12:32.251435881+02:00 CAP_BPF doesn't support.
[WARN src/capabilities.rs:27] 2021-07-08T12:12:32.251517179+02:00 CAP_CHECKPOINT_RESTORE doesn't support.
[WARN src/capabilities.rs:27] 2021-07-08T12:12:32.251542679+02:00 CAP_PERFMON doesn't support.
[INFO src/start.rs:19] 2021-07-08T12:12:32.277765314+02:00 CALLED START
[INFO src/state.rs:16] 2021-07-08T12:12:32.280738961+02:00 CALLED STATE
[INFO src/kill.rs:20] 2021-07-08T12:12:32.283505912+02:00 CALLED KILL
[INFO src/state.rs:16] 2021-07-08T12:12:32.286962351+02:00 CALLED STATE
[INFO src/delete.rs:26] 2021-07-08T12:12:32.289950498+02:00 CALLED DELETE
[INFO src/cgroups/common.rs:155] 2021-07-08T12:12:32.299679825+02:00 cgroup manager V1 will be used
[INFO src/state.rs:16] 2021-07-08T12:12:32.347732274+02:00 CALLED STATE

This is where we check the state:

pub fn refresh_status(&mut self) -> Result<Self> {
    let new_status = match self.pid() {
        Some(pid) => {
            // Note that Process::new does not spawn a new process
            // but instead creates a new Process structure, and fill
            // it with information about the process with given pid
            if let Ok(proc) = Process::new(pid.as_raw()) {
                use procfs::process::ProcState;
                match proc.stat.state().unwrap() {
                    ProcState::Zombie | ProcState::Dead => ContainerStatus::Stopped,
                    _ => match self.status() {
                        ContainerStatus::Creating | ContainerStatus::Created => self.status(),
                        _ => ContainerStatus::Running,
                    },
                }
            } else {
                ContainerStatus::Stopped
            }
        }
        None => ContainerStatus::Stopped,
    };
    Ok(self.update_status(new_status))
}

It looks as if the container process no longer exists by the time start is called. The question is why we see this behavior only in the kill tests.
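To make the paths to Stopped easier to see, here is a minimal sketch of the same decision logic as a pure function, without the procfs dependency. The names `ProcLookup`, `next_status`, and the reduced `ProcState` and `ContainerStatus` enums are hypothetical stand-ins for illustration, not youki's actual types.

```rust
// Hypothetical, simplified stand-ins for the types used in refresh_status.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ContainerStatus {
    Creating,
    Created,
    Running,
    Stopped,
}

// Reduced set of process states; the real procfs enum has more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProcState {
    Running,
    Sleeping,
    Zombie,
    Dead,
}

// Outcome of looking up the container's stored pid (assumed model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProcLookup {
    // The container state has no pid recorded.
    NoPid,
    // The lookup failed, e.g. the process has already been reaped.
    NotFound,
    // The process exists and is in the given state.
    Found(ProcState),
}

// Mirrors the match in refresh_status: three different observations
// all collapse into Stopped, regardless of the current status.
pub fn next_status(current: ContainerStatus, lookup: ProcLookup) -> ContainerStatus {
    match lookup {
        ProcLookup::NoPid | ProcLookup::NotFound => ContainerStatus::Stopped,
        ProcLookup::Found(ProcState::Zombie) | ProcLookup::Found(ProcState::Dead) => {
            ContainerStatus::Stopped
        }
        // A live process keeps Creating/Created; anything else becomes Running.
        ProcLookup::Found(_) => match current {
            ContainerStatus::Creating | ContainerStatus::Created => current,
            _ => ContainerStatus::Running,
        },
    }
}
```

Under this model, a container in Created only reports Stopped if the pid is missing, the lookup fails, or the process is a zombie or dead, so the start error would mean one of those three things happened between create and start.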

@utam0k
Member

utam0k commented Aug 9, 2021

It doesn't happen much anymore...

@utam0k
Member

utam0k commented Aug 9, 2021

How about we wait until #56 is implemented?

@utam0k
Member

utam0k commented Aug 31, 2021

I haven't encountered it at all lately, so I'll close this for now.

@utam0k utam0k closed this as completed Aug 31, 2021