Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

fix: may kill other process when container has been stopped #1934

Merged
merged 1 commit into from
Nov 21, 2018

Conversation

lifubang
Copy link
Member

Signed-off-by: Lifubang lifubang@acmcoder.com

If a container A has been stopped, and after a long time, the host start a new process B whose process id is just equal to container A's process id.
When runc kill A 9 again, process B will be killed.
So, we need to check container's status when we want to kill it.

@cyphar
Copy link
Member

cyphar commented Nov 18, 2018

Yeah, that's obviously a bug. But I think a better solution would be to compare against InitProcessStartTime which should be different when there is a PID reuse attack (this is why we use it for other operations). I agree that doing this other check is probably also reasonable, but I think that using InitProcessStartTime would be more consistent.

As an aside, there was a kernel patch sent recently which would help avoid this problem entirely in a race-free way by holding open a file descriptor to /proc/$pid and then using ioctl(fd, PROC_FD_SIGNAL, SIGKILL).

@lifubang
Copy link
Member Author

As an aside, there was a kernel patch sent recently which would help avoid this problem entirely in a race-free way by holding open a file descriptor to /proc/$pid and then using ioctl(fd, PROC_FD_SIGNAL, SIGKILL).

It's a good news for taking us more security kernel patch. Thanks for all of you.

Yeah, that's obviously a bug. But I think a better solution would be to compare against InitProcessStartTime which should be different when there is a PID reuse attack (this is why we use it for other operations). I agree that doing this other check is probably also reasonable, but I think that using InitProcessStartTime would be more consistent.

In

runc/kill.go

Line 45 in 322760b

containerStatus, err := container.Status()

containerStatus, err := container.Status() contains ProcessStartTime compare.
Shall we need to change the code to just only use ProcessStartTime compare?

@lifubang
Copy link
Member Author

lifubang commented Nov 19, 2018

I refactor the code refer to

func (c *linuxContainer) Set(config configs.Config) error {
c.m.Lock()
defer c.m.Unlock()
status, err := c.currentStatus()
if err != nil {
return err
}
if status == Stopped {
return newGenericError(fmt.Errorf("container not running"), ContainerNotRunning)
}
.
And add a lock. I think we need a lock here.

@lifubang lifubang force-pushed the kill branch 3 times, most recently from 8f30cb5 to 26cf626 Compare November 19, 2018 07:12
@cyphar
Copy link
Member

cyphar commented Nov 19, 2018

This is what I was proposing.

diff --git a/libcontainer/container_linux.go b/libcontainer/container_linux.go
index 7d2c684484bc..bd4c2bc9fd16 100644
--- a/libcontainer/container_linux.go
+++ b/libcontainer/container_linux.go
@@ -371,6 +371,14 @@ func (c *linuxContainer) start(process *Process) error {
 }
 
 func (c *linuxContainer) Signal(s os.Signal, all bool) error {
+       // Make sure that the init process hasn't been recycled in the meantime.
+       initStartTime, err := c.initProcess.startTime()
+       if err != nil {
+               return newSystemErrorWithCause(err, "fetching init process start time")
+       }
+       if initStartTime != c.initProcessStartTime {
+               return fmt.Errorf("init process PID has been recycled")
+       }
        if all {
                return signalAllProcesses(c.cgroupManager, s)
        }

This should be enough to fix the issue (though maybe doing a stopped check is also reasonable).

@lifubang lifubang force-pushed the kill branch 3 times, most recently from 072f8b1 to 2ac93b0 Compare November 19, 2018 07:39
@lifubang
Copy link
Member Author

Thank you for your review. I remove the lock now.

@cyphar
Copy link
Member

cyphar commented Nov 21, 2018

The current patch is missing the hunk I suggested. I can push a patch for it if you like.

@lifubang
Copy link
Member Author

lifubang commented Nov 21, 2018

This should be enough to fix the issue (though maybe doing a stopped check is also reasonable).

@cyphar I still think we should doing a stopped check.
For your suggestion, it's just for check whether the pid is resue or not, if reused, it works fine, but if not reused when you check, it will use unix.Kill, if at this time, the pid is reuse?(Of course, a small probability)

If use a stopped check, we can reduce a system call and also reduce this error if pid is not reused.

root@dockerdemo:/opt/runctest/redis# ~/gocode/src/github.com/opencontainers/runc/runc kill redis 9
container_linux.go:386: signaling init process caused "os: process already finished"

@cyphar
Copy link
Member

cyphar commented Nov 21, 2018

Sorry, let me clarify. I think we should do both ("saving a syscall" doesn't really make much sense IMHO). But yeah, while testing I also noticed that you never hit the newSystemErrorWithCause(err, "fetching init process start time") case. I will look into that.

@cyphar
Copy link
Member

cyphar commented Nov 21, 2018

Never mind, currentStatus does actually do the initProcessStartTime check inside runType because currentStatus actually refreshes the container state internally. So this patch is fine and does the check I wanted.

@cyphar cyphar requested a review from crosbymichael November 21, 2018 01:35
@cyphar cyphar added this to the 1.0.0 milestone Nov 21, 2018
@cyphar
Copy link
Member

cyphar commented Nov 21, 2018

I've added this to 1.0 because it's actually a spec violation.

@lifubang lifubang force-pushed the kill branch 2 times, most recently from 9464763 to 87a1889 Compare November 21, 2018 09:44
Signed-off-by: Lifubang <lifubang@acmcoder.com>
@lifubang
Copy link
Member Author

@cyphar I update the code, because if the container use hosts pid namespace, we should let runc kill -a run success.

@lifubang
Copy link
Member Author

You can see:

func destroy(c *linuxContainer) error {
if !c.config.Namespaces.Contains(configs.NEWPID) {
if err := signalAllProcesses(c.cgroupManager, unix.SIGKILL); err != nil {
logrus.Warn(err)
}
}
err := c.cgroupManager.Destroy()

@cyphar
Copy link
Member

cyphar commented Nov 21, 2018

LGTM.

Approved with PullApprove

@crosbymichael
Copy link
Member

crosbymichael commented Nov 21, 2018

LGTM

Approved with PullApprove

@crosbymichael crosbymichael merged commit 50e2634 into opencontainers:master Nov 21, 2018
@lifubang lifubang deleted the kill branch November 26, 2018 02:03
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants