Skip to content

Commit

Permalink
rootfs: add mount destination validation
Browse files Browse the repository at this point in the history
Because the target of a mount is inside a container (which may be a
volume that is shared with another container), there exists a race
condition where the target of the mount may change to a path containing
a symlink after we have sanitised the path -- resulting in us
inadvertently mounting the path outside of the container.

This is not immediately useful because we are in a mount namespace with
MS_SLAVE mount propagation applied to "/", so we cannot mount on top of
host paths in the host namespace. However, if any subsequent mountpoints
in the configuration use a subdirectory of that host path as a source,
those subsequent mounts will use an attacker-controlled source path
(resolved within the host rootfs) -- allowing the bind-mounting of "/"
into the container.

While arguably configuration issues like this are not entirely within
runc's threat model, within the context of Kubernetes (and possibly
other container managers that provide semi-arbitrary container creation
privileges to untrusted users) this is a legitimate issue. Since we
cannot block mounting from the host into the container, we need to block
the first stage of this attack (mounting onto a path outside the
container).

The long-term plan to solve this would be to migrate to libpathrs, but
as a stop-gap we implement libpathrs-like path verification through
readlink(/proc/self/fd/$n) and then do mount operations through the
procfd once it's been verified to be inside the container. The target
could move after we've checked it, but if it is inside the container
then we can assume that it is safe for the same reason that libpathrs
operations would be safe.

A slight wrinkle is the "copyup" functionality we provide for tmpfs,
which is the only case where we want to do a mount on the host
filesystem. To facilitate this, I split out the copy-up functionality
entirely so that the logic isn't interspersed with the regular tmpfs
logic. In addition, all dependencies on m.Destination being overwritten
have been removed since that pattern was just begging to be a source of
more mount-target bugs (we do still have to modify m.Destination for
tmpfs-copyup but we only do it temporarily).

Fixes: CVE-2021-30465
Reported-by: Etienne Champetier <champetier.etienne@gmail.com>
Co-authored-by: Noah Meyerhans <nmeyerha@amazon.com>
Reviewed-by: Samuel Karp <skarp@amazon.com>
Reviewed-by: Kir Kolyshkin <kolyshkin@gmail.com> (@kolyshkin)
Reviewed-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
  • Loading branch information
cyphar and Noah Meyerhans committed May 19, 2021
1 parent c01a560 commit 0ca91f4
Show file tree
Hide file tree
Showing 4 changed files with 229 additions and 136 deletions.
25 changes: 16 additions & 9 deletions libcontainer/container_linux.go
Original file line number Diff line number Diff line change
Expand Up @@ -1217,7 +1217,6 @@ func (c *linuxContainer) makeCriuRestoreMountpoints(m *configs.Mount) error {
if err := checkProcMount(c.config.Rootfs, dest, ""); err != nil {
return err
}
m.Destination = dest
if err := os.MkdirAll(dest, 0755); err != nil {
return err
}
Expand Down Expand Up @@ -1257,13 +1256,16 @@ func (c *linuxContainer) prepareCriuRestoreMounts(mounts []*configs.Mount) error
umounts := []string{}
defer func() {
for _, u := range umounts {
if e := unix.Unmount(u, unix.MNT_DETACH); e != nil {
if e != unix.EINVAL {
// Ignore EINVAL as it means 'target is not a mount point.'
// It probably has already been unmounted.
logrus.Warnf("Error during cleanup unmounting of %q (%v)", u, e)
_ = utils.WithProcfd(c.config.Rootfs, u, func(procfd string) error {
if e := unix.Unmount(procfd, unix.MNT_DETACH); e != nil {
if e != unix.EINVAL {
// Ignore EINVAL as it means 'target is not a mount point.'
// It probably has already been unmounted.
logrus.Warnf("Error during cleanup unmounting of %s (%s): %v", procfd, u, e)
}
}
}
return nil
})
}
}()
for _, m := range mounts {
Expand All @@ -1281,8 +1283,13 @@ func (c *linuxContainer) prepareCriuRestoreMounts(mounts []*configs.Mount) error
// because during initial container creation mounts are
// set up in the order they are configured.
if m.Device == "bind" {
if err := unix.Mount(m.Source, m.Destination, "", unix.MS_BIND|unix.MS_REC, ""); err != nil {
return errorsf.Wrapf(err, "unable to bind mount %q to %q", m.Source, m.Destination)
if err := utils.WithProcfd(c.config.Rootfs, m.Destination, func(procfd string) error {
if err := unix.Mount(m.Source, procfd, "", unix.MS_BIND|unix.MS_REC, ""); err != nil {
return errorsf.Wrapf(err, "unable to bind mount %q to %q (through %q)", m.Source, m.Destination, procfd)
}
return nil
}); err != nil {
return err
}
umounts = append(umounts, m.Destination)
}
Expand Down
Loading

0 comments on commit 0ca91f4

Please # to comment.