Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Manually resolves links relatively to root_dir, and prevent escape #328

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

HorlogeSkynet
Copy link
Member

Dear Sebastian, dear maintainers,

This PR drafts a first implementation to address concerns raised in #311 review.

Currently, paths are not manually resolved, and by following symbolic links, this may lead to outside specified root_dir.

By using os.path.realpath and os.path.commonprefix, we can verify that important paths are still prefixed by root_dir, once they have been resolved.


It turned out enforcing those checks was not as easy as I thought :

  • some paths are provided on purpose as parameters, others automatically derived
  • some paths are known during instantiation, others will be "later"
  • I'm not sure PermissionError actually fits, but I was out of option
  • I'm not sure the error message itself is relevant either
  • New __prevent_root_dir_escape function might require a renaming too

At least, the new functional test actually enlightens the issue.

--> A deep review is required, ideas are welcome and commits on this branch too 🙃


Have a nice WE all 🙇

Copy link
Contributor

@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @HorlogeSkynet , maybe I'm misunderstanding something, but right now I only see the part that prevents "unjust" escapes from the chroot (from absolute links or from going above the root) but I don't see any code that would "bend" "legit" absolute symlinks back into the chroot.

Let me demo what I think we'd need in some form:

def resolve_chroot_symlink(link_location, root_dir):
    assert os.path.commonprefix([link_location, root_dir]) == root_dir

    while True:
        try:
            resolved = os.readlink(link_location)
        except FileNotFoundError:  # includes case "not a symlink"
            return link_location

        if resolved.startswith('/'):
            # i.e. absolute path (with regard to the chroot)
            # that we need to move back inside the chroot
            resolved = os.path.join(root_dir, resolved.lstrip('/'))
        else:
            # i.e. relative path that we make absolute
            resolved = os.path.join(os.path.dirname(link_location), resolved)

        if os.path.commonprefix([resolved, root_dir]) != root_dir:
            raise Value('ESCAPE DETECTED')  # TODO

        link_location = resolved

What do you think?

@HorlogeSkynet HorlogeSkynet force-pushed the security/prevent_rootfs_escape branch from 5ab542e to b9004e8 Compare March 5, 2022 11:23
@HorlogeSkynet
Copy link
Member Author

Dear Sebastian, sorry for the delay.
I got your point and understood what you were trying to achieve : I did miss the part about resolving symbolic links relatively to root_dir.

So I tweaked your function a bit in order to :

  • ease any future Windows integration task (see Support for Windows and Mac OS. #177)
  • detect symbolic links loop (as os.path.realpath does)
  • fix os.readlink usage on regular files (it actually raises OSError: "Invalid argument")
  • fix os.path.commonprefix usage by removing redundant "up-level references"

You will see that I opted for links resolution on the verge of calling open. This definitely simplifies the constructor.

I've implemented three short test cases to check our new logic (note : relative symbolic links resolving is already tested by regular rootfs under tests/resources/distros).

I also took the liberty to update our .gitignore file, as my working directory used not to be clean since #315.

Bye ! Thanks for your time 👋

@HorlogeSkynet HorlogeSkynet marked this pull request as ready for review March 5, 2022 11:35
@HorlogeSkynet HorlogeSkynet changed the title Manually resolves paths derived from root_dir to prevent rootfs escape Manually resolves links relatively to root_dir, and prevent escape Mar 5, 2022
Copy link
Contributor

@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @HorlogeSkynet kudos for the new tests and the loop detection! 👍 I believe a few details need more work, please see my comments below

@HorlogeSkynet HorlogeSkynet force-pushed the security/prevent_rootfs_escape branch from b9004e8 to c97754d Compare March 6, 2022 10:36
@HorlogeSkynet
Copy link
Member Author

HorlogeSkynet commented Mar 6, 2022

Hi @HorlogeSkynet kudos for the new tests and the loop detection! +1 I believe a few details need more work, please see my comments below

🙏
I've answered/resolved your comments Sebastian.
Although, I'm concerned about the CI failure, as tests pass "on my machine".
I think this is due to an issue related to GitHub Actions and symbolic links resolving to directories (I might be wrong).

My mistake ! There was an empty directory that Git didn't track, I've just fixed this 😉

Waiting for your inputs, thanks for your time 👋


EDIT : If you are tired of these round trips, feel free to directly amend the branch !

@HorlogeSkynet HorlogeSkynet force-pushed the security/prevent_rootfs_escape branch from c97754d to 5186ad6 Compare March 6, 2022 11:33
@HorlogeSkynet
Copy link
Member Author

PR's (finally) ready @hartwork !

@HorlogeSkynet HorlogeSkynet dismissed hartwork’s stale review June 13, 2022 19:47

Branch reworked

@HorlogeSkynet HorlogeSkynet requested a review from a team June 26, 2022 10:36
Copy link
Contributor

@hartwork hartwork left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @HorlogeSkynet, sorry for the big delay. I found some remaining issues. Please see my four comments below for details:

@HorlogeSkynet HorlogeSkynet force-pushed the security/prevent_rootfs_escape branch 3 times, most recently from a1af721 to 91de73f Compare July 30, 2022 16:12
@HorlogeSkynet HorlogeSkynet force-pushed the security/prevent_rootfs_escape branch from 5d9b5ba to e40187b Compare October 10, 2022 15:38
@HorlogeSkynet HorlogeSkynet marked this pull request as draft March 12, 2023 10:35
@HorlogeSkynet HorlogeSkynet force-pushed the security/prevent_rootfs_escape branch from 7dc0126 to a46f5a7 Compare April 20, 2024 08:49
@HorlogeSkynet HorlogeSkynet force-pushed the security/prevent_rootfs_escape branch 2 times, most recently from ddb9177 to 9911c30 Compare April 20, 2024 15:33
@HorlogeSkynet
Copy link
Member Author

Dear @python-distro/maintainers, Dear @hartwork (if you haven't muted notifications from this repository 🙃),

I've finally taken the time to rework this feature (fix ?).

What has been done :

  • os.path.commonpath has been preferred over commonprefix (first relative path normalization has been added to prevent standard library exception ;
  • 100% (99,99%) code coverage (for newly added code) ;
  • os.path.realpath have been removed from iteration to prevent OS links resolution, but there is one left for final path to detect some escape cases ;
  • Three new tests (two for os_release_file as well as another one for above chroot escape case) ;
  • root_dir is resolved, mainly to deal with .. (as you correctly pointed out Sebastian) ;
  • link check before resolution instead of try..except (indeed simpler) ;
  • OSError has been replaced by FileNotFoundError (symlinks loop corresponds to a missing file, at least from the caller PoV) ;
  • a "normalization" for relative paths has been added, to strip ../ that would legitimately resolve to chroot in a real-world situation.

This branch as also been rebased, and hence now "ready".

Thanks for your time,
Bye 👋

@HorlogeSkynet HorlogeSkynet marked this pull request as ready for review April 20, 2024 15:37
@HorlogeSkynet HorlogeSkynet requested review from hartwork and a team April 20, 2024 15:39
@HorlogeSkynet HorlogeSkynet force-pushed the security/prevent_rootfs_escape branch from 9911c30 to af1c8ad Compare April 21, 2024 09:26
@HorlogeSkynet HorlogeSkynet mentioned this pull request Apr 21, 2024
3 tasks
@hartwork
Copy link
Contributor

Dear @python-distro/maintainers, Dear @hartwork (if you haven't muted notifications from this repository 🙃),

Hi @HorlogeSkynet, back in 2022-08 — soon two years ago — I implemented a solution on branch security/prevent_rootfs_escape_2 that still exists today. I remember you were unsure if the approach was the best way forward and called out to all maintainers for feedback (it's archived in https://github.com/orgs/python-distro/teams/maintainers/members/archived_team_posts.json for everyone with read permissions); the topic and my branch has since been ignored altogether, until your new commits here 5 days ago. 2022 is a long time ago, and I'm not interested in getting back into this topic again in 2024, myself. Thanks for your understanding, I wish this pull request the best.

Explicitly disabling third-programs data sources is not required
since 2a89f76 (disabled by default if `root_dir` is set).
@HorlogeSkynet HorlogeSkynet force-pushed the security/prevent_rootfs_escape branch from af1c8ad to 7a85cfa Compare April 27, 2024 12:38
@HorlogeSkynet
Copy link
Member Author

HorlogeSkynet commented Apr 27, 2024

Hey @hartwork, and thanks for your answer. Indeed, I completely forgot about this (these) discussions, since GitHub archived them...

I took a quick glimpse of your branch which I find very complex (but this bug/feature is not trivial at all, and my new algorithm is possibly sill broken).
Although, I've updated a test case according to your commit 238925a.

So I'll sum up the situation here again, so @python-distro/maintainers will have the full context to properly review this (these) branches :


The idea is to "fix" the --root-dir option (brought in #161), which allows a separate rootfs to be specified (from CLI, or through LinuxDistribution API). As noticed first during #241, and then #311, multiple issues are related to this parameter in it's current implementation :

  • [fixed during Add support for AIX (closes #241) #311] this led to "unknown" binary executions, which possibly depend on host information (like the running kernel with uname) or even unsupported architecture (e.g. an ARM rootfs on an x86_64 host)
  • paths under root_dir are mis-resolved to paths outside root_dir : this means that opening $root_dir/etc/os-release which links to ../../../../../etc/os-release (or even to /etc/os-release) would result in host' own OS-RELEASE file being read (and distro info inferred from there)
  • as a consequence, symbolic link loops are not detected (a kind of DoS could happen in some cases)

So we (@hartwork and I) worked on these issues without opting for solutions as requiring system privileges or using a third-party dependency.
There are two "final" implementations :

  • The one proposed in this MR, which eventually addresses Sebastian's concerns, and which :

    1. Recursively resolves links relatively to root_dir until a non-link file is found (in a best-effort manner) ;
    2. Assert this file is actually under root_dir before returning it ;
    3. Any directory symlink encountered is not supported (they won't be "bent" in chroot).
  • Sebastian's (not rebased onto master, which manually resolves file paths relatively to root_dir (with full NT paths support ?)

From here, we see multiple options that I'd like to propose/discuss with you :

  1. Choose and keep one of these implementations, as is ;
  2. Implement one of these algorithms internally, as a separate module in the distro package, and fully-optional (try..except ModuleNotFoundError for projects that vendor distro.py script, disabling --root-dir option when missing) ;
  3. Implement this algorithm internally, as a separate module in the distro package, and required (import [...]) ;
  4. Implement this algorithm externally (a third package that we [both of us] would maintain on PyPI, this would adding a [extra ?] dependency to this project) ;
  5. Drop the --root-dir parameter completely, close this MR and release distro v2 ;
  6. Use a feature/trick that we do not know about, safely resolving links relatively to a specified directory (and cross platform ?).

Thanks for your time, see you all 👋

@HorlogeSkynet HorlogeSkynet force-pushed the security/prevent_rootfs_escape branch from 7a85cfa to dc19990 Compare April 27, 2024 13:16
@HorlogeSkynet HorlogeSkynet removed the request for review from hartwork April 27, 2024 13:16
@HorlogeSkynet HorlogeSkynet dismissed hartwork’s stale review April 27, 2024 13:16

Implementation reworked and rebased

This patch is a followup of #311 (2a89f76).

It appeared that we were not resolving paths when reading from files.
This means that a symbolic link present under `root_dir` could be
blindly followed _outside_ of `root_dir`, possibly leading to host
files.

Note : this patch **only** changes `root_dir` behavior.
@HorlogeSkynet HorlogeSkynet force-pushed the security/prevent_rootfs_escape branch from dc19990 to 5e7a64f Compare April 27, 2024 18:45
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants