Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

CVE-2024-1329 - Arbitrary Write Through Symlink Attack #19888

Closed
tgross opened this issue Feb 6, 2024 · 2 comments
Closed

CVE-2024-1329 - Arbitrary Write Through Symlink Attack #19888

tgross opened this issue Feb 6, 2024 · 2 comments

Comments

@tgross
Copy link
Member

tgross commented Feb 6, 2024

Nomad 1.7.4 and Nomad Enterprise 1.7.4 have been released with an important security update, as well as backports to Nomad and Nomad Enterprise 1.6.7 and 1.5.14.

A bug was discovered in Nomad’s template rendering that allows a malicious or compromised task to cause the template renderer to write to arbitrary files on the host as the Nomad client user (typically root) by using a symlink to bypass safety checks done in the client. All versions of Nomad and Nomad Enterprise are impacted.

To protect Nomad client hosts from this attack, Nomad now reads template sources and writes template destination files in a sandboxed subprocess.

Remediation
Users should upgrade Nomad to v1.7.4, 1.6.7, or 1.5.15. Upgrading the Nomad clients is sufficient to mitigate the bug, although we recommend keeping Nomad servers and clients on the same version. The mitigation for this can be deactivated by setting client.disable_file_sandbox=true on Nomad client configuration.

This remediation does not protect raw_exec tasks on Windows, which have unrestricted access to the host. The Nomad team strongly recommends against allowing raw_exec tasks with untrusted workloads.

Users on Windows who are running Windows containers with the docker task driver can further protect their clients against this attack by ensuring that Docker containers do not run as the default ContainerAdministrator user, but instead run as the ContainerUser user (which cannot create symlinks).

@tgross tgross added the type/bug label Feb 6, 2024
@tgross tgross added this to the 1.7.x milestone Feb 6, 2024
@tgross tgross self-assigned this Feb 6, 2024
@tgross tgross changed the title (placeholder) CVE-2024-1329 - Arbitrary Write Through Symlink Attack Feb 8, 2024
@tgross tgross closed this as completed Feb 8, 2024
tgross added a commit that referenced this issue Feb 8, 2024
The Nomad client renders templates in the same privileged process used for most
other client operations. During internal testing, we discovered that a malicious
task can create a symlink that can cause template rendering to read and write to
arbitrary files outside the allocation sandbox. Because the Nomad agent can be
restarted without restarting tasks, we can't simply check that the path is safe
at the time we write without encountering a time-of-check/time-of-use race.

To protect Nomad client hosts from this attack, we'll now read and write
templates in a subprocess:

* On Linux/Unix, this subprocess is sandboxed via chroot to the allocation
  directory. This requires that Nomad is running as a privileged process. A
  non-root Nomad agent will warn that it cannot sandbox the template renderer.

* On Windows, this process is sandboxed via a Windows AppContainer which has
  been granted access to only to the allocation directory. This does not require
  special privileges on Windows. (Creating symlinks in the first place can be
  prevented by running workloads as non-Administrator or
  non-ContainerAdministrator users.)

Both sandboxes cause encountered symlinks to be evaluated in the context of the
sandbox, which will result in a "file not found" or "access denied" error,
depending on the platform. This change will also require an update to
Consul-Template to allow callers to inject a custom `ReaderFunc` and
`RenderFunc`.

This design is intended as a workaround to allow us to fix this bug without
creating backwards compatibility issues for running tasks. A future version of
Nomad may introduce a read-only mount specifically for templates and artifacts
so that tasks cannot write into the same location that the Nomad agent is.

Fixes: #19888
Fixes: CVE-2024-1329
tgross added a commit to hashicorp/consul-template that referenced this issue Feb 8, 2024
Some consumers of `consul-template` use it like a library, where the application
runs the runner in-process. For projects like Nomad which need to run with a
high level of privilege, this is problematic in that its challenging to secure
the operations that read and write from disk without running the entirety of CT
as an external process (which carries a lot of overhead for Nomad workloads).

Add a `RendererFunc` and `ReaderFunc` interface to allow Nomad to inject a
sandboxed subprocess when reading from disk and writing to disk.

This implementation is currently being used in Nomad 1.7.4, 1.6.7, and
1.5.14 as a mitigation for hashicorp/nomad#19888. See
hashicorp/nomad@df86503
for example usage.
tgross added a commit to hashicorp/consul-template that referenced this issue Feb 8, 2024
Some consumers of `consul-template` use it like a library, where the application
runs the runner in-process. For projects like Nomad which need to run with a
high level of privilege, this is problematic in that its challenging to secure
the operations that read and write from disk without running the entirety of CT
as an external process (which carries a lot of overhead for Nomad workloads).

Add a `RendererFunc` and `ReaderFunc` interface to allow Nomad to inject a
sandboxed subprocess when reading from disk and writing to disk.

This implementation is currently being used in Nomad 1.7.4, 1.6.7, and
1.5.14 as a mitigation for hashicorp/nomad#19888. See
hashicorp/nomad@df86503
for example usage.
tgross added a commit to hashicorp/consul-template that referenced this issue Feb 8, 2024
Some consumers of `consul-template` use it like a library, where the application
runs the runner in-process. For projects like Nomad which need to run with a
high level of privilege, this is problematic in that its challenging to secure
the operations that read and write from disk without running the entirety of CT
as an external process (which carries a lot of overhead for Nomad workloads).

Add a `RendererFunc` and `ReaderFunc` interface to allow Nomad to inject a
sandboxed subprocess when reading from disk and writing to disk.

This implementation is currently being used in Nomad 1.7.4, 1.6.7, and
1.5.14 as a mitigation for hashicorp/nomad#19888. See
hashicorp/nomad@df86503
for example usage.
tgross added a commit to hashicorp/consul-template that referenced this issue Mar 4, 2024
Some consumers of `consul-template` use it like a library, where the application
runs the runner in-process. For projects like Nomad which need to run with a
high level of privilege, this is problematic in that its challenging to secure
the operations that read and write from disk without running the entirety of CT
as an external process (which carries a lot of overhead for Nomad workloads).

Add a `RendererFunc` and `ReaderFunc` interface to allow Nomad to inject a
sandboxed subprocess when reading from disk and writing to disk.

This implementation is currently being used in Nomad 1.7.4, 1.6.7, and
1.5.14 as a mitigation for hashicorp/nomad#19888. See
hashicorp/nomad@df86503
for example usage.
tgross added a commit to hashicorp/consul-template that referenced this issue Mar 5, 2024
Some consumers of `consul-template` use it like a library, where the application
runs the runner in-process. For projects like Nomad which need to run with a
high level of privilege, this is problematic in that its challenging to secure
the operations that read and write from disk without running the entirety of CT
as an external process (which carries a lot of overhead for Nomad workloads).

Add a `RendererFunc` and `ReaderFunc` interface to allow Nomad to inject a
sandboxed subprocess when reading from disk and writing to disk.

This implementation is currently being used in Nomad 1.7.4, 1.6.7, and
1.5.14 as a mitigation for hashicorp/nomad#19888. See
hashicorp/nomad@df86503
for example usage.
@tomqwpl
Copy link

tomqwpl commented May 14, 2024

On the face of it this may have caused the template sandboxing to leak access control list entries on the nomad.exe process.
For details of the issue I'm seeing see #20585
The issue goes away with the "disable_file_sandbox" option. At a guess the sandboxing code wasn't new, just used in a new place, but it appears that the sandboxing has the effect of adding access control list entries to the nomad.exe that then never get deleted. After a while the ACL gets too long so the process never subsequently succeeds, so no further jobs that use templating can then be run by nomad :-(

Copy link

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 28, 2024
# for free to subscribe to this conversation on GitHub. Already have an account? #.
Projects
None yet
Development

No branches or pull requests

2 participants