VYPR
Low severity · NVD Advisory · Published Mar 29, 2023 · Updated Feb 12, 2025

rootless: `/sys/fs/cgroup` is writable when cgroupns isn't unshared in runc

CVE-2023-25809

Description

runc is a CLI tool for spawning and running containers according to the OCI specification. In affected versions, rootless runc makes /sys/fs/cgroup writable under the following conditions: 1. when runc is executed inside the user namespace and config.json does not specify that the cgroup namespace be unshared (e.g., (docker|podman|nerdctl) run --cgroupns=host with Rootless Docker/Podman/nerdctl), or 2. when runc is executed outside the user namespace and /sys is mounted with rbind,ro (e.g., runc spec --rootless; this condition is very rare). A container may gain write access to the user-owned cgroup hierarchy /sys/fs/cgroup/user.slice/... on the host. Other users' cgroup hierarchies are not affected. Users are advised to upgrade to version 1.1.5. Users unable to upgrade may unshare the cgroup namespace ((docker|podman|nerdctl) run --cgroupns=private), which is the default behavior of Docker/Podman/nerdctl on cgroup v2 hosts, or add /sys/fs/cgroup to maskedPaths.

AI Insight

LLM-synthesized narrative grounded in this CVE's description and references.

Rootless runc improperly makes /sys/fs/cgroup writable when cgroup namespace is not unshared, allowing containers to write to host user cgroup hierarchy.

Vulnerability

CVE-2023-25809 in runc, a CLI tool for spawning and running OCI containers, makes /sys/fs/cgroup writable in rootless mode under specific conditions. The bug occurs when runc is executed inside a user namespace but the cgroup namespace is not unshared (e.g., --cgroupns=host with rootless Docker/Podman/nerdctl), or, in rare cases, when runc is run outside a user namespace with /sys mounted as rbind,ro. In the first case runc cannot mount cgroup2 (the mount fails with EPERM or EBUSY) and falls back to bind-mounting the host's cgroup hierarchy; before the fix, that fallback did not go through the read-only remount step that the patched code performs via mountToRootfs(), leaving the cgroup filesystem writable [2][4].
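The first condition can be checked mechanically against a bundle's config.json. The following is a minimal Go sketch, not part of runc: the names `ociConfig` and `sharesHostCgroupNS` are hypothetical, and the check only looks for a `cgroup` entry in `linux.namespaces`. It does not verify that runc is actually running inside a user namespace, and it does not distinguish an entry that joins an existing namespace through a `path` from one that creates a new namespace.

```go
package main

import "encoding/json"

// ociConfig models only the part of an OCI runtime config.json that this
// sketch needs. The type name is hypothetical.
type ociConfig struct {
	Linux struct {
		Namespaces []struct {
			Type string `json:"type"`
		} `json:"namespaces"`
	} `json:"linux"`
}

// sharesHostCgroupNS reports whether the config lists no "cgroup" namespace,
// which is the precondition described for the first affected scenario.
// Assumption: absence of the entry means the cgroup namespace is not unshared.
func sharesHostCgroupNS(configJSON []byte) (bool, error) {
	var cfg ociConfig
	if err := json.Unmarshal(configJSON, &cfg); err != nil {
		return false, err
	}
	for _, ns := range cfg.Linux.Namespaces {
		if ns.Type == "cgroup" {
			return false, nil
		}
	}
	return true, nil
}
```

A caller would read config.json from the bundle directory and treat a `true` result on a runc version below 1.1.5 as a reason to apply one of the workarounds.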

Exploitation

An attacker must have the ability to run a container with rootless runc and either explicitly set --cgroupns=host or use a configuration that does not unshare the cgroup namespace. The attack does not require additional privileges within the container. The exploit allows the container to write to the user-owned cgroup hierarchy at /sys/fs/cgroup/user.slice/... on the host [2][4].
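Whether a given container is exposed can be observed from inside it by looking at the mount options for /sys/fs/cgroup. The sketch below, in Go, follows the same idea as the integration test shipped with the fix (take the last /proc/mounts entry for each cgroup mount point and check for the `ro` option). The function name `writableCgroupMounts` is hypothetical, and the parsing assumes the usual whitespace-separated /proc/mounts layout.

```go
package main

import "strings"

// writableCgroupMounts returns the mount points at or below /sys/fs/cgroup
// whose most recent entry in the given /proc/mounts content lacks "ro".
// Only the last entry per mount point is considered, because a later mount
// on the same path hides the earlier ones.
func writableCgroupMounts(procMounts string) []string {
	lastOpts := map[string]string{}
	var order []string
	for _, line := range strings.Split(procMounts, "\n") {
		fields := strings.Fields(line)
		if len(fields) < 4 {
			continue // blank or malformed line
		}
		mountPoint, opts := fields[1], fields[3]
		if mountPoint != "/sys/fs/cgroup" && !strings.HasPrefix(mountPoint, "/sys/fs/cgroup/") {
			continue
		}
		if _, seen := lastOpts[mountPoint]; !seen {
			order = append(order, mountPoint)
		}
		lastOpts[mountPoint] = opts
	}

	var writable []string
	for _, mountPoint := range order {
		readOnly := false
		for _, opt := range strings.Split(lastOpts[mountPoint], ",") {
			if opt == "ro" {
				readOnly = true
				break
			}
		}
		if !readOnly {
			writable = append(writable, mountPoint)
		}
	}
	return writable
}
```

Inside a container, the input would be the content of /proc/mounts; an empty result means every cgroup mount point is read-only.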

Impact

A container with such write access can modify cgroup settings (e.g., adjust resource limits) for processes within the same user slice, potentially enabling denial of service or resource exhaustion attacks. Other users' cgroup hierarchies remain unaffected [2][4].

Mitigation

Upgrade to runc version 1.1.5, which includes a fix that ensures the cgroup filesystem is not made writable in these scenarios [3]. For users unable to upgrade, workarounds include explicitly unsharing the cgroup namespace (--cgroupns=private, the default on cgroup v2 hosts) or adding /sys/fs/cgroup to the container's maskedPaths [2][4].
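For bundles managed by hand, the maskedPaths workaround amounts to a small edit of config.json. The following Go sketch shows one way to apply it; the function name `addMaskedCgroupPath` is hypothetical, and the code assumes the standard OCI layout in which `maskedPaths` is a string array under `linux`. Re-encoding through a map sorts the object keys, so the output is equivalent JSON rather than a byte-for-byte edit.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
)

// addMaskedCgroupPath returns the config with "/sys/fs/cgroup" present in
// linux.maskedPaths, adding it only if it is missing.
func addMaskedCgroupPath(configJSON []byte) ([]byte, error) {
	var cfg map[string]interface{}
	dec := json.NewDecoder(bytes.NewReader(configJSON))
	dec.UseNumber() // keep large integers (e.g. resource limits) exact
	if err := dec.Decode(&cfg); err != nil {
		return nil, err
	}
	if cfg == nil {
		return nil, errors.New("config is not a JSON object")
	}

	linux, ok := cfg["linux"].(map[string]interface{})
	if !ok {
		if _, present := cfg["linux"]; present {
			return nil, errors.New("linux is not a JSON object")
		}
		linux = map[string]interface{}{}
		cfg["linux"] = linux
	}

	var masked []interface{}
	if raw, present := linux["maskedPaths"]; present {
		masked, ok = raw.([]interface{})
		if !ok {
			return nil, errors.New("linux.maskedPaths is not an array")
		}
	}
	found := false
	for _, p := range masked {
		if p == "/sys/fs/cgroup" {
			found = true
			break
		}
	}
	if !found {
		linux["maskedPaths"] = append(masked, "/sys/fs/cgroup")
	}
	return json.Marshal(cfg)
}
```

Masking hides the cgroup filesystem from the container entirely, so workloads that need to read their own cgroup files may prefer unsharing the cgroup namespace or upgrading instead.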

AI Insight generated on May 20, 2026. Synthesized from this CVE's description and the cited reference URLs; citations are validated against the source bundle.

Affected packages

Versions sourced from the GitHub Security Advisory.

Package                              Affected versions   Patched versions
github.com/opencontainers/runc (Go)  < 1.1.5             1.1.5
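The affected range can be tested in a deployment script before deciding whether a workaround is needed. The Go sketch below is a hedged illustration only: `isAffectedRuncVersion` is a hypothetical helper, it expects a `major.minor.patch` string such as the one reported by the installed runc, and it ignores any pre-release or build suffix, which is a simplification rather than full semantic-version ordering.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// isAffectedRuncVersion reports whether version is lower than 1.1.5, the
// patched release named in this advisory.
// Simplification: a suffix such as "-rc93" is dropped before comparing.
func isAffectedRuncVersion(version string) (bool, error) {
	version = strings.TrimPrefix(version, "v")
	if i := strings.IndexAny(version, "-+"); i >= 0 {
		version = version[:i]
	}
	parts := strings.Split(version, ".")
	if len(parts) != 3 {
		return false, fmt.Errorf("unexpected version format: %q", version)
	}
	patched := [3]int{1, 1, 5}
	for i, part := range parts {
		n, err := strconv.Atoi(part)
		if err != nil {
			return false, err
		}
		if n != patched[i] {
			// The first differing component decides the ordering.
			return n < patched[i], nil
		}
	}
	return false, nil // exactly 1.1.5
}
```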

Affected products

68

Patches

1
0d62b950e60f

Merge pull request from GHSA-m8cg-xc2p-r3fc

https://github.com/opencontainers/runc · Qiang Huang · Mar 29, 2023 · via ghsa
2 files changed · +51 -19
  • libcontainer/rootfs_linux.go · +34 -19 · modified
    @@ -306,26 +306,41 @@ func mountCgroupV2(m *configs.Mount, c *mountConfig) error {
     	if err := os.MkdirAll(dest, 0o755); err != nil {
     		return err
     	}
    -	return utils.WithProcfd(c.root, m.Destination, func(procfd string) error {
    -		if err := mount(m.Source, m.Destination, procfd, "cgroup2", uintptr(m.Flags), m.Data); err != nil {
    -			// when we are in UserNS but CgroupNS is not unshared, we cannot mount cgroup2 (#2158)
    -			if errors.Is(err, unix.EPERM) || errors.Is(err, unix.EBUSY) {
    -				src := fs2.UnifiedMountpoint
    -				if c.cgroupns && c.cgroup2Path != "" {
    -					// Emulate cgroupns by bind-mounting
    -					// the container cgroup path rather than
    -					// the whole /sys/fs/cgroup.
    -					src = c.cgroup2Path
    -				}
    -				err = mount(src, m.Destination, procfd, "", uintptr(m.Flags)|unix.MS_BIND, "")
    -				if c.rootlessCgroups && errors.Is(err, unix.ENOENT) {
    -					err = nil
    -				}
    -			}
    -			return err
    -		}
    -		return nil
    +	err = utils.WithProcfd(c.root, m.Destination, func(procfd string) error {
    +		return mount(m.Source, m.Destination, procfd, "cgroup2", uintptr(m.Flags), m.Data)
     	})
    +	if err == nil || !(errors.Is(err, unix.EPERM) || errors.Is(err, unix.EBUSY)) {
    +		return err
    +	}
    +
    +	// When we are in UserNS but CgroupNS is not unshared, we cannot mount
    +	// cgroup2 (#2158), so fall back to bind mount.
    +	bindM := &configs.Mount{
    +		Device:           "bind",
    +		Source:           fs2.UnifiedMountpoint,
    +		Destination:      m.Destination,
    +		Flags:            unix.MS_BIND | m.Flags,
    +		PropagationFlags: m.PropagationFlags,
    +	}
    +	if c.cgroupns && c.cgroup2Path != "" {
    +		// Emulate cgroupns by bind-mounting the container cgroup path
    +		// rather than the whole /sys/fs/cgroup.
    +		bindM.Source = c.cgroup2Path
    +	}
    +	// mountToRootfs() handles remounting for MS_RDONLY.
    +	// No need to set c.fd here, because mountToRootfs() calls utils.WithProcfd() by itself in mountPropagate().
    +	err = mountToRootfs(bindM, c)
    +	if c.rootlessCgroups && errors.Is(err, unix.ENOENT) {
    +		// ENOENT (for `src = c.cgroup2Path`) happens when rootless runc is being executed
    +		// outside the userns+mountns.
    +		//
    +		// Mask `/sys/fs/cgroup` to ensure it is read-only, even when `/sys` is mounted
    +		// with `rbind,ro` (`runc spec --rootless` produces `rbind,ro` for `/sys`).
    +		err = utils.WithProcfd(c.root, m.Destination, func(procfd string) error {
    +			return maskPath(procfd, c.label)
    +		})
    +	}
    +	return err
     }
     
     func doTmpfsCopyUp(m *configs.Mount, rootfs, mountLabel string) (Err error) {
    
  • tests/integration/mounts.bats · +17 -0 · modified
    @@ -63,3 +63,20 @@ function teardown() {
     	runc run test_busybox
     	[ "$status" -eq 0 ]
     }
    +
    +# https://github.com/opencontainers/runc/security/advisories/GHSA-m8cg-xc2p-r3fc
    +@test "runc run [ro /sys/fs/cgroup mount]" {
    +	# With cgroup namespace
    +	update_config '.process.args |= ["sh", "-euc", "for f in `grep /sys/fs/cgroup /proc/mounts | awk \"{print \\\\$2}\"| uniq`; do grep -w $f /proc/mounts | tail -n1; done"]'
    +	runc run test_busybox
    +	[ "$status" -eq 0 ]
    +	[ "${#lines[@]}" -ne 0 ]
    +	for line in "${lines[@]}"; do [[ "${line}" == *'ro,'* ]]; done
    +
    +	# Without cgroup namespace
    +	update_config '.linux.namespaces -= [{"type": "cgroup"}]'
    +	runc run test_busybox
    +	[ "$status" -eq 0 ]
    +	[ "${#lines[@]}" -ne 0 ]
    +	for line in "${lines[@]}"; do [[ "${line}" == *'ro,'* ]]; done
    +}
    

References

4
