libpathrs was designed alongside openat2(2) (available since Linux 5.6)
and dynamically tries to use the latest kernel features to provide the maximum
possible protection against racing attackers. However, it also provides support
for older kernel versions (in theory up to Linux 2.6.39 but we do not currently
test this) by emulating newer kernel features in userspace.
However, we strongly recommend you use at least Linux 5.6 to get a reasonable amount of protection against various attacks, and ideally at least Linux 6.8 to make use of all of the protections we have implemented.
See the following table for what kernel features we optionally support and what they are used for.
| Feature | Minimum Kernel Version | Description | Fallback |
|---|---|---|---|
/proc/thread-self |
Linux 3.17 (2014-10-05) | Used when operating on the current thread's /proc directory for use with PATHRS_PROC_THREAD_SELF. |
/proc/self/task/$tid is used, but this might not be available in some edge cases so /proc/self is used as a final fallback. |
open_tree(2) |
Linux 5.2 (2019-07-07) | Used to create a private procfs handle when operating on /proc (this is a copy of the host /proc -- in most cases this will also strip any overmounts). Requires CAP_SYS_ADMIN privileges. |
Open a regular handle to /proc. This can lead to certain race attacks if the attacker can dynamically create mounts. |
fsopen(2) |
Linux 5.2 (2019-07-07) | Used to create a private procfs handle when operating on /proc (with a completely fresh copy of /proc -- in some cases this operation will fail if there are locked overmounts on top of /proc). Requires CAP_SYS_ADMIN privileges. |
Try to use open_tree(2) instead -- in the case of errors due to locked overmounts, open_tree(2) will be used to create a recursive copy that preserves the overmounts. This means that an attacker would not be able to actively change the mounts on top of /proc but there might be some overmounts that libpathrs will detect (and reject). |
openat2(2) |
Linux 5.6 (2020-03-29) | In-kernel restrictions of path lookup. This is used extensively by libpathrs to safely do path lookups. |
Userspace emulated path lookups. |
subset=pid |
Linux 5.8 (2020-08-02) | Allows for a procfs handle created with fsopen(2) to not contain any global procfs files that would be dangerous for an attacker to write to. Detached procfs mounts with subset=pid are deemed safe(r) to leak into containers and so libpathrs will internally cache subset=pid ProcfsHandles. |
libpathrs's ProcfsHandles will have global files and thus libpathrs will not cache a copy of the file descriptor for each operation (possibly causing substantially higher syscall usage as a result -- our testing found that this can have a performance impact in some cases). |
STATX_MNT_ID |
Linux 5.8 (2020-08-02) | Used to verify whether there are bind-mounts on top of /proc that could result in insecure operations (on systems with fsopen(2) or open_tree(2) this protection is somewhat redundant for privileged programs -- those kinds of procfs handles will typically not have overmounts.) |
Parse the /proc/thread-self/fdinfo/$fd directly -- for systems with openat2(2), this is guaranteed to be safe against attacks. For systems without openat2(2), we have to fallback to unsafe opens that could be fooled by bind-mounts -- however, we believe that exploitation of this would be difficult in practice (even with an attacker that has the persistent ability to mount to arbitrary paths) due to the way we verify procfs accesses. |
ino field in fdinfo |
Linux 5.14 (2021-08-29) | Used to harden the emulation of RESOLVE_NO_XDEV withfdinfo if openat2(2) and STATX_MNT_ID are blocked or otherwise unavailable. |
None. We do still check the mnt_id field but because the value is static and attacker-known, in princple an attacker could (via a somewhat complicated attack) overmount fake fdinfo files to hide a mountpoint from libpathrs. |
STATX_MNT_ID_UNIQUE |
Linux 6.8 (2024-03-10) | Used for the same reason as STATX_MNT_ID, but allows us to protect against mount ID recycling. This is effectively a safer version of STATX_MNT_ID. |
STATX_MNT_ID is used (see the STATX_MNT_ID fallback if it's not available either). |
For more information about the work behind openat2(2), you can read the
following LWN articles (note that the merged version of openat2(2) is
different to the version described by LWN):