Skip to content

std::PathBuff::join("/some/path") overrides the original path in the resulting PathBuf #59726

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
aruiz opened this issue Apr 5, 2019 · 10 comments
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@aruiz
Copy link

aruiz commented Apr 5, 2019

When join's argument start's with "/" then join overrides any path present in &self making this code valid:

let foo = PathBuf::new("/hello");
assert_eq!(foo.join("/world"), PathBuf::new("/world));

This is not documented and generates a lot of confusion.

@jonas-schievink
Copy link
Contributor

It is documented: https://doc.rust-lang.org/std/path/struct.PathBuf.html#method.push

(Path::join refers to PathBuf::push)

@sfackler sfackler closed this as completed Apr 5, 2019
@sfackler sfackler reopened this Apr 5, 2019
@Proximyst
Copy link

Would it not be possible to add something such as PathBuf::join_safely which would handle such cases? This'd make it easier to avoid issues with files such as with webservers

@tesuji
Copy link
Contributor

tesuji commented Apr 6, 2019

@Proximyst You could use Path::is_absolute to check if the path is independent of the current directory.

@jonas-schievink jonas-schievink added C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Apr 21, 2019
@adumbidiot
Copy link

I’d also like to add that this has resulted in a security issue in tower-http and warp, and the latter isn’t fixed yet. is_absolute isn’t good enough as the path is also replaced if the segment contains a prefix, and there doesn’t seem to be an easy way to check for that.

@ChrisDenton ChrisDenton added the A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` label Jan 26, 2023
@TheAlgorythm
Copy link

I've implemented a crate to mitigate path traversal vulnerabilities: https://crates.io/crates/path_ratchet/ (it currently doesn't support join just push)
Maybe something like this can be added to std. What I like on my approach is that it's on the type level and therefore harder to misuse.

@NickKhalow
Copy link

NickKhalow commented Mar 28, 2025

I support the initiative to add join_safely. Behaviour of join is missleading, because the caller doesn't expect the method will overwrite the whole content of the path instead of joining the new part to it (it's called join after all!)

@hanna-kruppe
Copy link
Contributor

Just preventing the rhs of join/push from overwriting the whole path is insufficient for preventing path traversal vulnerabilities. path_ratchet goes further by also disallowing any .. components, but is both overly restrictive for some use cases (doesn't allow all relative paths that stay within the intended root directory) and insufficient in some scenarios (e.g., symlinks pointing to a path outside the intended root directory). Also note that:

  • The issue in tower-http / warp wasn't due to a programmer misunderstanding the join behavior -- the code intentionally tried to sanitize the path components because the developer was aware of join's behavior, but missed some Windows-specific cases. IMO the main lesson here is that sanitizing paths properly is hard.
  • Just adding a new method with a longer and less obvious name won't help, if join was considered a footgun that needs replacement it would also have to be deprecated. That would obviously lead to a lot of churn, and what's the replacement for the use cases that do want the "rhs can override lhs entirely" behavior (e.g., for paths that may be either absolute or relative to an implied root that's not the CWD)?

I agree that path traversal vulnerabilities are unnecessarily hard to prevent, but nothing proposed in this thread looks like a slam-dunk solution to me. Something closer to the os.Root API Go is introducing seems both more usable and more robust. cc #120426 for a related but different proposal for Rust.

@adumbidiot
Copy link

adumbidiot commented Mar 28, 2025

I agree that join, shouldn't be deprecated or replaced.

I agree that path traversal vulnerabilities are unnecessarily hard to prevent, but nothing proposed in this thread looks like a slam-dunk solution to me. Something closer to the os.Root API Go is introducing seems both more usable and more robust. cc #120426 for a related but different proposal for Rust.

Also see cap-std for a Rust library that implements a similar pattern.
However, I disagree that nothing should be done about join, especially its behavior around prefixes (which caused the warp/tower-http bugs). Here's a small example of the issue.

Consider the following example, on Windows:

use std::path::Path;

fn main() {
    let base_path = Path::new("/static/serve/dir");
    let mut final_path = base_path.to_path_buf();
    let path_components = Path::new( "c:/a/b/c");
    final_path.push(path_components);
    println!("{:?}", final_path);
}

Unsurprisingly, it results in "c:/a/b/c", replacing the entire path. According to the documentation for join "If path is absolute, it replaces the current path.". Alright, so would rejecting absolute paths be enough? For this path, yes.

use std::path::Path;

fn main() {
    let base_path = Path::new("/static/serve/dir");
    let mut final_path = base_path.to_path_buf();
    let path_components = Path::new("c:/a/b/c");
    assert!(!path_components.is_absolute(), "bad path");
    final_path.push(path_components);
    println!("{:?}", final_path);
}

This will trigger the assert. But what about a non-absolute prefix path?

use std::path::Path;

fn main() {
    let base_path = Path::new("/static/serve/dir");
    let mut final_path = base_path.to_path_buf();
    let path_components = Path::new("c:a/b/c");
    assert!(!path_components.is_absolute(), "bad path");
    final_path.push(path_components);
    println!("{:?}", final_path);
}

Surprisingly (at least to me), this works, resulting in "c:a/b/c". Why was the entire path replaced? If you look at the documentation for join again, it says nothing directly. However, you can see that it links to PathBuf::push for more about path joining behavior, which says "if path has a prefix but no root, it replaces self.". I think this behavior isn't immediately obvious from the documentation. Also note that if join did nothing here and just added the raw path, there wouldn't be a problem: the resulting path would be something like "/static/serve/dir/c:a/b/c" which would fail if you tried to use it for anything due to the ":" char, which is invalid for Windows paths.

So how could this be fixed? Looking at the source code for PathBuf::push, you can see that it calls uses path.is_absolute() || path.prefix().is_some() to determine whether to replace that path. However, the path.prefix() function is not public. To my knowledge, the only way to make this work is to create a component iterator from the path and check if the first is a prefix:

use std::path::Path;

fn main() {
    let base_path = Path::new("/static/serve/dir");
    let mut final_path = base_path.to_path_buf();
    let path_components = Path::new("c:a/b/c");
    assert!(!path_components.is_absolute(), "bad path");
     if matches!(
        path_components.components().next(),
        Some(std::path::Component::Prefix(_))
    ) {
        panic!("bad path");
    }
    final_path.push(path_components);
    println!("{:?}", final_path);
}

This will reject the relative prefix path.

I think the situation around join can be improved by making it more obvious that it can replace the entire path and under what cases it will do so in the documentation. I also think a new method to get the prefix of a path should be added or maybe a method that determines if pushing a path segment will cause the original path to be replaced. A new method that joins while never replacing the original path would also be nice. While path sanitization is hard, I believe that this particular issue stems from the join method itself. And while the example here obviously isn't sufficient for a path sanitization setup, I think this method should try to allow users to create these setups as easily as possible instead of forcing them to discover and fight its behavior.

Oops, like I used push instead of join in the examples, but the the point stands.

@TheAlgorythm
Copy link

TheAlgorythm commented Mar 29, 2025

There are two different problems which shouldn't be conflated:

  • "Classic" Path Traversal: This is based on the semantics of paths and prevented by something like path_ratchet.
  • FS misuse: Based on fs features like symlinks. These are not path traversal attacks and it depends on the system, threat model and so on whether this is an unwanted or wanted behaviour. It is prevented by something like cap-std and depends on the OS and the existence of the directory/file. This is harder to get right because of TOCTOU.

Both solutions have their use case.
Ideally both build on the same API where path traversal is always (unless opt-outed) prevented and the misuse resistance depends on the fs method and uses the Informations from the first step.

path_ratchet goes further by also disallowing any .. components, but is both overly restrictive for some use cases (doesn't allow all relative paths that stay within the intended root directory)

@hanna-kruppe Do you mean something like a/../a/b?

the code intentionally tried to sanitize the path components because the developer was aware of join's behavior, but missed some Windows-specific cases. IMO the main lesson here is that sanitizing paths properly is hard.

👍 This is the reason why sanitisation/validation is bad and the principal "Parse, don’t validate" should be applied.
This means using the existing parser of Path and check it's structure like I've did it with path_ratchet.

@hanna-kruppe
Copy link
Contributor

This issue is a bit all over the place now. I see at least three separate actionable ideas floating around that one could pursue:

  • Tweak the documentation of join to include more information (e.g., Path::join only mentions absolute paths and to get the full story one has to follow the link to PathBuf::push). Anyone could just submit a PR for this.
  • Expose a new API tailored for the "will pushing/joining this onto a path override it" -- this seems like a small, uncontroversial addition to me if it's true that it exists internally and is also needed outside the standard library, but isn't exposed currently. This could be a small API change proposal.
  • Larger API additions that provide higher level solutions for some of the problems around path traversal / sandboxing / TOCTOU (I agree there are multiple that probably need distinct solutions). This could also be API change proposal(s) but since it's a wide open design space and it overlaps with the ongoing work on directory handles (Tracking Issue for directory handles #120426), this may be a more challenging process.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-io Area: `std::io`, `std::fs`, `std::net` and `std::path` C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

10 participants