Rationale behind replacing paths while joining?

dimanne · December 22, 2023, 9:39am

Apparently, when I join two absolute paths, instead of the joining I expected, the first one is completely replaced. This is from the documentation of Path::join:

> Creates an owned PathBuf with path adjoined to self.
>
> If path is absolute, it replaces the current path.
>
>     assert_eq!(Path::new("/etc").join("/bin/sh"), PathBuf::from("/bin/sh"));

This is super counter-intuitive. For example, if I type cd /etc/bin/asdf, I end up in /etc/bin/asdf, not in /bin/asdf, yet Path::new("/etc").join("/bin/asdf") produces /bin/asdf (!).

It seems that this is a common source of confusion:

- On this forum: Surprising behaviour of Path.join
- On SO: Why does joining paths completely replace the original path in Rust?
- On GitHub: Path::join should concat paths even if the second path is absolute
- I also know that C++ has this (broken, imho) behaviour too.

My main questions:

1. What was the rationale/motivation behind implementing this behaviour for join? Why does it replace the first path when the second one is absolute?
   - If I were designing this, I would say that it should behave just like regular string concatenation, omitting repeated slashes. So, if one path ends with / and the other starts with /, it should drop one of them. This would match how a terminal/bash works.
2. Is there anything that could fix the issue? Should I just create my own join function (which would remove the leading / from the 2nd path and then use join)?..

P.S.

Looks like for an answer to the 2nd question I can use strip_prefix (?):

    use std::path::Path;

    let path = Path::new("/test/haha/foo.txt");
    // Stripping the root "/" leaves a relative path.
    assert_eq!(path.strip_prefix("/"), Ok(Path::new("test/haha/foo.txt")));
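For reference, here is a minimal sketch of how strip_prefix and join might be combined into an "always append" helper. The name join_appending is hypothetical (nothing like it exists in std), and the sketch assumes Unix-style paths only; Windows drive prefixes such as C:\ are not handled.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of std): always appends `tail` under `base`,
/// even when `tail` starts with a root `/`. Unix-style paths only.
fn join_appending(base: &Path, tail: &Path) -> PathBuf {
    // `strip_prefix("/")` returns Err when `tail` has no leading root,
    // i.e. when it is already relative; in that case use it unchanged.
    let relative = tail.strip_prefix("/").unwrap_or(tail);
    // `relative` no longer has a root, so `join` appends instead of replacing.
    base.join(relative)
}
```

With this, join_appending(Path::new("/etc"), Path::new("/bin/sh")) should yield /etc/bin/sh, whereas plain join yields /bin/sh.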

GeniusIsme · December 22, 2023, 9:51am

It would probably be better to have a separate type for absolute paths.
But as things stand, whatever join does will look broken to someone. Rust chose the behaviour that was already used in other languages.

DanielKeep · December 22, 2023, 10:00am

I'm not really sure how you arrived at your shell example. Path::new("/etc").join("/usr/bin") would actually be equivalent to:

/etc $ cd /usr/bin

Which would, as you should expect, try to change the directory to /usr/bin and not /etc/usr/bin. If you wanted to change to /etc/usr/bin, you'd use cd usr/bin. That is, you'd use a relative path without a leading slash.
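To make the analogy concrete, here is a rough sketch that models cd as a join against the current directory. The cd function is a hypothetical illustration, not a real API, and it ignores things a real shell does (symlinks, "..", checking that the directory exists).

```rust
use std::path::{Path, PathBuf};

/// Hypothetical model of the shell's `cd`: resolve `target` against the
/// directory we are currently in.
fn cd(current_dir: &Path, target: &str) -> PathBuf {
    // A relative `target` is appended to `current_dir`;
    // an absolute `target` replaces it, just like in a shell.
    current_dir.join(target)
}
```

Under this model, cd(Path::new("/etc"), "/usr/bin") gives /usr/bin, and cd(Path::new("/etc"), "usr/bin") gives /etc/usr/bin.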

This is, in my opinion, a completely reasonable design that is entirely consistent with how every single shell and operating system I've ever used works.

dimanne · December 22, 2023, 10:06am

Oh... I think now I at least understand how it could be introduced this way...

It looks like you can imagine joining of p1 and p2 as:

cd $p1
cd $p2

This is certainly not what comes to my mind first.

I arrived at my example like this:

What would the shell do if I ran cd $p1/$p2? I personally think this matches "joining" paths more closely, but thank you for the explanation. I think it answers my first question.

DanielKeep · December 22, 2023, 10:11am

Well, if you applied that to your earlier example, you'd end up with /etc//usr/bin, which isn't a normalized path: POSIX happens to treat the doubled slash as a single one, but you're now relying on that.

I've seen, written, and laboriously debugged enough path-handling code over the years that I actively resist treating paths as strings. That's why when I think about path operations, I think about filesystem operations, not string operations.

dimanne · December 22, 2023, 10:17am

Yep... Hence Path::join, which should do a bit more than just a bare string1 + "/" + string2.

jdahlstrom · December 22, 2023, 10:34am

The behavior is useful because a caller (or a config file, or whatever) can choose whether to supply a relative or an absolute path. The callee can then simply absolutize it by joining it onto its own prefix; an absolute path is left unaffected, which is probably what the caller wanted. The callee doesn't have to separately check whether the path is absolute or not.
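A small sketch of that pattern, with made-up names (resolve and resolve_manually are hypothetical, and the paths in the note below are only examples). It assumes Unix-style paths, where "absolute" just means "starts with /".

```rust
use std::path::{Path, PathBuf};

/// Hypothetical callee-side helper: resolve a caller-supplied path against
/// the callee's own base directory.
fn resolve(base_dir: &Path, supplied: &Path) -> PathBuf {
    // Relative paths land under `base_dir`; absolute ones pass through as-is.
    base_dir.join(supplied)
}

/// The explicit check that `join` makes unnecessary (shown for comparison).
fn resolve_manually(base_dir: &Path, supplied: &Path) -> PathBuf {
    if supplied.is_absolute() {
        supplied.to_path_buf()
    } else {
        base_dir.join(supplied)
    }
}
```

For a base of /opt/app, a supplied logs/app.log resolves to /opt/app/logs/app.log, while a supplied /var/log/app.log stays /var/log/app.log; both functions should agree on these inputs.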

One could also think of an absolute path as a special type of relative path where the starting / is understood to stand for as many ../ components as needed to reach the root.

dimanne · December 22, 2023, 10:43am

Yeah, agree, 100%. There are cases when the current behaviour is useful. It would be nice not to have this behaviour in a function that is called join (if you join A and B, you never expect A to completely go away / vanish)... Or maybe at least have another standard function that does it differently.

khimru · December 22, 2023, 12:47pm

But what exactly should it do when you join C:\WINDOWS and E:\MyHome ?

dimanne · December 22, 2023, 7:32pm

> But what exactly should it do when you join C:\WINDOWS and E:\MyHome ?

~~It should suggest changing OS.~~

To be honest, I do not know. Maybe for Windows there should be some Windows-specific API?.. Hard to say. But having "broken" join() for the rest of the world is not good either.

khimru · December 22, 2023, 9:03pm

Linux is not “the rest of the world”. VxWorks is supported by Rust and uses the same path format as Windows.

And on a POSIX-compliant system one may have //vol/path, which is distinct from /vol/path (although here I'm not sure whether anything but Cygwin uses that… but this still would imply that there would be yet another separate path name style with its own separate path name handling).

That's the typical answer to the “why does Rust (or a crate… or a program…) do some weird thing?” question: it usually looks weird to you because you don't know about the many things that exist in that world.

Sometimes developers decide that some particular kind of weirdness is just too weird and doesn't deserve special treatment (e.g. you may go to Walmart and buy a device with a 24-bit CPU, but that was deemed too weird to be supported by Rust, and we may forget about historical devices with ones'-complement arithmetic or 36-bit words).

But Windows, VxWorks… these are too popular to be ignored.

system · March 21, 2024, 9:04pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.
