Why doesn't `Path` implement `std::fmt::Display`?

Why do we have to do this

fn main() {
    let path = std::path::Path::new("/tmp/foo.rs");
    println!("{}", path.display());
}

instead of this?

fn main() {
    let path = std::path::Path::new("/tmp/foo.rs");
    println!("{path}");
}

Is this intentional?

Error message:

error[E0277]: `Path` doesn't implement `std::fmt::Display`
 --> src/main.rs:3:15
  |
3 |     println!("{path}");
  |               ^^^^^^ `Path` cannot be formatted with the default formatter; call `.display()` on it
  |
  = help: the trait `std::fmt::Display` is not implemented for `Path`
  = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
  = note: call `.display()` or `.to_string_lossy()` to safely print paths, as they may contain non-Unicode data
  = note: this error originates in the macro `$crate::format_args_nl` which comes from the expansion of the macro `println` (in Nightly builds, run with -Z macro-backtrace for more info)

Yes, that is intentional. That's why there is a display method.

The docs explain it:

Returns an object that implements Display for safely printing paths that may contain non-Unicode data. This may perform lossy conversion, depending on the platform.

Paths don't need to be valid UTF-8, so some handling of that is required to print them. That's what the display function does for you.

Sure, but why not perform that conversion implicitly in the Display trait?

To make it obvious in the API that it's lossy in some cases, which can be easy to overlook during development and unit testing.

Fair enough. Personally, that kind of forces me into writing {path:?} everywhere. Is it good or bad in your opinion?

Necessary. (search for “utf 8 vulnerability”)

Then when would you want to use .display()?

You should not be using {path:?} for any user-facing output, only for debugging. {:?} has no guarantees that its output is pretty or useful at all. Use display() instead.

println!("The path is ‘{}’.", path.display());
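
To make the difference concrete, here is a small sketch comparing the two forms. The helper names `debug_form` and `display_form` are made up for this illustration; the quoted output shown in the comments is what Unix-style paths produce.

```rust
use std::path::Path;

/// Debug formatting: wraps the path in quotes and escapes special characters.
/// (`debug_form` is a hypothetical helper, not a std API.)
fn debug_form(path: &Path) -> String {
    format!("{path:?}")
}

/// User-facing formatting via `Path::display()`.
/// (`display_form` is a hypothetical helper, not a std API.)
fn display_form(path: &Path) -> String {
    format!("{}", path.display())
}

fn main() {
    let path = Path::new("/tmp/foo.rs");
    println!("{}", debug_form(path)); // "/tmp/foo.rs" (including the quotes)
    println!("{}", display_form(path)); // /tmp/foo.rs
}
```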

You gave us a use-case in the first post: to safely print a path.

To be fair, from the standpoint of the std design I still don't understand why std::fmt::Display isn't implemented for Path. If that's the safe and preferred way of printing paths to the user, I would much rather write code like this

fn main() {
    let path = std::path::Path::new("/tmp/foo.rs");
    let author = String::from("Alice");
    println!("A {path} created by {author}");
}

instead of this

fn main() {
    let path = std::path::Path::new("/tmp/foo.rs");
    let author = String::from("Alice");
    println!("A {} created by {author}", path.display());
}

Why is this anything but an inconvenience? Genuine question.

Because 99% of the time it's simply the wrong idea to print a Path. Most apps don't need to do that at all.

They only receive normal UTF-8 string filenames and deal with them, and flat out reject non-UTF-8 pathnames without trying to do anything with them.

But, of course, rare programs that do want to process arbitrary paths also need to print them, from time to time.

And for these apps display is provided.

P.S. I think the confusion comes from the assumption that a file path is something “simple and safe, easy to use”. In an ideal world where that's actually true, having Display implemented directly on paths would be logical. But in our world that very definitely is not true. Rust couldn't simply say “hey, guys, you should stop using these wrong and stupid non-UTF-8 filenames” (Python 3 tried to do that and then had to backtrack); it has to support both “nice” filenames and “awful” filenames. But for “nice” filenames you already have String, &str and all the related machinery…
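
The "reject non-UTF-8 paths up front" approach described above could look roughly like this. It is only a sketch: `require_utf8` is a hypothetical helper name, built on the real `Path::to_str()` method, which returns `None` when the path is not valid Unicode.

```rust
use std::path::Path;

/// Hypothetical helper: accept a path only if it is valid UTF-8,
/// otherwise return an error message for the user.
fn require_utf8(path: &Path) -> Result<&str, String> {
    path.to_str()
        // The error message itself has to fall back to the lossy display() form.
        .ok_or_else(|| format!("path is not valid UTF-8: {}", path.display()))
}

fn main() {
    match require_utf8(Path::new("/tmp/foo.rs")) {
        // From here on the program works with a plain &str.
        Ok(name) => println!("processing {name}"),
        Err(msg) => eprintln!("error: {msg}"),
    }
}
```

Once the check has passed, everything downstream can use `&str` and ordinary `{}` formatting.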

Paths may be non-UTF-8 and generally imperfect. And we have a special and safe way of handling this — the .display() method. I absolutely get it!

But why can't std implement std::fmt::Display?

impl Display for Path {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{}", self.display())
    }
}

What's wrong with having a convenient way to print paths to the user if .display() is supposed to be safe and you are supposed to use it?

Path::display is lossy. If the path contains invalid UTF-8---which is absolutely legal---then Path::display will use the Unicode replacement codepoint:

use std::{os::unix::ffi::OsStrExt, ffi::OsStr, path::Path};

fn main() {
    let osstr = OsStr::from_bytes(b"foo\xFFbar");
    let path = Path::new(osstr);
    assert_eq!(path.display().to_string(), "foo\u{FFFD}bar");
    assert_eq!(path.display().to_string().as_bytes(), b"foo\xEF\xBF\xBDbar");
}

This is explicitly documented right in the Path::display docs:

Returns an object that implements Display for safely printing paths that may contain non-Unicode data. This may perform lossy conversion, depending on the platform.

The "safe" part here means that you are guaranteed that the output is valid UTF-8. This is a requirement for implementations of the std::fmt::Display trait. That is what makes it "safe," although I think this is probably misleading terminology (particularly given that "safe" tends to have a more precise technical definition in the context of Rust programs). It is being used colloquially here to mean, "it is okay to use this output in a context that requires valid UTF-8 even though a Path itself may not be valid UTF-8."

But the output is still lossy. So if you print a path this way that contains invalid UTF-8 and that is then used somewhere else as an input file path to another program, then you'll get a different file path than what you started from.
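
A quick way to see the loss, as a Unix-only sketch in the same style as the example above (`shown` is a hypothetical helper): two different file names end up with exactly the same displayed form, so the original cannot be recovered from the output.

```rust
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt; // Unix-only extension trait
use std::path::Path;

/// Hypothetical helper: renders a path built from raw bytes via `display()`.
fn shown(bytes: &[u8]) -> String {
    Path::new(OsStr::from_bytes(bytes)).display().to_string()
}

fn main() {
    let a: &[u8] = b"foo\xFFbar";
    let b: &[u8] = b"foo\xFEbar";
    // Two distinct file names on disk...
    assert_ne!(a, b);
    // ...that are indistinguishable once displayed, because each invalid
    // byte is replaced by U+FFFD.
    assert_eq!(shown(a), shown(b));
}
```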

So how do you print file paths "safely" in a non-lossy way? You kinda can't. On Unix at least, you can just write arbitrary bytes to file descriptors. So in order to do that, you need to write platform specific code. bstr provides some routines for doing that in a platform independent way with some costs that are incurred on Windows.
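
For the Unix case, a minimal sketch of the platform-specific route might look like the following. `write_path_raw` is a hypothetical helper; it relies on `OsStrExt::as_bytes`, which only exists on Unix, so this does not compile on Windows.

```rust
use std::io::{self, Write};
use std::os::unix::ffi::OsStrExt; // Unix-only extension trait
use std::path::Path;

/// Hypothetical helper: writes the path's raw bytes plus a newline,
/// with no UTF-8 conversion and therefore no loss.
fn write_path_raw<W: Write>(mut out: W, path: &Path) -> io::Result<()> {
    out.write_all(path.as_os_str().as_bytes())?;
    out.write_all(b"\n")
}

fn main() -> io::Result<()> {
    let path = Path::new("/tmp/foo.rs");
    write_path_raw(io::stdout().lock(), path)
}
```

Whatever reads that output has to be prepared for arbitrary bytes, of course.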

The point of the .display() method is to act as a speed bump: it's meant to get you to pause and question if what you're doing is actually correct. Moreover, because of the design of the Display and ToString traits, this speed bump means that you can't do path.to_string() (like you can for anything that implements the Display). If path.to_string() were possible, conventions would imply that to be a non-lossy conversion. But it can't be and that would be exceptionally misleading.

I will also note that I don't think there is unanimous agreement that this speed bump in theory is something we ought to have. But I do think there is likely unanimous agreement that we definitely want to avoid path.to_string() being possible. So even if you don't like the idea of the speed bump giving you pause, the fact that path.to_string() would be available if Path implemented std::fmt::Display means that speed bump is likely never going away.
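
For completeness, a sketch of the explicit conversions that do exist on Path, each of which spells out its failure mode in its name or return type. `explicit_conversions` is a hypothetical helper; `to_str` and `to_string_lossy` are the real std methods.

```rust
use std::borrow::Cow;
use std::path::Path;

/// Hypothetical helper returning both explicit conversions:
/// - `to_str()` is strict and yields `None` for non-Unicode paths;
/// - `to_string_lossy()` always succeeds but may substitute U+FFFD.
fn explicit_conversions(path: &Path) -> (Option<&str>, Cow<'_, str>) {
    (path.to_str(), path.to_string_lossy())
}

fn main() {
    let path = Path::new("/tmp/foo.rs");
    // let s = path.to_string(); // does not compile: `Path` is not `Display`
    let (strict, lossy) = explicit_conversions(path);
    println!("strict: {strict:?}, lossy: {lossy}");
}
```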

And AFAIK there's also camino (on Lib.rs), a de facto standard wrapper for getting both "path semantics" and "UTF-8 guarantees".

Beautiful answer. Thank you! 🦀

Looks very interesting, I should give it a try!

This makes me curious: if I have such a file path that is not UTF-8, then add it as an argument on the command line, and then retrieve it with std::env::args(), which gives me a String for each argument, then what will that String corresponding to that file path contain?

Nothing — args() will panic. You need to use args_os() instead to handle that case.

https://doc.rust-lang.org/std/env/fn.args.html

Panics

The returned iterator will panic during iteration if any argument to the process is not valid Unicode. If this is not desired, use the args_os function instead.
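
A sketch of what using args_os looks like in practice. `paths_from_args` is a hypothetical helper; `env::args_os()` yields `OsString` values, which convert to `PathBuf` without any Unicode check.

```rust
use std::env;
use std::ffi::OsString;
use std::path::PathBuf;

/// Hypothetical helper: turns raw arguments into paths without
/// requiring them to be valid Unicode.
fn paths_from_args<I: IntoIterator<Item = OsString>>(args: I) -> Vec<PathBuf> {
    args.into_iter().map(PathBuf::from).collect()
}

fn main() {
    // Skip the first argument, which is the program name.
    for path in paths_from_args(env::args_os().skip(1)) {
        match path.to_str() {
            Some(s) => println!("UTF-8 path: {s}"),
            None => println!("non-UTF-8 path: {}", path.display()),
        }
    }
}
```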

Oh, good to know. On which platforms may this happen?