How to print a PathBuf with non-UTF-8 symbols


#1

Hello. I’m using Rust on Windows 10. I want to print a PathBuf, but as far as I understand, it contains UTF-16 symbols that can’t be displayed. How can I print the full path?


#2

You can use the Path::display method.
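A minimal sketch of what that looks like. The helper name `show` and the path are made up for illustration; `Path::display` is the standard library method, and it substitutes U+FFFD for anything in the path that is not valid Unicode.

```rust
use std::path::Path;

/// Formats a path for human-readable output.
/// Parts of the path that are not valid Unicode come out as U+FFFD.
fn show(path: &Path) -> String {
    path.display().to_string()
}

fn main() {
    // Hypothetical path, used only for illustration.
    let path = Path::new("données/файл.txt");
    println!("{}", show(path));
}
```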


#3

Using display leads to truncated output (the non-UTF-8 symbols are missing) in Windows cmd. Is that a problem with .display or something else?


#4

That’s intentional: display is lossy by design. If your path contains data that isn’t valid Unicode and you want to access it, you’ll have to get at the underlying OsStr/OsString and then use the platform-dependent methods (on Windows, OsStrExt::encode_wide) to get the raw u16 values.
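Roughly like this, as a sketch rather than a definitive recipe. The function name `raw_dump` and the path are hypothetical. The Windows branch uses `std::os::windows::ffi::OsStrExt::encode_wide`; I added a Unix branch (raw bytes via `std::os::unix::ffi::OsStrExt::as_bytes`) only so the example compiles on both platforms.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// Dumps the raw code units of an OS string as hex, with no lossy conversion.
#[cfg(windows)]
fn raw_dump(s: &OsStr) -> String {
    use std::os::windows::ffi::OsStrExt;
    // On Windows, `encode_wide` yields the underlying u16 values,
    // including any unpaired surrogates.
    s.encode_wide()
        .map(|unit| format!("{:04X}", unit))
        .collect::<Vec<_>>()
        .join(" ")
}

/// Same idea on Unix, where an OS string is an arbitrary byte sequence.
#[cfg(unix)]
fn raw_dump(s: &OsStr) -> String {
    use std::os::unix::ffi::OsStrExt;
    s.as_bytes()
        .iter()
        .map(|byte| format!("{:02X}", byte))
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    // Hypothetical path, used only for illustration.
    let path = Path::new("ab");
    println!("{}", raw_dump(path.as_os_str()));
}
```

A hex dump like this is mostly useful for debugging: it shows exactly what is stored in the path, regardless of what the terminal can render.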


#5

The UTF-8/UTF-16 distinction is not the problem. Every valid UTF-16 path can be losslessly converted to UTF-8 and correctly displayed as UTF-8.

Instead of .display(), you can also try .to_str() and see if it succeeds.

The only case where it can’t be displayed is when the path contains “unpaired surrogates”, which are invalid in UTF-16 and don’t map to any printable character. NTFS theoretically allows this, but it is a very rare case that you may never run into.
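To check whether a particular path hits that case, something along these lines should work. This is a sketch: the helper `describe` and its messages are made up for illustration. `Path::to_str` returns `None` when the path is not valid Unicode, and `Path::to_string_lossy` always succeeds but replaces the invalid parts with U+FFFD.

```rust
use std::path::Path;

/// Returns the path as text and says whether the conversion was exact.
fn describe(path: &Path) -> String {
    match path.to_str() {
        // The whole path is valid Unicode, so nothing is lost.
        Some(text) => format!("valid Unicode: {}", text),
        // Invalid data (e.g. an unpaired surrogate on Windows):
        // fall back to the lossy form.
        None => format!("not valid Unicode, lossy form: {}", path.to_string_lossy()),
    }
}

fn main() {
    // Hypothetical path, used only for illustration.
    println!("{}", describe(Path::new("файл.txt")));
}
```

If this reports valid Unicode and the output still looks wrong, the path itself is fine and the problem is on the terminal side.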

In practice it’s 99% likely that your terminal (cmd.exe?) uses a legacy code page (e.g. some MS-DOS character encoding) and can’t display any Unicode encoding.

Try configuring your terminal window to support UTF-8, or use a different terminal.


#6

The issue with the Windows console (conhost.exe) is that it has very poor Unicode font support and so cannot actually render most of Unicode, no matter what encoding it is configured to use. It still represents the text correctly internally, of course, but you won’t be able to see the characters that the font does not support.