I guess the idea is that an `OsStr` is a platform-defined superset of valid UTF-8. This is an arbitrary byte string in the case of Unix, and UTF-8 + surrogate codepoints (also known as WTF-8) in the case of Windows.
If you have a `str`, call `str::encode_utf16()` to get the code units. If you have an `OsStr` on Windows, call `OsStrExt::encode_wide()` to get the code units. Collect either of these iterators into your fixed-size working buffer of choice.
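
Here is a rough sketch of what I mean by collecting into a fixed-size buffer. The helper names `fill_wide_buf` and `fill_from_os_str` are made up for illustration, and returning `None` on overflow is just one possible policy (you might prefer to truncate or return an error type). The `OsStr` variant only compiles on Windows, since `encode_wide()` comes from `std::os::windows::ffi::OsStrExt`.

```rust
/// Copies UTF-16 code units from `units` into `buf`.
/// Returns the number of units written, or `None` if the input does not fit.
/// (Hypothetical helper, not part of std.)
fn fill_wide_buf<I: Iterator<Item = u16>>(units: I, buf: &mut [u16]) -> Option<usize> {
    let mut len = 0;
    for unit in units {
        // Bail out instead of silently truncating when the buffer is full.
        *buf.get_mut(len)? = unit;
        len += 1;
    }
    Some(len)
}

/// Windows-only variant for `OsStr`, which may contain unpaired surrogates.
#[cfg(windows)]
fn fill_from_os_str(s: &std::ffi::OsStr, buf: &mut [u16]) -> Option<usize> {
    use std::os::windows::ffi::OsStrExt;
    fill_wide_buf(s.encode_wide(), buf)
}

fn main() {
    let mut buf = [0u16; 16];
    let len = fill_wide_buf("h\u{e9}llo".encode_utf16(), &mut buf)
        .expect("buffer too small");
    println!("{:04x?}", &buf[..len]);
}
```

Note that neither iterator appends a terminating NUL, so if the buffer is headed for a Win32 API you would need to reserve one slot and write the `0` yourself.
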
In fact, RFC 2295 (`os_str_pattern`) aims to extend the API surface of `OsStr`, moving its Windows representation from WTF-8 to OMG-WTF-8 in the process. Incidentally, a recent issue regarding `OsStr` notes the same thing that you do: converting `&str` to `&OsStr` must always work.
Yes, WTF-8 is a strict superset of UTF-8, so all valid UTF-8 byte sequences are also valid WTF-8.
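
To make the superset relationship concrete, here is a small sketch using only the public API. The function name `round_trip` is made up for illustration. Going from `&str` to `&OsStr` cannot fail, while the reverse direction returns an `Option` because an `OsStr` may hold data that is not valid UTF-8. The Unix-only part builds such a value from raw bytes; on Windows you would get the same effect from an unpaired surrogate.

```rust
use std::ffi::OsStr;

/// Hypothetical helper: borrow a `&str` as `&OsStr`, then try to go back.
fn round_trip(s: &str) -> Option<&str> {
    // &str -> &OsStr is infallible and is only a borrow, no copy is made.
    let os: &OsStr = OsStr::new(s);
    // &OsStr -> &str is fallible, since not every OsStr is valid UTF-8.
    os.to_str()
}

fn main() {
    // Any valid UTF-8 string survives the round trip unchanged.
    let text = "na\u{ef}ve \u{1F600}";
    assert_eq!(round_trip(text), Some(text));

    #[cfg(unix)]
    {
        use std::os::unix::ffi::OsStrExt;
        // The byte 0xFF never occurs in valid UTF-8, but a Unix OsStr accepts it.
        let raw = OsStr::from_bytes(&[b'a', 0xFF]);
        assert_eq!(raw.to_str(), None);
    }
}
```
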
You know, you might be right there. `Slice::from_str()` depends on `Slice` and `Wtf8` having the same layout: