A `&'static str` version of `std::path::MAIN_SEPARATOR`?


std::path::MAIN_SEPARATOR is a char. I need a &'static str version of it in stable Rust.

There is std::sys::path::MAIN_SEP_STR but it is private.

Please advise, thanks.


Can you be a little bit more specific about why you need it?


Sure, see SEP_STR and dirname in https://github.com/danielpclark/faster_path/pull/132/files


To my untrained eye, your dirname() function seems to assume that “/” is a valid root directory, which AFAIK is only guaranteed to be true on Unices. If that is correct, then I think hardcoding SEP_STR to “/” in your code would not cause a loss of portability.
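If hardcoding is acceptable, one safe way to keep a bit of portability is a conditionally compiled constant. This is only a sketch: the name `SEP_STR` is borrowed from the linked PR, and it assumes every non-Windows target you care about uses "/" as its main separator, which is not checked here.

```rust
// Assumption: only Windows uses a backslash among the targets that matter;
// everything else is treated as Unix-like.
#[cfg(windows)]
const SEP_STR: &str = "\\";

#[cfg(not(windows))]
const SEP_STR: &str = "/";

/// Joins two path fragments with the separator string (hypothetical helper,
/// shown only to illustrate using `SEP_STR` where a `&str` is required).
fn join_with_sep(left: &str, right: &str) -> String {
    [left, right].join(SEP_STR)
}
```

Since the constant is a string literal, it is a &'static str with no unsafe code and no runtime initialization.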


You can do this with some unsafe code:

use std::path;

static SEP: char = path::MAIN_SEPARATOR;

fn sep_as_slice() -> &'static str {
    unsafe {
        std::mem::transmute(std::slice::from_raw_parts(&SEP as *const _, 1))
    }
}

lazy_static is probably another option.
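A minimal sketch of the lazily initialized route, using the standard library's `std::sync::OnceLock` as a stand-in for the lazy_static crate (so this is not lazy_static itself, and it needs a toolchain recent enough to have `OnceLock`). The function name `sep_str` is made up for illustration.

```rust
use std::path::MAIN_SEPARATOR;
use std::sync::OnceLock;

/// Returns the main path separator as a string slice that lives for the
/// rest of the program. The backing `String` is built on first call.
fn sep_str() -> &'static str {
    // The cell is a `static`, so the reference handed out by
    // `get_or_init` has the `'static` lifetime.
    static SEP: OnceLock<String> = OnceLock::new();
    SEP.get_or_init(|| MAIN_SEPARATOR.to_string()).as_str()
}
```

This costs one small heap allocation on first use, but it involves no unsafe code and goes through a proper UTF-8 encoding of the char.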


I filed rust-lang/rust#46712 to make MAIN_SEPARATOR.as_ref() or something like that give you a &'static str. For now I agree with @vitalyd, go with their unsafe code or lazy_static.


Beware that this assumes little-endian byte order!

And in general, this can only work for ASCII characters (< 0x80), whose single-byte UTF-8 encoding has the same value as the code point.


How so?


In little-endian, '/' is 2F 00 00 00, but in big-endian it’s 00 00 00 2F. So directly casting a pointer to the char will get you "\0" on big-endian targets. You could offset the pointer to reach the least significant byte, though.
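The two layouts can be checked without any unsafe code, since a char has the same size and representation as a u32. A small sketch (the helper names are hypothetical):

```rust
/// Bytes of the code point as they would sit in memory on a little-endian target.
fn le_layout(c: char) -> [u8; 4] {
    (c as u32).to_le_bytes()
}

/// Bytes of the code point as they would sit in memory on a big-endian target.
fn be_layout(c: char) -> [u8; 4] {
    (c as u32).to_be_bytes()
}

/// The byte found at the lowest address on the target this is compiled for,
/// i.e. what a `*const char` cast to `*const u8` would point at.
fn first_byte_in_memory(c: char) -> u8 {
    (c as u32).to_ne_bytes()[0]
}
```

For '/', `first_byte_in_memory` gives 0x2F on little-endian and 0x00 on big-endian, which is where the "\0" comes from.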


I get the bit layout part. How many bytes will a read through a &str do? I thought it would be 4, but I guess that’s either wrong or you’re referring to pulling bytes out of the &str individually being wrong?


It will read only the length of the slice, which you told it was 1.


I told it 1 of type T, which is char here. I was going by the assumption (which sounds like it’d be wrong) that if I have the loader place a static SEP in memory (in whatever endianness the machine uses), with it being of type char (4 bytes), then I can form a slice to it and read data through it.


But then the transmute changes the type, without changing the contents. So that fat (ptr, 1) which was (*const char, usize) is now considered (*const u8, usize), a UTF-8 byte slice of length 1.

If it did read 4 bytes in the str, you’d get either "/\0\0\0" or "\0\0\0/" depending on endianness.


Right, ok - I see what you mean.

Presumably adjusting the slice length to account for char->byte views would work irrespective of endianness?


This works on big-endian:

use std::path;

static SEP: char = path::MAIN_SEPARATOR;

fn sep_as_slice() -> &'static str {
    unsafe {
        let bytes = (&SEP as *const _ as *const u8).offset(3);
        std::mem::transmute(std::slice::from_raw_parts(bytes, 1))
    }
}

You could alternate this with #[cfg(target_endian = "...")].
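A sketch of what that alternation might look like, with the byte offset picked at compile time. The constant name `LOW_BYTE_OFFSET` is made up, and the whole thing relies on the separator being ASCII, which is asserted at run time rather than proven.

```rust
use std::path::MAIN_SEPARATOR;

static SEP: char = MAIN_SEPARATOR;

// Offset of the least significant byte inside the 4-byte char.
#[cfg(target_endian = "little")]
const LOW_BYTE_OFFSET: usize = 0;
#[cfg(target_endian = "big")]
const LOW_BYTE_OFFSET: usize = 3;

fn sep_as_slice() -> &'static str {
    // Only an ASCII char is a valid one-byte UTF-8 string on its own.
    assert!(SEP.is_ascii());
    unsafe {
        // Point at the low byte of the static, wherever the target puts it.
        let low = (&SEP as *const char as *const u8).add(LOW_BYTE_OFFSET);
        // One byte, ASCII, borrowed from a static: a valid `&'static str`.
        std::str::from_utf8_unchecked(std::slice::from_raw_parts(low, 1))
    }
}
```

Going through `std::str::from_utf8_unchecked` instead of `transmute` keeps the intent visible: the pointer is already a byte pointer, and only the UTF-8 check is skipped.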


Right, you could do that to get back to a single-byte [u8] for BE. But can the following work for either endianness:

fn sep_as_slice() -> &'static str {
    unsafe {
        std::mem::transmute(std::slice::from_raw_parts(&SEP as *const _ as *const u8, ::std::mem::size_of::<char>()))
    }
}


No, that’s what I said will get either "/\0\0\0" or "\0\0\0/". Those 0 bytes are each valid UTF-8.
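To see this without unsafe code, the four bytes of the char can be copied out in native order and validated as UTF-8. A sketch, with a hypothetical helper name:

```rust
use std::string::FromUtf8Error;

/// Interprets the 4 in-memory bytes of a `char` as a UTF-8 string,
/// mimicking what the 4-byte slice view would expose.
fn four_byte_view(c: char) -> Result<String, FromUtf8Error> {
    // `to_ne_bytes` yields the bytes in this target's own byte order.
    String::from_utf8((c as u32).to_ne_bytes().to_vec())
}
```

For '/' the result is a valid 4-byte string with three NUL characters in it, never the 1-byte "/". For a non-ASCII char such as 'é' the bytes are not valid UTF-8 at all, so the helper returns an error.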


Ok, understood (I didn’t see that part above). For some (now seemingly) silly reason I thought going through a slice would do a “fixup”, but that’s a thinko on my part :)


Maybe a dumb question, but do we have a usability problem if “I need the path to the root directory” devolves into discussions of endianness and unsafety?

Isn’t there a path::filesystem_root or something?


That would need to be context-dependent since some popular operating systems (cough cough Windows) do not have a single user-visible filesystem root, but rather one root per mounted logical drive.

But it does seem to me that so far, this thread demonstrated the need for a filesystem_root() function returning a string, more than the need for a path separator string.
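There is no filesystem_root() in std that I know of, but the root of a given path can be recovered from its components, which sidesteps the "which drive?" question by making the caller supply the context. A rough sketch, where `root_of` is a hypothetical name:

```rust
use std::path::{Component, Path, PathBuf};

/// Returns the root portion of `path` (for example "/" on Unix, or a drive
/// prefix plus root on Windows), or `None` if the path is relative.
fn root_of(path: &Path) -> Option<PathBuf> {
    let mut root = PathBuf::new();
    for comp in path.components() {
        match comp {
            // A Windows prefix such as `C:` and the root directory itself
            // are the only components that belong to the root.
            Component::Prefix(_) | Component::RootDir => root.push(comp.as_os_str()),
            _ => break,
        }
    }
    if root.as_os_str().is_empty() {
        None
    } else {
        Some(root)
    }
}
```

Because the result is a PathBuf, callers that really need a string still have to convert it, but they no longer need to build the root out of a separator by hand.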