std::path::MAIN_SEPARATOR is a char. I need a &'static str version of it in stable Rust.
There is std::sys::path::MAIN_SEP_STR but it is private.
Please advise, thanks.
Can you be a little bit more specific about why you need it?
Sure, see SEP_STR and dirname in https://github.com/danielpclark/faster_path/pull/132/files
To my untrained eye, your dirname() function seems to assume that "/" is a valid root directory, which AFAIK is only guaranteed to be true on Unices. If that is correct, then I think hardcoding SEP_STR to "/" in your code would not cause a loss of portability.
You can do this with some unsafe code:
use std::path;

static SEP: char = path::MAIN_SEPARATOR;

#[inline(always)]
fn sep_as_slice() -> &'static str {
    unsafe {
        std::mem::transmute(std::slice::from_raw_parts(&SEP as *const _, 1))
    }
}
lazy_static is probably another option.
I filed rust-lang/rust#46712 to make MAIN_SEPARATOR.as_ref() or something like that give you a &'static str. For now I agree with @vitalyd, go with their unsafe code or lazy_static.
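In the meantime, a safe way to get the string is to pick a constant at compile time. This is only a minimal sketch: it assumes the target is either Windows (separator `\`) or Unix-like (separator `/`), and SEP_STR here is just a name borrowed from the PR above, not anything std provides.

```rust
use std::path::MAIN_SEPARATOR;

/// Hypothetical constant: the main path separator as a string slice.
/// Assumption: every non-Windows target of interest uses '/'.
#[cfg(windows)]
pub const SEP_STR: &str = "\\";
#[cfg(not(windows))]
pub const SEP_STR: &str = "/";

/// Sanity check that the string holds exactly the char std exposes.
pub fn sep_str_matches_std() -> bool {
    let mut chars = SEP_STR.chars();
    chars.next() == Some(MAIN_SEPARATOR) && chars.next().is_none()
}
```

No unsafe and no endianness involved. If I recall correctly, newer toolchains (1.68 and later) also ship std::path::MAIN_SEPARATOR_STR, which makes this workaround unnecessary there.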
Beware that this assumes little-endian byte order!
And in general, only ASCII characters (< 0x80) can work directly as UTF-8.
How so?
In little-endian, '/' is 2F 00 00 00, but in big-endian it's 00 00 00 2F. So directly casting a pointer to the char will get you "\0" on big-endian targets. You could offset the pointer to get the little byte though.
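The two layouts can be checked in safe code, without any pointer casts. A small sketch (the function names are made up for illustration):

```rust
/// Bytes of `c` as they would sit in memory on a little-endian target.
pub fn le_bytes(c: char) -> [u8; 4] {
    (c as u32).to_le_bytes()
}

/// Bytes of `c` as they would sit in memory on a big-endian target.
pub fn be_bytes(c: char) -> [u8; 4] {
    (c as u32).to_be_bytes()
}

/// The byte that a `*const char` cast to `*const u8` points at
/// on whatever target this is compiled for.
pub fn first_native_byte(c: char) -> u8 {
    (c as u32).to_ne_bytes()[0]
}
```

For '/' the first native byte is 0x2F on little-endian and 0x00 on big-endian, which is why the one-byte slice only happens to work on the former.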
I get the bit layout part. How many bytes will a read through a &str do? I thought it would be 4, but I guess that's either wrong or you're referring to pulling bytes out of the &str individually being wrong?
It will read only the length of the slice, which you told it was 1.
I told it 1 of type T, which is char here. I was going by the assumption (which sounds like it'd be wrong) that if I have the loader place a static SEP in memory (in whatever endianness the machine uses), with it being of type char (4 bytes), then I can form a slice to it and read data through it.
But then the transmute changes the type, without changing the contents. So that fat (ptr, 1) which was (*const char, usize) is now considered (*const u8, usize), a UTF-8 byte slice of length 1.
If it did read 4 bytes in the str, you'd get either "/\0\0\0" or "\0\0\0/" depending on endianness.
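To make that concrete, here is a safe sketch that builds the four-byte view for either byte order and validates it as UTF-8. The function name and the big_endian flag are hypothetical; they only simulate what a four-byte str over the char's memory would contain.

```rust
/// Interpret all four bytes of `c` as UTF-8, simulating the given byte order.
/// Returns None if the bytes are not valid UTF-8 (possible for non-ASCII chars).
pub fn four_byte_view(c: char, big_endian: bool) -> Option<String> {
    let bytes = if big_endian {
        (c as u32).to_be_bytes()
    } else {
        (c as u32).to_le_bytes()
    };
    std::str::from_utf8(&bytes).ok().map(|s| s.to_owned())
}
```

For '/' both orders validate, because NUL is itself a one-byte UTF-8 character, so the result is a four-character string rather than an error.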
Right, ok - I see what you mean.
Presumably adjusting the slice length to account for char->byte views would work irrespective of endianness?
This works on big-endian:
use std::path;

static SEP: char = path::MAIN_SEPARATOR;

#[inline(always)]
fn sep_as_slice() -> &'static str {
    unsafe {
        let bytes = (&SEP as *const _ as *const u8).offset(3);
        std::mem::transmute(std::slice::from_raw_parts(bytes, 1))
    }
}
You could alternate this with #[cfg(target_endian = "...")].
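A sketch of what that alternation might look like, combining the two versions. It assumes MAIN_SEPARATOR is ASCII (true for '/' and '\'); LOW_BYTE_OFFSET is a name invented for this example.

```rust
use std::path;

static SEP: char = path::MAIN_SEPARATOR;

// Offset of the least significant byte inside the 4-byte char.
#[cfg(target_endian = "little")]
const LOW_BYTE_OFFSET: usize = 0;
#[cfg(target_endian = "big")]
const LOW_BYTE_OFFSET: usize = 3;

#[inline(always)]
pub fn sep_as_slice() -> &'static str {
    // Assumption: the separator is ASCII, so its low byte is valid UTF-8 on its own.
    debug_assert!(SEP.is_ascii());
    unsafe {
        // Stay inside the 4 bytes of the static and view exactly one of them.
        let byte = (&SEP as *const char as *const u8).add(LOW_BYTE_OFFSET);
        std::str::from_utf8_unchecked(std::slice::from_raw_parts(byte, 1))
    }
}
```

Building the str with from_utf8_unchecked over a u8 slice avoids transmuting between differently typed fat pointers, though it is still unsafe and still relies on the ASCII assumption.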
Right, you could do that to get back to a single-byte [u8] for BE. But can the following work for either endianness:
#[inline(always)]
fn sep_as_slice() -> &'static str {
    unsafe {
        std::mem::transmute(std::slice::from_raw_parts(
            &SEP as *const _ as *const u8,
            ::std::mem::size_of::<char>(),
        ))
    }
}
No, that's what I said will get either "/\0\0\0" or "\0\0\0/". Those 0 bytes are each valid UTF-8.
Ok, understood (I didn't see that part above). For some (now seemingly) silly reason I thought going through a slice would do a "fixup", but that's a thinko on my part.
Maybe a dumb question, but do we have a usability problem if "I need the path to the root directory" devolves into discussions of endianness and unsafety?
Isn't there a path::filesystem_root or something?
That would need to be context-dependent since some popular operating systems (cough cough Windows) do not have a single user-visible filesystem root, but rather one root per mounted logical drive.
But it does seem to me that so far, this thread demonstrated the need for a filesystem_root() function returning a string, more than the need for a path separator string.
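For what it's worth, a context-dependent version can be sketched on top of std::path today: rather than one global root, ask for the root of a given absolute path. root_of is a hypothetical helper, not something std provides, and the sketch returns a Path instead of a string.

```rust
use std::path::Path;

/// Hypothetical helper: the root that an absolute path hangs off,
/// e.g. "/" on Unix or the drive root on Windows. None for relative paths.
pub fn root_of(path: &Path) -> Option<&Path> {
    if path.is_absolute() {
        // ancestors() walks path, parent, grandparent, ...; the last item
        // of an absolute path is its root.
        path.ancestors().last()
    } else {
        None
    }
}
```

Because the root is derived from the path itself, the same code covers a single root on Unix and one root per drive on Windows.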