How can I convert a u64 into a native byte order &str?

Hi, guys! I'm new to Rust.

How can I convert a u64 into a native byte order &str?
I can get the [u8; 8] from 1000u64.to_ne_bytes(),
and I can get the &[u8] from 1000u64.to_ne_bytes().as_slice(),
but I have no idea how to convert the [u8; 8] / &[u8] into a native byte order &str.

Update: Since the [u8; 8] / &[u8] is already stored in native byte order, I guess I just need to convert it into a corresponding &str as is (same memory layout)?

And I guess I can't use std::str::from_utf8() since the &[u8] may not be a valid UTF-8 string?

Basically, in the C/C++ env, I just need to use:

uint64_t x = 0x0102030405060708;
const char *p = (const char *) &x; // Unsafe, but native byte order in nature

Can anyone help with this?

Can I just use std::str::from_utf8_unchecked(1000u64.to_ne_bytes().as_slice())?

What's a native byte order string? Do you have an example in mind?


I don't know about this, but since the &[u8] is already stored in native byte order, I guess I just need to convert it as is into a corresponding &str?

Do you mean you want "000000000000000f" (or "0f00000000000000") from 15? Or a sequence of bytes with 15 NULs? The former will involve allocating a String and the latter is problematic for non-ASCII byte values (as Rust strings are UTF8).

Maybe a better question is, what's your end goal?

(On mobile or I'd give code examples.)
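The two readings of the question asked above can be sketched roughly as follows. This is only an illustration: `hex_be` and `hex_le` are hypothetical helper names, not standard library functions. The hex-text reading allocates a `String`, while the raw-bytes reading is just the `[u8; 8]` from `to_ne_bytes()`.

```rust
/// Hypothetical helper: hex text of the bytes of `x` in big-endian order.
fn hex_be(x: u64) -> String {
    x.to_be_bytes().iter().map(|b| format!("{b:02x}")).collect()
}

/// Hypothetical helper: hex text of the bytes of `x` in little-endian order.
fn hex_le(x: u64) -> String {
    x.to_le_bytes().iter().map(|b| format!("{b:02x}")).collect()
}

fn main() {
    // Reading 1: a textual (hex) representation, which needs a String.
    println!("{}", hex_be(15)); // 000000000000000f
    println!("{}", hex_le(15)); // 0f00000000000000

    // Reading 2: the raw bytes themselves, in native order. No string involved.
    let raw: [u8; 8] = 15u64.to_ne_bytes();
    println!("{raw:?}");
}
```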


Do you mean you want "000000000000000f" (or "0f00000000000000" ) from 15 ?

It doesn't matter. The string should be 8 bytes long, and for sure, it's not a UTF-8 string and never will be.

For example, on a big-endian architecture it's 000000000000000f, and on little-endian it's 0f00000000000000. But it's always guaranteed to be 8 bytes long.

I managed to convert the &[u8] into a &str (used as if it were a byte array/slice).

pub fn main() {
    let name = {
        let a = (0 as u64).to_ne_bytes();
        //unsafe { std::str::from_utf8_unchecked(a.as_slice()) }
        //unsafe { std::mem::transmute::<&[u8], &str>(a.as_slice()) }
        unsafe { std::mem::transmute::<_, &str>(a.as_slice()) }
    };
    debug_assert_eq!(name.len(), 8);
    println!("OK");
}

I wonder whether the above code is correct, especially the std::mem::transmute part?

This code is completely unsafe. Please don't do transmutes like that.
In particular, in Rust having a &str that points to non UTF-8 data is UB.
I think you are thinking about C where you can pretend every char * is a string. Unfortunately, that won't work in Rust.
A String and a &str must be valid UTF-8.


Also, I am not entirely sure why you want to convert a &[u8] to a &str. They are not equivalent. What is it that you actually want to achieve?
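To make the UTF-8 point concrete, here is a minimal sketch using the safe, checked conversion. The helper name `ne_bytes_are_utf8` is made up for this example. It shows that the bytes of an arbitrary u64 are often not valid UTF-8, which is exactly the case where an unchecked conversion or a transmute would be UB.

```rust
use std::str;

/// Hypothetical helper: reports whether the native-endian bytes of `x`
/// happen to form valid UTF-8.
fn ne_bytes_are_utf8(x: u64) -> bool {
    str::from_utf8(&x.to_ne_bytes()).is_ok()
}

fn main() {
    // The byte 0xFF never appears in valid UTF-8, so this fails in any byte order.
    println!("{}", ne_bytes_are_utf8(u64::MAX));

    // 1000 is 0x03E8; the 0xE8 byte starts a multi-byte sequence that is never
    // completed, so this is invalid in both byte orders as well.
    println!("{}", ne_bytes_are_utf8(1000));

    // Every byte here is below 0x80 (plain ASCII), so this one is valid.
    println!("{}", ne_bytes_are_utf8(0x0102030405060708));
}
```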


Sounds like you want to store a sequence of bytes. For that [u8] (or [u8; 8]) is the most appropriate type. It literally means "a slice of bytes" (or "an array of bytes") and it sounds like that is what you are looking for.
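As a rough sketch of that suggestion, the bytes can be kept as a `[u8; 8]`, borrowed as a `&[u8]` when a slice is needed, and turned back into the number with `u64::from_ne_bytes`. The `encode` and `decode` names are hypothetical wrappers added only for illustration.

```rust
/// Hypothetical wrapper: the 8 bytes of `x` in native byte order.
fn encode(x: u64) -> [u8; 8] {
    x.to_ne_bytes()
}

/// Hypothetical wrapper: rebuilds the u64 from its native-order bytes.
fn decode(bytes: [u8; 8]) -> u64 {
    u64::from_ne_bytes(bytes)
}

fn main() {
    let bytes = encode(1000);

    // Borrow the array as a slice wherever a &[u8] is expected.
    let view: &[u8] = &bytes;
    println!("{} bytes: {:?}", view.len(), view);

    // The round trip gives back the original value.
    println!("{}", decode(bytes));
}
```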


If you have a 100% guarantee that your bytes will always be valid UTF-8, then you can use str::from_utf8_unchecked(). Otherwise, you can use str::from_utf8(), and expect() against invalid UTF-8. To illustrate (Rust Playground):

use std::str;

fn main() {
    let x: u64 = 0x01020304_05060708;
    let bytes = x.to_ne_bytes();
    let p = str::from_utf8(&bytes).expect("invalid UTF-8");
    println!("{p:?}");
}

You can alternatively call str::from_utf8(&x.to_ne_bytes()).expect(...).to_owned() to create a String and avoid the extra variable.
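If panicking on invalid UTF-8 is not acceptable, one possible variation is to return an `Option` instead of calling `expect()`. This is a sketch only, and `ne_bytes_as_string` is a hypothetical name.

```rust
use std::str;

/// Hypothetical helper: Some(String) if the native-endian bytes of `x`
/// are valid UTF-8, otherwise None.
fn ne_bytes_as_string(x: u64) -> Option<String> {
    str::from_utf8(&x.to_ne_bytes()).ok().map(str::to_owned)
}

fn main() {
    // Eight 0x41 bytes are "AAAAAAAA" regardless of byte order.
    println!("{:?}", ne_bytes_as_string(0x4141414141414141));

    // Eight 0xFF bytes are never valid UTF-8, so the caller gets None
    // rather than a panic.
    println!("{:?}", ne_bytes_as_string(u64::MAX));
}
```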

I think the confusion here is that C uses const char * to represent both a string (roughly &str) and an array of bytes (&[u8]). From what you've said, you don't actually want a UTF-8 string[1]; you want the bytes of a u64 in their native byte order (&[u8] or [u8; 8]).

The direct equivalent of your C code would be using 1000_u64.to_ne_bytes() to get the bytes as a fixed-size array and then taking a reference to it.

The implementation of u64::to_ne_bytes() is effectively a std::bit_cast<std::array<uint8_t, sizeof(uint64_t)>>(), which is the same as your code except it passes things around by value.
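A small sketch of what "native byte order" means in practice: `to_ne_bytes()` gives the same result as `to_le_bytes()` on a little-endian target and as `to_be_bytes()` on a big-endian one. The `expected_ne_bytes` helper is hypothetical and exists only to spell out that relationship.

```rust
/// Hypothetical helper: picks the explicit byte order that matches the
/// target the program was compiled for.
fn expected_ne_bytes(x: u64) -> [u8; 8] {
    if cfg!(target_endian = "little") {
        x.to_le_bytes()
    } else {
        x.to_be_bytes()
    }
}

fn main() {
    let x: u64 = 0x0102030405060708;

    // Big-endian order puts the most significant byte first.
    println!("{:?}", x.to_be_bytes()); // [1, 2, 3, 4, 5, 6, 7, 8]

    // Native order matches whichever explicit order the target uses.
    println!("{}", x.to_ne_bytes() == expected_ne_bytes(x)); // true
}
```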


  1. There's no such thing as byte order in UTF-8; having invalid UTF-8 in a &str is UB; using unsafe because "that's what I would do in C" is almost always the wrong thing to do as a newbie; etc.


One of the guarantees of the str type is that it represents a valid UTF-8 sequence. Constructing a value that does not do so violates the assumptions of that type and leads to undefined behaviour. In other words, if you can't guarantee that it's UTF-8 and can't accept a fallible conversion, then you can't have this.

Can you explain more about why it needs to be a &str specifically, and why a &[u8] (which would otherwise be a fairly normal way to represent "a sequence of bytes") would not be adequate for your purposes?


Hi, guys. I believe I misunderstood the use case.
I should stick with [u8] instead of trying to convert it into a &str, since that doesn't make any sense.
Thank you all!
