Is The Representation of `&str` Guaranteed?

I'm doing some quick-and-dirty work right now. I have it working and I'll find a better way to do it later, but I'm curious whether I can rely on the memory representation of &str.

I've found that &str is represented as a memory address followed by a length. Is there any reason this would ever be different?


For some context, I'm probing the memory of a #![no_std] WASM module that I compiled from Rust, and I want to get the value of a static string I created in the module like so:

#[no_mangle]
static MY_STR: &str = "hello world";

When compiled to WASM, Rust creates a global MY_STR that actually contains a pointer to the &str pointer; in Rust terms, it's a &&str. So to get the actual value out of WASM memory, I follow the &&str to where the &str is in WASM memory. Then I read the first 4 bytes as a pointer to the actual string data and the second 4 bytes as the length.
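For illustration, the probing step described above might look roughly like this on the host side. This is only a sketch: `read_str_at` and `read_u32_le` are hypothetical helper names, `memory` is assumed to be a copy of the module's linear memory, and the pointer-then-length order is the observed layout, not a guaranteed one.

```rust
/// Read a little-endian u32 at `addr`, or None if it is out of bounds.
/// (wasm32 pointers and usize values are 4 bytes, little-endian.)
fn read_u32_le(memory: &[u8], addr: usize) -> Option<u32> {
    let b = memory.get(addr..addr.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Decode the `&str` stored at `addr`, ASSUMING it is laid out as
/// (pointer, length). `addr` is the value of the MY_STR global.
fn read_str_at(memory: &[u8], addr: usize) -> Option<&str> {
    let ptr = read_u32_le(memory, addr)? as usize;
    let len = read_u32_le(memory, addr.checked_add(4)?)? as usize;
    // Bounds-checked slice of the string data, then UTF-8 validation.
    let bytes = memory.get(ptr..ptr.checked_add(len)?)?;
    std::str::from_utf8(bytes).ok()
}
```

If the representation ever changed, this would most likely return None or garbage text rather than cause memory unsafety, since every read is bounds-checked against the copied memory.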

Here is the Unsafe Code Guidelines' page about pointer layout. It says at the top:

Disclaimer: Everything this section says about pointers to dynamically sized types represents the consensus from issue #16, but has not been stabilized through an RFC. As such, this is preliminary information.

So it sounds like the layout of &str is not guaranteed at this point.

OK, so in my use-case I can either:

  • Take the risk of &str repr changing
  • Or create a WASM function that can return the ptr and the len separately so that I don't have to rely on the representation
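The second option could be sketched like this. The function names `my_str_ptr` and `my_str_len` are made up for illustration; the host would call both exports and then read `len` bytes starting at `ptr` from the module's memory.

```rust
static MY_STR: &str = "hello world";

/// Hypothetical export: address of the first byte of the string data.
#[no_mangle]
pub extern "C" fn my_str_ptr() -> *const u8 {
    MY_STR.as_ptr()
}

/// Hypothetical export: length of the string in bytes.
#[no_mangle]
pub extern "C" fn my_str_len() -> usize {
    MY_STR.len()
}
```

This avoids any dependence on how &str itself is laid out, at the cost of two function calls.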

For now I'll take the risk, as it won't cause any safety or memory problems in the context in which I'm using it (I'm just safely looking into the WASM memory); it would just result in an error if the repr changed.

If you’re using the nightly compiler, the to_raw_parts method on raw pointers (behind the unstable ptr_metadata feature) will do this.

Why the need for nightly? slice::from_raw_parts() with str::from_utf8_unchecked() will do.
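A minimal sketch of that stable route, going from a pointer and a length back to a &str. The helper name `str_from_parts` is hypothetical; the two standard-library calls are the ones named above.

```rust
use std::{slice, str};

/// Rebuild a `&str` from a raw pointer and a length.
///
/// # Safety
/// `ptr` must point to `len` initialized bytes of valid UTF-8 that stay
/// alive and unmodified for the whole lifetime `'a`.
unsafe fn str_from_parts<'a>(ptr: *const u8, len: usize) -> &'a str {
    // SAFETY: the caller upholds the requirements documented above.
    unsafe { str::from_utf8_unchecked(slice::from_raw_parts(ptr, len)) }
}
```

Splitting a &str apart is the safe half (as_ptr() and len()); only putting it back together needs unsafe.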

I actually want to compile a WASM module that includes a static that is a &str, then probe the memory of the WASM module to find the &str pointer and parse that pointer. If I had a &[u8] instead of a &str, I would still have to depend on the layout of the &[u8] pointer.

You can use the as_ptr() method for that.

Yeah, the issue is that I'm probing the memory of the WASM module after it's been compiled, and I wanted to avoid having to call an exposed WASM function to get the info I needed, so I was trying to use a static to make it accessible without any function calls. So if I were to use as_ptr(), I would have to be able to do something like this:

static MY_STR: &str = "hello world";
static MY_STR_PTR: usize = MY_STR.as_ptr() as usize;
static MY_STR_LEN: usize = MY_STR.len();

But that doesn't work: the MY_STR_PTR line is rejected because casting a pointer to an integer isn't allowed in a static initializer, since the final address isn't known yet when the initializer is evaluated at compile time.

So in pretty much any other case as_ptr() and friends would work great, but I was trying to avoid function calls, so I ended up wanting to know the layout of &str "at rest", so to speak.

You can handle this by creating a wrapper struct (playground):

#[repr(C)]
pub struct Str {
    ptr: *const u8,
    len: usize,
}
unsafe impl Send for Str {}
unsafe impl Sync for Str {}
impl Str {
    const fn new(s: &'static str) -> Self {
        Self {
            ptr: s.as_ptr(),
            len: s.len(),
        }
    }
}

static MY_STR: Str = Str::new("hello world");
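As a sanity check on why this wrapper is safe to probe, here is a small sketch that measures the field offsets of a #[repr(C)] struct with the same shape. The `field_offsets` helper is hypothetical and the struct is repeated so the snippet stands alone.

```rust
use std::mem::size_of;

#[repr(C)]
pub struct Str {
    ptr: *const u8,
    len: usize,
}

// Compile-time check: two pointer-sized fields and no padding.
const _: () = assert!(size_of::<Str>() == 2 * size_of::<usize>());

/// Hypothetical helper: byte offsets of `ptr` and `len` inside `Str`.
fn field_offsets(s: &Str) -> (usize, usize) {
    let base = s as *const Str as usize;
    let ptr_off = &s.ptr as *const *const u8 as usize - base;
    let len_off = &s.len as *const usize as usize - base;
    (ptr_off, len_off)
}
```

With #[repr(C)] the fields are laid out in declaration order, so on wasm32 the pointer sits at offset 0 and the length at offset 4, which matches the "first 4 bytes, second 4 bytes" reads but is now a documented layout rather than an observed one.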

Ah, now that works! Thanks!

I tried to do something similar, but couldn't get it to work. That's great.
