[solved: `&mut[u8]` is useful, `&mut str` isn’t + workarounds] Is it possible to create a mutable string literal?

let x: &mut str = "qwerty"; 

This doesn’t compile.

error[E0308]: mismatched types
 --> src/lib.rs:2:23
  |
2 |     let x: &mut str = "qwerty";  // x is declared as `mut`
  |            --------   ^^^^^^^^ types differ in mutability
  |            |
  |            expected due to this
  |
  = note: expected mutable reference `&mut str`
                     found reference `&'static str`

If I really want a &mut str I can totally allocate a String or a Box and call it a day, but I don’t understand why this limitation exists. Is it just because "someone needs to do the work"?

I would have expected this construction to compile. The compiler would allocate space on the stack (just like for any other local variable) equal to the size in bytes of the string (which is known at compile time), then memcpy the data from the binary at the beginning of the function (so that each stack frame of the current function gets its own mutable slice).

If x were instead a static, there would be no need to duplicate it with memcpy shenanigans as for a local variable, since there is a single instance of a given static in a program. If my memory of my assembly courses is right, this means that x would be allocated in the bss section, and it should just work (I don’t remember if bss is writable); otherwise we just have to put it in any other writable section of our binary.

Yes, you can, but it would be useless and silly, so I’m not even gonna show how to do this. Use &mut &str instead.

P.S. If you want to change a value of a static, you need static mut or #[link_section = ".data"].

String literals are placed into a read-only section of memory.

However, you can circumvent this and force the compiler to place it on the stack using byte string literals.

let mut string_data = *b"my-string";
let string: &mut str = unsafe { std::str::from_utf8_unchecked_mut(&mut string_data) };

A &mut str isn't incredibly useful: there aren't many things you can do with one, since str uses the variable-length UTF-8 encoding.


This isn't necessary and is moderately rude. Either explain why it's not useful, or don't comment on it at all.

This is a completely different type and has completely different uses. The original poster most likely wanted to edit the contents of the string data, not which string pointer they're referring to.
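To illustrate the difference between the two types, here is a minimal sketch. The helper names `repoint` and `shout` are made up for this example: a `&mut &str` only lets you change which string the variable points at, while a `&mut str` lets you edit the bytes themselves.

```rust
/// Re-points the caller's `&str` at a different literal; no string bytes are written.
fn repoint(s: &mut &'static str) {
    *s = "asdfgh";
}

/// Edits the string data in place through a `&mut str`.
fn shout(s: &mut str) {
    s.make_ascii_uppercase();
}

fn main() {
    // `&mut &str`: the literal "qwerty" itself is untouched, only the pointer changes.
    let mut name: &'static str = "qwerty";
    repoint(&mut name);
    assert_eq!(name, "asdfgh");

    // `&mut str`: needs writable storage behind it, here a String.
    let mut owned = String::from("qwerty");
    shout(&mut owned); // `&mut String` coerces to `&mut str`
    assert_eq!(owned, "QWERTY");
}
```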


To give some context: when solving this year's Advent of Code day 3 part 1, I created a str using include_str! (or with indoc! from the indoc crate for the unit test), sorted the slice, and then searched for duplicates using str::partition_dedup. Sorting and finding duplicates can be done with a &mut str (and I don’t see why both of those functions couldn’t be const either).

No, because a string literal is not really useful unless it's 'static, and you can't make it 'static if it's stack-allocated. Statics can't be mutable by default, either, because they are global, so they need some sort of synchronization to be protected from (either multi-threaded or single-threaded) race conditions. So you couldn't directly take a mutable reference to a (static) string literal safely, anyway.


There is no such method in str.

If you want to operate on an array of bytes rather than on a str, this works:

let mut x: [u8; 6] = *b"qwerty";

Right, I forgot that I did use a slice and not a str. slice::partition_dedup

EDIT: I used as_bytes to convert the str into a slice of u8, so a mutable str was still needed.

But if you are operating on raw bytes, you can't simply sort them! The whole point of str-as-a-type is to guarantee valid UTF-8, which &[u8] can't. Not all arbitrary byte arrays are valid UTF-8. Therefore, there is no guarantee that sorting the bytes of a &str will still be valid UTF-8. Hence, you can't sort the bytes of a &mut str. You have to use a mutable slice for that.
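A small sketch of that failure mode, with a hypothetical helper `sorted_bytes` written just for this example: pure ASCII survives sorting, but a single two-byte character is enough to end up with invalid UTF-8.

```rust
/// Returns the bytes of `s` in sorted order.
fn sorted_bytes(s: &str) -> Vec<u8> {
    let mut bytes = s.as_bytes().to_vec();
    bytes.sort_unstable();
    bytes
}

fn main() {
    // ASCII only: every byte is a whole character, so any order is valid UTF-8.
    assert!(std::str::from_utf8(&sorted_bytes("qwerty")).is_ok());

    // 'é' is encoded as the two bytes 0xC3 0xA9; sorting puts the
    // continuation byte 0xA9 before its lead byte 0xC3.
    assert!(std::str::from_utf8(&sorted_bytes("aé")).is_err());
}
```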

I get your point. But if I don’t control how the str was created (because it was behind an include_str! or indoc! macro), then I’m stuck with a non-mutable str even though it could have been possible to get a mutable one. Or is there a way to create one without extra runtime penalty?

Use include_bytes! instead of include_str!.
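As a rough sketch of the sort-then-look-for-duplicates idea on plain bytes: the helper `has_duplicate` is invented for this example, and a byte string literal stands in for the `include_bytes!` call so the snippet doesn't depend on a file. It uses `windows(2)` rather than `partition_dedup`, which is an assumption about what is acceptable here, not what the original solution did.

```rust
/// Sorts the bytes in place and reports whether any value occurs more than once.
fn has_duplicate(bytes: &mut [u8]) -> bool {
    bytes.sort_unstable();
    // After sorting, equal bytes are adjacent.
    bytes.windows(2).any(|pair| pair[0] == pair[1])
}

fn main() {
    // `*b"..."` copies the literal into a local `[u8; N]`; dereferencing the
    // `&'static [u8; N]` returned by `include_bytes!` gives the same kind of array.
    let mut input: [u8; 6] = *b"qwerty";
    assert!(!has_duplicate(&mut input));
    assert_eq!(&input, b"eqrtwy");

    let mut repeated = *b"hello";
    assert!(has_duplicate(&mut repeated));
}
```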


To sum up, I think that having a mutable str indeed isn’t extremely useful, and alternatives exist if what you need in the end is a &mut [u8] (b"str", include_bytes!, …).


For the record, using this does work and optimizes properly.

const STRING_DATA: &str = "qwerty";
let mut my_string: [u8; STRING_DATA.len()] = STRING_DATA.as_bytes().try_into().unwrap();

godbolt


Technically yes; practically it's not feasible. Without extra runtime penalty, the most you can do is put the initial value in the .data segment, but then it would only have the predetermined value the first time you call your function.

The number of bugs this approach has produced in C is so vast that the fact that it becomes an extremely complicated endeavour in Rust can only be considered a very good thing.

It's possible to do that even in Rust, but that's not trivial, on purpose. Even C ended up with the crazy rule that string literals have type char* but you are not allowed to change them (only the compiler won't stop you; you find out when your program crashes, and sometimes it crashes not where you [try to] modify the string literal but in some entirely different place).


I didn’t say "no runtime cost", but "no extra runtime cost". I totally expect a memcpy to get a new mutable str on the stack (unless it’s a static, in which case the copy isn’t needed and it could directly be made mut, with all the usual issues and restrictions (unsafe to use) of mutable statics).

Out of curiosity, here is how you can create a safe &mut str from a string literal. (And stack-allocated, for better or for worse; a Vec<u8> could be used instead of an array too.)

const S: &str = "hello world";
    
let mut storage: [u8; S.len()] = S.as_bytes().try_into().unwrap();

let mut_str: &mut str = std::str::from_utf8_mut(&mut storage).unwrap();

// Let's do something with it
mut_str.make_ascii_uppercase();
dbg!(mut_str);   // mut_str = "HELLO WORLD"

(Both of the unwrap()s cannot fail because their conditions are always met; the first is checking the length and the second is checking UTF-8 validity.)


If you're dealing with text which you want to have byte-level mutation but still treat as conventionally UTF-8, then bstr is a useful crate to keep in mind.

