Statically sized strings


Isn’t there a more compact way of creating fixed-size (potentially stack-allocated) strings than this:

use std::str;
let x: &[u8] = &[b'a', b'b'];
let stack_str: &str = str::from_utf8(x).unwrap();

Isn’t a macro motivated here? That macro should check that the characters fed as literals are all valid Unicode letters and statically infer the string’s size from the UTF-8 encoding of those letters.

A crate for this, anyone?

Why do you want it on the stack? Do you need it to be mutable, or have dynamic contents?


I want it to be immutable and fixed-sized.

Interestingly, it could provide non-iterator-invalidating append (with capacity).


OK, asked another way, why don’t you want a normal &'static str literal?


I need it for efficient allocation and compact storage of a large number (around a million) of small strings (words). Kind of like only the small part of a small-length-optimized vector/string :)

Google Chrome (and probably Firefox as well) makes heavy use of such strings/vectors because it both saves memory and improves performance.

At the cost of code complexity, of course.


Does this look like it has what you’re looking for? It’s a string API backed by a statically-sized array. It’s like a small-string-optimized string, but without the ability to switch to heap storage.


You probably shouldn’t put millions of anything on the stack though.

If you write millions of literal "foo"/"bar"/"baz" strings, they should all be “allocated” compactly in the binary’s .rodata section (or platform equivalent), so I guess I still don’t understand what you’re after.


I’m gonna store them in a Vec of such fixed-length strings. The Vec must, of course, be heap-allocated.

So the goal here is to have only one level of heap-allocation.


Yep, that’s what I had in mind. Thanks!


The call to unwrap here

let mut string = ArrayString::<[_; 3]>::from("foo").unwrap();

shouldn’t be needed, assuming Rust can figure out the byte count of UTF-8 string literals.


Storing a Vec<ArrayString<[u8; SIZE]>> will waste the extra space if you have a lot of strings that don’t need the entire size. If you’re making a read-only collection of strings, you can get more space efficiency with an arena allocator. That way, instead of laying out your strings in an ArrayString<[u8; 8]> like

first...second..third...fourth.. (. = unused space), you can store them as

firstsecondthirdfourth, with a Vec<&str> storing the offset and length of each string.
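To make the packed layout concrete, here is a minimal sketch, not a real arena crate. The names `WordArena`, `push` and `get` are made up for illustration. It stores (offset, length) pairs instead of actual `&str` values, because a `&str` borrowed from the buffer would prevent appending to it later:

```rust
use std::convert::TryFrom;

/// Hypothetical word store: all bytes live back to back in one `String`,
/// and each word is described by a (byte offset, byte length) pair.
pub struct WordArena {
    buf: String,
    spans: Vec<(u32, u32)>,
}

impl WordArena {
    pub fn new() -> Self {
        WordArena { buf: String::new(), spans: Vec::new() }
    }

    /// Appends a word and returns its index.
    /// Panics if the buffer would grow past what a `u32` offset can address.
    pub fn push(&mut self, word: &str) -> usize {
        let start = u32::try_from(self.buf.len()).expect("arena larger than 4 GiB");
        let len = u32::try_from(word.len()).expect("word longer than 4 GiB");
        self.buf.push_str(word);
        self.spans.push((start, len));
        self.spans.len() - 1
    }

    /// Returns the word stored at `index`, if any.
    pub fn get(&self, index: usize) -> Option<&str> {
        let &(start, len) = self.spans.get(index)?;
        let start = start as usize;
        self.buf.get(start..start + len as usize)
    }

    /// Number of words stored.
    pub fn len(&self) -> usize {
        self.spans.len()
    }
}
```

There are only two heap allocations in total (the byte buffer and the span table), regardless of how many words are stored.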


Rust doesn’t have great support for compile-time calculations at the moment. There’s a decent chance the optimizer will do constant propagation and see that the error branch of the unwrap is dead code, though.


That is interesting. Can I use a specific allocator “locally” for this type?


I’m aware of that, but most words in human languages fit in the 3 machine words needed by Vec. On a 64-bit system that means 24 bytes. That can hold 23 English letters plus one byte for length. Most English words are shorter than that.
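A rough sketch of that arithmetic, assuming a hand-rolled type (`InlineWord` and `INLINE_WORD_SIZE` are hypothetical names, not from any crate): one length byte plus 23 payload bytes gives a 24-byte value with no heap allocation of its own.

```rust
use std::mem::size_of;

/// Hypothetical inline word: up to 23 bytes of UTF-8 plus a one-byte length.
#[derive(Clone, Copy)]
pub struct InlineWord {
    len: u8,
    bytes: [u8; 23],
}

/// Size of one inline word in bytes (1 length byte + 23 payload bytes).
pub const INLINE_WORD_SIZE: usize = size_of::<InlineWord>();

impl InlineWord {
    /// Returns `None` if the word needs more than 23 bytes of UTF-8.
    pub fn new(s: &str) -> Option<Self> {
        if s.len() > 23 {
            return None;
        }
        let mut bytes = [0u8; 23];
        bytes[..s.len()].copy_from_slice(s.as_bytes());
        Some(InlineWord { len: s.len() as u8, bytes })
    }

    pub fn as_str(&self) -> &str {
        // The bytes were copied from a valid `&str`, so this cannot fail.
        std::str::from_utf8(&self.bytes[..self.len as usize]).unwrap()
    }
}
```

Note that the limit is 23 bytes, not 23 characters: non-ASCII letters take two or more bytes each in UTF-8.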


If there are some strings that don’t fit in the fixed-size storage, then ArrayString alone won’t work. You’ll need either a reimplementation of C++-style small-string-optimized strings, an enum storing either a String or an ArrayString (which wastes space for the discriminant), or two separate arrays, one Vec<String> for the long ones and one Vec<ArrayString<[u8; 24]>> for the rest.
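For the enum option, a minimal sketch could look like the following. The `Word` type is hypothetical, and a plain length-prefixed array stands in for ArrayString so the example needs no external crate. Because every bit pattern of the 24 inline bytes is a valid value, the compiler has to store the discriminant outside them, so the enum ends up larger than 24 bytes:

```rust
/// Hypothetical "inline or heap" word. The inline variant is a stand-in
/// for ArrayString: one length byte plus 23 bytes of UTF-8.
pub enum Word {
    Inline { len: u8, bytes: [u8; 23] },
    Heap(String),
}

impl Word {
    /// Stores short words inline and falls back to a heap `String` otherwise.
    pub fn new(s: &str) -> Self {
        if s.len() <= 23 {
            let mut bytes = [0u8; 23];
            bytes[..s.len()].copy_from_slice(s.as_bytes());
            Word::Inline { len: s.len() as u8, bytes }
        } else {
            Word::Heap(s.to_owned())
        }
    }

    pub fn as_str(&self) -> &str {
        match self {
            // The bytes were copied from a valid `&str`, so this cannot fail.
            Word::Inline { len, bytes } => {
                std::str::from_utf8(&bytes[..*len as usize]).unwrap()
            }
            Word::Heap(s) => s.as_str(),
        }
    }

    pub fn is_inline(&self) -> bool {
        matches!(self, Word::Inline { .. })
    }
}
```

Comparing `std::mem::size_of::<Word>()` with 24 on your target shows how much the discriminant actually costs per element.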

You could also consider storing Vec<Id>, where Id is a type that stores the offset and length of a given slice in your backing storage, but uses less than 64 bits for each.


Hi nordlow

I had a similar issue some time ago, but for byte strings. The solution was to write a macro that evaluates the byte size of the string at compile time.

I used it for