Shared immutable strings

I have a case where tens of thousands of items link to the same moderately long string. The string is created dynamically, so there are lifetime issues if I try to pass it around as &str. So I need a reference-counted, immutable, shared string type.

crates.io has some options:

  • arccstr - null-terminated C-type strings. Don't want that.
  • quetta - right functionality, 57 downloads.
  • immutable_string - best name, requires a global pool, 157 downloads.
  • shared_string - the right idea, 632 downloads.

The download counts are tiny. Am I missing some more popular crate? I'd expect a widely used immutable string type.

What about Arc<str>?
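
To make the suggestion concrete, here is a minimal sketch of sharing one dynamically built string across many items with Arc<str>. The Item type and the make_items helper are hypothetical names for illustration, not something from the original question.

```rust
use std::sync::Arc;

// Hypothetical item type: each item holds a cheap handle to the shared text.
struct Item {
    id: usize,
    label: Arc<str>,
}

// Build `count` items that all point at a single heap allocation.
fn make_items(label: &str, count: usize) -> Vec<Item> {
    // One allocation holding the reference counts and the string bytes.
    let shared: Arc<str> = Arc::from(label);
    (0..count)
        .map(|id| Item {
            id,
            // Cloning only bumps the reference count; the bytes are not copied.
            label: Arc::clone(&shared),
        })
        .collect()
}

fn main() {
    // The string is created at runtime, as in the question.
    let text = format!("generated at runtime: {}", 42);
    let items = make_items(&text, 10_000);

    // Arc<str> dereferences to str, so the usual &str methods are available.
    assert!(items[0].label.starts_with("generated"));
    // Every item points at the same allocation.
    assert!(Arc::ptr_eq(&items[0].label, &items[9_999].label));

    println!(
        "last id = {}, handles to the string = {}",
        items[9_999].id,
        Arc::strong_count(&items[0].label)
    );
}
```

The string is freed when the last item holding a handle is dropped, so no lifetime parameters are needed on Item. If the items never cross threads, Rc<str> would work the same way without atomic counting.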


If the String is created once in your application's lifetime, i.e. you create it at init and don't care what happens to it after that, you can use Box::leak to get a &'static str. You will "just" have no way to give the memory taken by that String back to the allocator until the application exits:

Box::leak(format!("{value}").into_boxed_str())
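
A slightly fuller sketch of the same idea, assuming the string really is built once and kept for the whole run. The leak_label function, the "config-" prefix, and the Item type are hypothetical names used only for illustration.

```rust
// Hypothetical item type: a &'static str is Copy, so sharing it costs nothing.
struct Item {
    label: &'static str,
}

// Build a string at runtime and leak it to obtain a &'static str.
// The allocation is intentionally never freed.
fn leak_label(value: u32) -> &'static str {
    Box::leak(format!("config-{value}").into_boxed_str())
}

fn main() {
    // Leak exactly once, then copy the reference as often as needed.
    let label = leak_label(7);
    let items: Vec<Item> = (0..10_000).map(|_| Item { label }).collect();

    // All items refer to the same bytes.
    assert!(std::ptr::eq(items[0].label, items[9_999].label));
    println!("{} items share {:?}", items.len(), items[0].label);
}
```

Each call to leak_label leaks a new allocation, so this only makes sense when the number of such strings is small and bounded.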

If Arc<str> isn't what you want, you can also check out the "small string" crates, many of which also offer O(1) clone via sharing, as well as string interning crates, which solve the underlying problem (string deduplication) more generally by pooling many such strings.
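
To illustrate what interning means here, this is a rough sketch of a tiny interner built only on the standard library. The Interner type is hypothetical and far simpler than what the dedicated crates provide; it only shows the deduplication idea.

```rust
use std::collections::HashSet;
use std::sync::Arc;

// Hypothetical minimal interner: a pool of unique shared strings.
#[derive(Default)]
struct Interner {
    pool: HashSet<Arc<str>>,
}

impl Interner {
    // Return a shared handle for `s`, allocating only the first time it is seen.
    fn intern(&mut self, s: &str) -> Arc<str> {
        // Arc<str> implements Borrow<str>, so the set can be queried with a &str.
        if let Some(existing) = self.pool.get(s) {
            return Arc::clone(existing);
        }
        let fresh: Arc<str> = Arc::from(s);
        self.pool.insert(Arc::clone(&fresh));
        fresh
    }
}

fn main() {
    let mut interner = Interner::default();
    let a = interner.intern("moderately long string");
    // An equal string built separately maps to the same allocation.
    let b = interner.intern(&String::from("moderately long string"));
    assert!(Arc::ptr_eq(&a, &b));
    println!("distinct strings in pool: {}", interner.pool.len());
}
```

Note that this pool keeps every string alive for as long as the Interner exists, even after all outside handles are dropped.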

If you want substrings to work, then you can use the bytestring crate, which uses a Bytes for reference counting the string data.


What looks nice about shared_string is its as_str(), whereas with bytestring you seem to lose UTF-8 validity along the way.

bytestring::ByteString implements Deref<Target = str>, AsRef<str>, and Borrow<str>, to allow users to access a &ByteString as a &str.
