Container-embedded strings for short string optimization


Hi Rustaceans,

This weekend I built a small prototype for short string optimization.
The idea is to embed a custom string type into the container (say, a hash table) and let the container decide the threshold for storing the string data inline in the container versus using a raw pointer for indirection. The major use case is optimizing hash tables that use many short strings as keys.
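
To make the inline-versus-pointer decision concrete, here is a rough sketch. It is not the prototype's code: `SketchString`, its const-generic parameter `N`, and the `is_inline` helper are hypothetical names, and it uses a safe enum with a `Box<str>` instead of a raw pointer, so its memory layout is less compact than what a real implementation would aim for.

```rust
/// Hypothetical sketch: stores up to `N` bytes inline and falls back
/// to a heap allocation for anything longer.
#[derive(Debug)]
enum SketchString<const N: usize> {
    /// The string bytes live directly inside the value.
    Inline { len: u8, buf: [u8; N] },
    /// The string bytes live behind a pointer.
    Heap(Box<str>),
}

impl<const N: usize> SketchString<N> {
    fn new(s: &str) -> Self {
        // The length is kept in a u8, so the inline case is also capped at 255 bytes.
        if s.len() <= N && s.len() <= u8::MAX as usize {
            let mut buf = [0u8; N];
            buf[..s.len()].copy_from_slice(s.as_bytes());
            SketchString::Inline { len: s.len() as u8, buf }
        } else {
            SketchString::Heap(s.into())
        }
    }

    fn as_str(&self) -> &str {
        match self {
            SketchString::Inline { len, buf } => {
                // The bytes were copied from a valid &str, so this cannot fail.
                std::str::from_utf8(&buf[..*len as usize])
                    .expect("inline bytes are valid UTF-8")
            }
            SketchString::Heap(s) => &**s,
        }
    }

    fn is_inline(&self) -> bool {
        matches!(self, SketchString::Inline { .. })
    }
}
```

With `N = 16`, `SketchString::<16>::new("hello")` stays inline, while a 40-byte string goes to the heap. A container that embeds such a type gets to pick `N` per key or value type.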

The interface to use the embedded string (and a specialized hash table) looks like this:

// A new string type, the embedded string, living on the stack. Not very useful on its own.
// `u8` can be replaced by `_`.
let estring: EString<[u8; 24]> = EString::new();

// Default EString size. A type alias saves us from spelling out the
// generic type argument <[u8; 16]> every time. (`_` is not allowed in a type alias.)
type EStringDefault = EString<[u8; 16]>;

// If we use an odd embedded size like 0, the type is smart enough to extend the size to
// hold the pointer. In my current implementation, the minimum size is 16
// (size_of::<*mut str>() on a 64-bit target).
// I may improve it to use *mut u8, where the size is 8 instead of 16.
let safe_estring: EString<[_; 0]> = EString::new();

// Using EString with a hash table
let my_table = EHashMap::<EStringDefault, EStringDefault>::new();

// Using EString with a hash table with custom embedded size.
// In this case, we assume most of our values fit in 250 bytes.
let my_table2 = EHashMap::<EStringDefault, EString<[_; 250]>>::new();

All of the above can be implemented in stable Rust.
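
The minimum-size remark in the snippet above (16 bytes for `*mut str`, 8 for `*mut u8`) follows from `*mut str` being a fat pointer that carries a length next to the address. A small sketch of how the size could be rounded up; `effective_capacity` is a hypothetical helper, not part of the prototype:

```rust
use std::mem::size_of;

/// Hypothetical helper: the inline capacity actually used for a requested
/// size, rounded up so the storage can always hold a `*mut str`.
const fn effective_capacity(requested: usize) -> usize {
    // `*mut str` is a fat pointer (address + length): two machine words.
    let min = size_of::<*mut str>();
    if requested < min { min } else { requested }
}
```

On a 64-bit target `effective_capacity(0)` is 16. Switching to a thin `*mut u8` and storing the length elsewhere would bring the minimum down to one word.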

I know there was a previous discussion about SSO here:

Yet embedding the string type into another type wasn’t the focus there.
This is why I am starting a new thread to discuss it.

To release the code, I need to get approval from my employer (Google).
Feel free to leave a comment if this idea interests you.


You can compare notes against smallvec.


Thanks for sharing!