The smartstring crate provides a "small string" optimization, where strings up to 23 bytes (on 64 bit platforms) can be stored inline. The beef crate provides space-optimized Cow
representations that store the Cow
in just three usize
(or even two) rather than the four that std Cow
requires (due to being fully generic).
So why not combine them? Sure, it might be an unholy amalgamation of things with no real use case, but it'll be an interesting learning experience. I'm playing around with creating SmartBeef
, a type that is three u64
big, and can be either
- a heap allocated growable string (
String
); - a borrowed string (
&str
); - an inline string, up to 23 bytes (
str([u8; 23])
); - or a reference counted shared heap string, because why not (
Arc<str>
).
(It might be useful to use a bit to indicate if the borrowed string is static, so SmartBeef<'a>::to_owned() -> SmartBeef<'static>
can avoid copying a static string if it's already 'static
. I don't have that allocated for yet.)
Also: the smartstring crate relies on assumptions about std's String
layout. I don't want to do that, but by still laying out my type to be compatible with std String
, I hope to make it possible for LLVM to optimize away copies, though I don't want to explicitly require this to be true.
Here's an "exposition only" pseudo union
showing my current layout idea:
// Exposition only
#[repr(C, packed, align(usize))]
#[cfg(target_pointer_width = "64", target_endian = "little")]
union SmartBeef {
// identifier: nonzero in bytes 16..=19; zero in bytes 20..=24
GrowableHeap {
pointer: ptr::NonNull<u8>, // bytes 00..=07
length: u32, // bytes 08..=11
_pad: 0u32, // bytes 12..=15
capacity: NonZeroU32, // bytes 16..=19
_pad: 0u32, // bytes 20..=23
},
// identifier: zero in bytes 16..=19; zero in bytes 20..=23
Borrowed {
pointer: ptr::NonNull<u8>, // bytes 00..=07
length: u32, // bytes 08..=11
_pad: 0u32, // bytes 12..=15
_pad: 0u32, // bytes 16..=19
_pad: 0u32, // bytes 20..=23
},
// identifier: 0x01..0x18 in byte 23
Inline {
bytes: [u8; 23], // bytes 00.=22
length: u8 in 0x01 .. 0x18, // byte 23
},
// identifier: 0xFF in bytes 16..=23
SharedHeap {
pointer: ptr::NonNull<u8>, // bytes 00..=07
length: u32, // bytes 08..=11
_pad: 0u32, // bytes 12..=15
_pad: 0xFFFFFFFFFFFFFFFFu64 // bytes 16..=23
},
}
A couple notes:
- The empty string can be encoded as
("", 0u32, 0u32, 0u32, 0u32)
or([0; 23], 0u8)
. I don't see a specific benefit to disallowing a zero-length borrowed string at this moment, and do see a benefit to using it over zero-length inline string.Deref
being justif byte[23] == 0 { str::from_raw_parts(bytes[0..8], bytes[8..16]) } else { &bytes[0..23] }
is very convenient. All zeros being a zero length inline string is possible, but meansDeref
has to check for it, and mutation still can't just set the length to0
(e.g. truncating an inline string), it has to zero out the pointer as well. - I don't know what I want to do on big endian platforms. The obvious and simple solution is to abandon
String
layout compatibility, and just use the above layout and the same code path thatfrom_raw_parts
s theString
(as I don't plan on relying on the std layout). Perhaps the copy elision I want to see is impractical to see from just automatic optimization, and I should drop the pretense of matching std layout and just pack the obvious way that works near identically on all platforms:(ptr: NonNull<u8>, len: u32, cap: u32, _pad: [u8; 7], inline_tag)
.
What do you think?