[lib] str_stack: A string allocation library


#1

str_stack is a string allocation library for allocating many temporally correlated write-once strings. This is useful when you need to e.g. read a set of small files into strings but don’t want to allocate a new string per file. Additionally, this library provides great cache locality.

Currently, the performance (on my laptop) is as follows:

  • Allocation: ~2.5x speedup (for 1000 strings) (~42ns per string)
  • Indexing: 0.73x speedup (slower) (~1.7ns per index)
  • Iterate: 0.35x speedup (much slower) (~1ns per iteration)

It should be possible to further speedup allocation (see the open issue) but speeding up indexing/iteration are beyond me.

I’ll put it on crates.io if people find it useful but, for now, I don’t even know if this is the right approach.


#2

You’re the first one I see to use the ducktypiness of the write!() macro, and that explains why I was so confused when I saw it used on a string stack. It’s a cute experiment, only experience will tell if it’s a good idea.

You can use format_args!() to do kind of the same thing without punning the write_fmt method name. Maybe that’s just as cryptic though:

stack.write(format_args!("Hello {}", "World"));

#3

You can do that now:

stack.write_fmt(format_args!("Hello {}", "World"));

(that’s literally all the write macro does)

However, I hate format_args!() with a burning passion. It puts stuff on the stack which has given me endless grief before.


#4

Interesting! Have you considered inlining the lens using the XOR linked list trick? In this case you can just use addition instead of xor.

So your array would look like:

[a.len, a, a.len + b.len, b, b.len + c.len, c, c.len]

e.g.

[5hello12world!!10boo3]

where the numbers between take up a usize of bytes.

Easy to add/remove to the end, and no need for multiple allocations!

Also, double-ended iteration!


#5

But really slow indexing. My current plan is to use two inward growing stacks:

[a, b, c, d, ............, d_end, c_end, b_end, a_end, a_start (0)]

(including a_start makes some of the indexing/iteration logic simpler)