Avoid allocations for many, mostly similar structures


#1

Hi everyone,

I’ve got a lot of small, mostly overlapping instances of structures that exist transiently in my system. Say they’re like so:

struct Telemetry {
    name: String,
    value: quantiles::CKMS<f64>,
    meta: Vec<(usize, String)>
}

The instances of these structures mostly overlap. That is, if you could stop the world and inspect every one of them you’d see several thousand all with the same name, several hundred with the same value and almost all with the same meta. There are modifications done to Telemetry instances in-flight but much of the access is read-only.

The problem I’ve got is that right now I’m allocating one new Telemetry every time I need to send one across an ownership boundary while retaining ownership in my current scope. I could wrap these things in an Arc, except that I need to modify some of them sometimes. When I do modify a Telemetry I need the original owner not to see the modification.

I believe what I need is some kind of car-crash of Arc and Cow, where I can copy a Telemetry around without allocating a new one except when I perform a write. Accomplishing this has been quite a puzzler for me and I’d very much appreciate help or suggestions.


#2

I think all you need is Arc<Telemetry>. Use Arc::clone to bump the refcount and copy a pointer around, without copying the Telemetry that it points to. Then use Arc::make_mut when you want to modify a Telemetry with copy-on-write semantics. (If the Arc is shared, make_mut will clone the Telemetry so the other owners don’t see the modifications.)
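A minimal sketch of that pattern, under some assumptions: value is a plain f64 here instead of quantiles::CKMS<f64> so the example needs no external crates, and share_then_modify is a hypothetical helper name, not something from your code. Telemetry has to implement Clone for make_mut to work.

```rust
use std::sync::Arc;

// Simplified stand-in for the Telemetry in this thread: `value` is a plain
// f64 (an assumption) so the sketch compiles without the quantiles crate.
#[derive(Clone, Debug, PartialEq)]
struct Telemetry {
    name: String,
    value: f64,
    meta: Vec<(usize, String)>,
}

// Hypothetical helper: take a cheap copy of the pointer, then modify that
// copy with copy-on-write semantics.
fn share_then_modify(original: &Arc<Telemetry>) -> Arc<Telemetry> {
    // Bumps the refcount only; no new Telemetry is allocated here.
    let mut copy = Arc::clone(original);
    // `copy` shares its allocation with `original`, so make_mut clones the
    // Telemetry first and returns a mutable reference to the fresh clone.
    Arc::make_mut(&mut copy).value = 2.0;
    copy
}

fn main() {
    let original = Arc::new(Telemetry {
        name: "requests".to_string(),
        value: 1.0,
        meta: vec![(0, "host".to_string())],
    });

    // Read-only sharing: both handles point at the same allocation.
    let shared = Arc::clone(&original);
    assert!(Arc::ptr_eq(&original, &shared));

    // Writing through a shared handle leaves the original untouched.
    let modified = share_then_modify(&original);
    assert_eq!(original.value, 1.0);
    assert_eq!(modified.value, 2.0);
    assert!(!Arc::ptr_eq(&original, &modified));
}
```

The clone only happens on the write path, so the read-only handoffs cost a refcount increment each.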


#3

Since you say almost all your instances have the same value for meta you could also put the meta field in an Arc of its own, so you can modify the other fields without copying its contents.
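Roughly what that could look like, as a sketch rather than a drop-in: value is simplified to an f64 so the example is self-contained, and with_value is a made-up helper name. Note that cloning the struct still allocates a new String for name; only meta is shared.

```rust
use std::sync::Arc;

// Sketch only: `value` is a plain f64 stand-in for quantiles::CKMS<f64>.
#[derive(Clone, Debug)]
struct Telemetry {
    name: String,
    value: f64,
    // The rarely-changing field lives behind its own Arc, so cloning a
    // Telemetry bumps a refcount instead of copying the Vec.
    meta: Arc<Vec<(usize, String)>>,
}

// Hypothetical helper: produce a modified copy without duplicating `meta`.
fn with_value(base: &Telemetry, value: f64) -> Telemetry {
    // Clones `name` (one String allocation) and shares `meta` by pointer.
    let mut copy = base.clone();
    copy.value = value;
    copy
}

fn main() {
    let base = Telemetry {
        name: "requests".to_string(),
        value: 1.0,
        meta: Arc::new(vec![(0, "host".to_string())]),
    };

    let changed = with_value(&base, 5.0);

    // The two instances differ in `value` but still share one `meta`.
    assert_eq!(base.value, 1.0);
    assert_eq!(changed.value, 5.0);
    assert!(Arc::ptr_eq(&base.meta, &changed.meta));
}
```

If meta itself ever needs to change, Arc::make_mut on that field gives the same copy-on-write behaviour for just the Vec.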

Also, as you probably already know, you can use Rc<T> instead of Arc<T> if you don’t need to share pointers across multiple threads.


#4

Oh boy! I didn’t realize that make_mut would do a clone, but it’s right there in the documentation:

If the Arc has more than one strong reference, or any weak references, the inner data is cloned.

This is perfect. Thank you!