Using Box<Vec<_>> vs Vec<_>

Hello,

I have an enum with two variants:

pub(crate) enum NodeAssociation<D> {
    Data(Box<Vec<D>>),
    NoAssociation
}

and a tree node struct:

pub(crate) struct Node<D> {
    pub(crate) children: FxHashMap<Box<str>, Node<D>>,
    association: NodeAssociation<D>,
}

The tree will contain potentially millions of these Nodes. My questions are:

Is Box<Vec> helping to reduce size of a single node at a reasonable speed cost? Is this done in practice?

When I run cargo clippy it says that this is unnecessary indirection.
For some big data I tested it on, the runtime was about 0.1s slower when I used Box but memory usage was down from ~650MB to ~540MB.

Another side question, if boxing the vector is actually helpful, is it helpful to box the children: FxHashMap as well?

There's an issue in clippy's repo pointing this out but seems like no one's tried improving the docs yet: (pedantic) The `box_collection` docs are not quite correct. · Issue #9392 · rust-lang/rust-clippy · GitHub

I would say this is a case where you should allow the lint, since you've measured that the memory savings are much bigger compared to the tiny runtime cost.

Hard to say without benchmarking it, but a HashMap is 48 bytes compared to Box's 8 and Vec's 24, so it's definitely possible.

3 Likes

That issue is exactly what I am facing in this situation. Thank you for the reply

Do you need efficient growth of data? Otherwise you could try Box<[D]> as a middle ground.

Hash maps need a certain sparsity for efficiency. A BTreeMap could heavily decrease your memory requirements.

1 Like

Yeah, I need efficient growth of data for the Vec.

For the BTreeMap, I tried it and it doubled my memory cost, maybe I missed some important detail?

Do you also need to grow the children?

Yeah, very frequently

I think you need to wrap the Box<HashMap> into an Option to get a measurable size benefit.

Are the keys unique among nodes?

There's also a variant of the vec that keeps size/capacity on the heap, so it acts as a thin pointer, without extra levels of indirection or giving up ability to grow:

8 Likes

This part doesn't sound correct to me. The null-pointer optimization works for any enum, so NodeAssociation when boxed is a single usize in size, rather than 3 usize.

pub(crate) enum NodeAssociation<D> {
    Data(Box<Vec<D>>),
    NoAssociation
}

fn main() {
    // prints 8 on 64-bit architectures.
    println!("{}", std::mem::size_of::<NodeAssociation<()>>()); 
}

@kornel's idea is quite good. Using thin-vec removes one layer of indirection from your data, improving cache locality and preventing pointer-chasing.

1 Like

To get a complete picture you have to look at the two variants in the two cases:

  • Without Box:
    • Data: 24 bytes + an allocation for the Vec's items
    • NoAssociation: 24 bytes
  • With Box:
    • Data: 8 bytes + an allocation for Vec itself (24 bytes) + an allocation for the Vec's items
    • NoAssociation: 8 bytes

Looking at Data, the version with Box requires a second allocation and uses strictly more memory, but for NoAssociation the Box version uses 1/3 of the memory. So on one hand you're using more memory and allocations, and on the other you're using less memory. The overall result depends on how many instances of Data vs NoAssociation you have[1]. If the vast majority are NoAssociation then you're better off with the Box, because their increased memory usage and allocation will be offsetted by all the memory you'll save in the NoAssociation instances. If instead you have a substantial amount of Data then you may be better off without Box. Note that there is no definitive answer, it depends on the execution patterns.


There is another option you might be interested in. In case your Data variant never stores empty Vecs, then you could replace the whole NodeAssociation with a ThinVec from the thin-vec crate. Its size is 8 bytes, just like your code with Box, but instead of doing the equivalent of Box<Vec<_>> it stores the length and capacity in the same allocation as the elements, saving a whole allocation. It also doesn't allocate when there are no elements, just like your NoAssociation variant does.

Just replacing it with Box<FxHashMap<...>> won't do any good, but if it's often empty then you could apply the same strategy as above. Unfortunately hashmaps are much more complex than Vec, so it's harder to write something like thin-vec for them, so you'll have to some an Option<Box<HashMap<...>>> or similar.


Another source of memory usage are likely those Box<str>. You could try replacing them with some smarter string implementation, especially if you expect them to often be small (i.e. less than 15 bytes or less than 22/23 bytes, depending on the implementation) or you need to clone them around relatively often. For example you could consider ecow::EcoString (which occupies 16 bytes, like Box<str>, but only inlines up to 15 characters) or smol_str::SmolStr/compact_str::CompactString/smartstring::SmartString (which occupy 24 bytes but can inline up to 22 or 23 bytes). Interning string could also be an option depending on what you're doing, although it would likely be a much more complex change than just replacing those Box<str> with a smart string type.


  1. Technically you should count uninitialized instances of Node as if they were containing NoAssociation, for example the ones inside children, but I wouldn't expect them to matter that much ↩︎

7 Likes

I was talkong about boxing the hash map that contains the children. If you just do that instead of using an option you simply move the memory for the map itself (not the items) from one location to another plus the pointer to that memory in every node. If you use an option there is just a zeroed usize in the code for the leave nodes.

If the vast majority are NoAssociation then you're better off with the Box , because their increased memory usage and allocation will be offsetted by all the memory you'll save in the NoAssociation instances. If instead you have a substantial amount of Data then you may be better off without Box .

There are definitely more NoAssociation instances so I agree that Box is better.

In case your Data variant never stores empty Vecs, then you could replace the whole NodeAssociation with a ThinVec from the thin-vec 1 crate.

Data stores empty vecs sometimes so that wouldn't work.

You could try replacing them with some smarter string implementation, especially if you expect them to often be small

Each str in the Box<str> is at most one unicode character, so 4 bytes is the upper limit. I will for sure try out EcoString

Interning string could also be an option depending on what you're doing

Can you link me some resources for that? Potentially it could be a great thing.

Then you might also try to use a char or an arrayvec::ArrayString<4>.

Interning makes more sense when the strings could be a bit longer but you expect the same strings come up over and over.

2 Likes

arrayvec::ArrayString<4> really helped, it reduced memory consumption by at least 80MB.

1 Like

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.