Get a random element from BTreeMap


#1

Suppose we have a very large BTreeMap<K, V> and in addition to the ‘proper’ usage, we want to choose entries randomly from time to time.

Is this possible? A naive solution would be to store another “structure” with say <i64, K> pairs. Then generate integers at random, get the key and use it to find V in the BTreeMap. However if the BTreeMap is very large, say in the GBs, then this additional structure is also large…

Maybe there is a way to do it, without much additional overhead?


#2

Not the most performant solution, but if you need to do it for whatever reason (without having a side structure) you can just grab the length and then run iterate you hit it. It’s not ideal, but it’s probably the cheapest way to do it without knowing what K is. Logically I’d write a small function which takes the tree (mutably or not, depending on needs), then iterating until it hits, then returning. You can even do the random roll there, or return key + value by reference (again mutable or not).

I’m sure there’s a better solution I’m not thinking of, but since it’s a BTree, it limits your options.


#3

Yes you can just iterate a random number of times through. But we really need something fast, i.e. O(1) for each access.


#4

Maybe use the IndexMap crate which is like the hashmap but keeps the insertion order. It has a method called get_index

Edit:
After looking at the crate again I realized it doesn’t support btreemaps, only hashmaps, so I guess that doesn’t help much.


#5

Oh wow, thanks! IndexMap is indeed much better suited for our purposes. I didn’t knew that it exists as a resizeable hash-table. Great!