Also, I couldn't find an insert() or delete() that returns the modified tree while keeping the older version persistent. And I think, due to RRB-Tree limitations, the mutable versions of insert() and remove() are costly operations. Please correct me if I am wrong here.
ppar is a persistent array implementation using a modified variant of the rope data structure. Fundamentally, it can be viewed as a binary tree of array blocks, where each leaf node is a block of contiguous items of type T, while intermediate nodes only hold references to their child nodes, left and right. To be more precise, intermediate nodes in the tree are organised similar to a rope structure, as a tuple of (weight, left, right), where weight is the count of all items present in the leaf nodes under the left branch.
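To make the layout concrete, here is a minimal sketch of such a (weight, left, right) tree with index lookup guided by the weight. The names and types are illustrative only and do not mirror ppar's internal implementation.

```rust
use std::rc::Rc;

// Illustrative rope-like node, NOT ppar's actual types: a leaf holds a
// contiguous block of items, an intermediate node holds the child
// references plus the number of items under its left branch.
enum Node<T> {
    Leaf(Vec<T>),
    Branch {
        weight: usize, // total items stored under `left`
        left: Rc<Node<T>>,
        right: Rc<Node<T>>,
    },
}

impl<T> Node<T> {
    // Index lookup walks down the tree, comparing the offset against
    // `weight` to choose between the left and right subtree.
    fn get(&self, index: usize) -> Option<&T> {
        match self {
            Node::Leaf(block) => block.get(index),
            Node::Branch { weight, left, right } => {
                if index < *weight {
                    left.get(index)
                } else {
                    right.get(index - weight)
                }
            }
        }
    }
}

fn main() {
    let root = Node::Branch {
        weight: 3,
        left: Rc::new(Node::Leaf(vec![10, 20, 30])),
        right: Rc::new(Node::Leaf(vec![40, 50])),
    };
    assert_eq!(root.get(0), Some(&10));
    assert_eq!(root.get(4), Some(&50));
    assert_eq!(root.get(5), None);
}
```

Because lookups only consult the weights on the way down, get() touches O(log n) nodes regardless of where the target block sits.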
ppar::Vector performance characterization
-----------------------------------------

append-load(1000000 items)  : 9.094956ms
random-load(1000000 items)  : 6.128µs
get(1000000 ops)            : 127ns
insert(1000000 ops)         : 8.936µs
set(1000000 ops)            : 6.093µs
delete-insert(1000000 ops)  : 11.742µs
overhead                    : "33.87%"
overhead after 90% delete   : "33.91%"
- Takes around 9ms to convert a Vec of 1-million items into a ppar::Vector.
- A random insert operation takes around 8µs.
- A random get operation takes around 100ns.
- A random set operation takes around 6µs. Note that these set operations update an existing offset.
- A random delete operation takes around 4µs. Note that in the perf data above, the delete-insert metric performs an insert at the same offset after each delete.
- For the 8-byte u64 type, the memory overhead is around 33%.
- I believe the worst-case overhead can go up to 100%, that is 2x the actual data, when compared to a plain Vec.
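The reason older versions stay usable after a write, and the reason the memory overhead stays modest, is path copying: an update clones only the nodes on the root-to-leaf path and shares everything else. Below is a hypothetical sketch of a persistent set() on the same illustrative node type as above; it is not ppar's actual code.

```rust
use std::rc::Rc;

// Illustrative rope-like node (not ppar's internal types).
enum Node<T: Clone> {
    Leaf(Vec<T>),
    Branch {
        weight: usize, // total items stored under `left`
        left: Rc<Node<T>>,
        right: Rc<Node<T>>,
    },
}

impl<T: Clone> Node<T> {
    fn get(&self, index: usize) -> Option<&T> {
        match self {
            Node::Leaf(block) => block.get(index),
            Node::Branch { weight, left, right } => {
                if index < *weight {
                    left.get(index)
                } else {
                    right.get(index - weight)
                }
            }
        }
    }

    // Returns a NEW tree with `index` replaced by `value`. Only the
    // nodes on the root-to-leaf path are copied; every other node is
    // shared with the original tree through its Rc reference.
    fn set(&self, index: usize, value: T) -> Node<T> {
        match self {
            Node::Leaf(block) => {
                let mut block = block.clone();
                block[index] = value;
                Node::Leaf(block)
            }
            Node::Branch { weight, left, right } => {
                if index < *weight {
                    Node::Branch {
                        weight: *weight,
                        left: Rc::new(left.set(index, value)),
                        right: Rc::clone(right),
                    }
                } else {
                    Node::Branch {
                        weight: *weight,
                        left: Rc::clone(left),
                        right: Rc::new(right.set(index - weight, value)),
                    }
                }
            }
        }
    }
}

fn main() {
    let old = Node::Branch {
        weight: 2,
        left: Rc::new(Node::Leaf(vec![1, 2])),
        right: Rc::new(Node::Leaf(vec![3, 4])),
    };
    let new = old.set(3, 99);
    // The old version is untouched; the new version sees the update.
    assert_eq!(old.get(3), Some(&4));
    assert_eq!(new.get(3), Some(&99));
}
```

Since only O(log n) nodes are copied per write, the extra memory per version is proportional to the tree depth plus one leaf block, which is consistent with the modest overhead figures above.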
On the same machine, the same set of operations on an im::Vector<T> gives the following numbers:
append-load(1000000 items)  : 37.651219ms
get(1000000 ops)            : 123ns
update(1000000 ops)         : 10.344µs
insert(1000000 ops)         : 735.307µs
delete-insert(1000000 ops)  : 1.426266ms
The append-load operation is a bulk operation; the other entries are per-operation metrics.