Tree structure implementation using ordered list and depth

There are a number of crates that encode tree structures in a Vec such as id_tree. In simplest case they build tree structure by pointing to other indices in the Vec for things like parent and children references.

Are there any crates that encode tree in Vec by just storing a depth along with each item in the Vec?

So for example:

depth value
0 a
1 b
2 c
1 d
   a
  / \
 b   d
 |
 c

Is there a name for this sort of tree encoding?

This may not answer your question but just a comment.

It is well known that there is a natural isomorphism between the ordered trees and balanced parenthesis sequences (see e.g. https://en.wikipedia.org/wiki/Catalan_number).

In fact, the set of integer sequence d of length n such that

  • d[i] >= 0 for all 0 <= i < n,
  • d[0] = 0
  • d[i+1] - d[i] <= 1 for all 0 <= i < n-1

precisely represents the set of preordered depth sequence of trees of n nodes.

The balanced parenthesis (BP) representation of a tree is used in succinct data structures because it can be used to store a tree of n nodes in 2n bits and it is asymptotically optimal.

Storing depths can be considered as an implementation of BP representation, I think. However, depth information is usually stored with a BP representation in more space-efficient ways.

For example, by only storing depths per 64 positions, the depth of a node at position i can be computed by something like: depth[i / 64] + i % 64 - (bp[i / 64] & ((1 << i % 64) - 1)).count_zeros() * 2 where bp is the bit-vector of the parentheses. This is the naive form of the rank-dictionary structure.

2 Likes

Thanks that's all very useful and an area I didn't know about.

It's not a general-purpose crate, but there's something similar to this in the internals of the unicode-bidi crate.

This library breaks text into left-to-right and right-to-left text runs. These text runs can be nested within each other (for example, an English sentence might contain a quote in Hebrew, which in turn contains a name in Vietnamese), so they naturally form a tree structure. The Unicode Bidirectional Algorithm encodes this tree structure as a sequence of “embedding levels,” which correspond to the depth of the tree at each point in the string.

2 Likes

You might also be interested in using a heap, if it suits your use case. It is a partially ordered data structure represented by a full binary tree usually in flat array form, so the index of a node implicitly encodes its depth.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.