What's the ideal way to convert Vec<(K1, K2, Kn, V)> to Map<K1, Map<K2, Map<Kn, V>>>?

This is my current way (a little complicated so MIT licensed)

use std::collections::{BTreeMap, btree_map::Entry};

fn get_or_insert<K: Ord, V>(map: &mut BTreeMap<K, V>, k: K, init_v: fn() -> V) -> &mut V {
    match map.entry(k) {
        Entry::Vacant(entry) => entry.insert(init_v()),
        Entry::Occupied(entry) => entry.into_mut(),
    }
}

fn seq2map_k3<K1: Ord, K2: Ord, K3: Ord, V>(seq: Vec<(K1, K2, K3, V)>) -> BTreeMap<K1, BTreeMap<K2, BTreeMap<K3, V>>> {
    let mut res = BTreeMap::new();
    for (k1, k2, k3, v) in seq {
        let entry1 = &mut res;
        let entry2 = get_or_insert(entry1, k1, BTreeMap::new);
        let entry3 = get_or_insert(entry2, k2, BTreeMap::new);
        let _ = entry3.insert(k3, v);
    }
    res
}

fn main() {
    let seq = vec![
        (1, 2, 3, 100),
        (1, 2, 4, 200),
        (2, 3, 4, 300),
    ];
    let seq = dbg!(seq);
    let map = dbg!(seq2map_k3(seq));
    dbg!(map[&1][&2][&4]);
}

playground

BTW is there currently a method like get_or_insert for std maps? If not, I think it would be a good idea to add one.

Both Vec and BTreeMap implement IntoIterator and FromIterator, so you can collect the iteration from one collection type into the other.

You can use the methods on Entry, like or_default, for this:

fn seq2map_k3<K1: Ord, K2: Ord, K3: Ord, V>(seq: Vec<(K1, K2, K3, V)>) -> BTreeMap<K1, BTreeMap<K2, BTreeMap<K3, V>>> {
    let mut res: BTreeMap<K1, BTreeMap<K2, BTreeMap<K3, V>>> = BTreeMap::new();
    for (k1, k2, k3, v) in seq {
        res.entry(k1).or_default()
           .entry(k2).or_default()
           .insert(k3,v);
    }
    res
}
5 Likes

One more question: Is there a more optimal data structure for a multi-level random lookup table like this, other than nesting? One way I can think of is to create a interrelated index of keys, point to index of the sequence. But how to construct the interrelated index?

It’s not bad, but there are also other reasonable options:

  • Use a single map with the key (K1, K2, K3)
  • Store the original vector, and also indices Map<K1, HashSet<usize>>, Map<K2, …>, and Map<K3,…>

FromIterator does not currently support such conversions:
the trait FromIterator<(K1, K2, K3, V)> is not implemented for BTreeMap<K1, BTreeMap<K2, BTreeMap<K3, V>>>

The meaning of a "random lookup table" is that I not only need to get Vs from (K1, K2, K3), but also (K3, V)s from (K1, K2), (K2, K3, V)s from K1, and maybe even (K2, V)s from (K1, K3), etc.
Described with the relational database model, the index I mentioned earlier is analogous to a table containing v_index as the primary key and three other fields k1, k2, k3, with indexes created for all of them. (In other words, embedding a sqlite running in memory mode in the application would solve the problem, but obviously overkilled.)

I see. Didn't see in your original message that you had nested maps that depended on existing entries or the top map.

I'm not the kind of person who reach out their hands and ask for everything in the community, and I'm already continuing to think about how to implement this index. This is likely not something that can be described in a few sentences.

Right. The most general form of this is my second alternative, something like this:

use std::collections::{BTreeMap, BTreeSet};

pub struct Table<K1, K2, K3, V> {
    data: Vec<(K1, K2, K3, V)>,
    idx1: BTreeMap<K1, BTreeSet<usize>>,
    idx2: BTreeMap<K2, BTreeSet<usize>>,
    idx3: BTreeMap<K3, BTreeSet<usize>>,
}

impl<K1, K2, K3, V> Table<K1, K2, K3, V>
where
    K1: Ord + Clone,
    K2: Ord + Clone,
    K3: Ord + Clone,
{
    pub fn insert(&mut self, k1: K1, k2: K2, k3: K3, v: V) {
        let i = self.data.len();
        self.idx1.entry(k1.clone()).or_default().insert(i);
        self.idx2.entry(k2.clone()).or_default().insert(i);
        self.idx3.entry(k3.clone()).or_default().insert(i);
        self.data.push((k1, k2, k3, v));
    }
}
1 Like

The solution is so simple in essence! I will encapsulate this structure in a separate crate.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.