How to combine Vec<(Uuid, _)>'s by intersecting Uuid

Hey there. I'm trying to "merge" two Vec's returned from a database, and feel there must be a more elegant solution. I have two Vec<(Uuid, Vec<&str>)> variables, for example:

// let uuid_1 = uuid::Uuid::now_v7();
// let uuid_2 ...
let a = vec![(uuid_1, vec!["a", "b", "c"]), (uuid_2, vec!["a", "b", "c", "d", "e"])];
let b = vec![(uuid_3, vec!["a", "b"]), (uuid_1, vec!["c", "d", "e", "f"])];

I want to combine the inner Vec<&str> (tuple element 1) of any entries in a and b that share a Uuid, giving this result:

// let r = combine(a, b);
assert!(
    // It is acceptable for the inner `Vec<&str>` to have duplicate values, as is the case for `"c"` of `uuid_1`
    r ==
    vec![(uuid_1, vec!["a", "b", "c", "c", "d", "e", "f"]), (uuid_2, vec!["a", "b", "c", "d", "e"]), (uuid_3, vec!["a", "b"])]
    // It is also acceptable for the inner `Vec<&str>` to remove duplicates
    || r ==
    vec![(uuid_1, vec!["a", "b", "c", "d", "e", "f"]), (uuid_2, vec!["a", "b", "c", "d", "e"]), (uuid_3, vec!["a", "b"])]
);

The only approaches I can think of are either to iterate over vector b for each element of vector a (or vice versa), somewhat like this:

let mut r = vec![];
'outer: for x in &a {
    for y in &b {
        if x.0 == y.0 {
            // Something along the lines of: remove from `b`, chain x.1 and y.1, push to `r` and continue 'outer;
        }
    }
}

Or, what I expect to be a worse approach, instantiate another struct from these Vec's that can more conveniently compare Uuid's, e.g. a HashMap.
Is there a more elegant approach to what I'm trying to do, preferably one that removes the need to iterate one Vec for every iteration of the other (like the example above)? I'm especially interested in any iterator methods that could make the code more elegant and remove the for loops altogether.

To list one more alternative: you could sort both by the uuids and then merge them (merge-sort style). Whether that's slower or faster than using HashMap, I don't know.

Both of these should be faster than looping through one vector for every element of the other, since that nested loop takes time proportional to the product of the two lengths.
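A rough sketch of the sort-then-merge idea, in case it helps. It is generic over the key type `K` so it compiles with std alone; `Uuid` implements `Ord`, so it should slot in for `K`. The function name `merge_sorted` is made up for this example, and the sketch assumes each key appears at most once within each input Vec.

```rust
use std::cmp::Ordering;

/// Merge two lists of `(key, values)` pairs, concatenating the values of
/// entries that share a key. `K` stands in for `Uuid` here (an assumption
/// made so the sketch needs no external crates).
fn merge_sorted<K: Ord, V>(
    mut a: Vec<(K, Vec<V>)>,
    mut b: Vec<(K, Vec<V>)>,
) -> Vec<(K, Vec<V>)> {
    // Sort each input by key so that equal keys line up during the merge.
    a.sort_by(|x, y| x.0.cmp(&y.0));
    b.sort_by(|x, y| x.0.cmp(&y.0));

    let mut out = Vec::with_capacity(a.len() + b.len());
    let mut a = a.into_iter().peekable();
    let mut b = b.into_iter().peekable();

    loop {
        // Compare the front keys of both inputs to decide which side to take.
        let ord = match (a.peek(), b.peek()) {
            (Some(x), Some(y)) => x.0.cmp(&y.0),
            (Some(_), None) => Ordering::Less,
            (None, Some(_)) => Ordering::Greater,
            (None, None) => break,
        };
        match ord {
            Ordering::Less => out.extend(a.next()),
            Ordering::Greater => out.extend(b.next()),
            Ordering::Equal => {
                // Same key on both sides: append b's values to a's.
                if let (Some((key, mut values)), Some((_, more))) = (a.next(), b.next()) {
                    values.extend(more);
                    out.push((key, values));
                }
            }
        }
    }
    out
}
```

The result comes out sorted by key, and duplicates such as "c" are kept. If you want them removed, you could sort and dedup each inner Vec afterwards.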

Depending on what you use to interface with your database, you can probably directly collect what it returns into hashmaps instead of going through vecs.

for (uuid, list) in map2 {
    map1.entry(uuid).or_default().extend(list); // maybe with filtering pass
}
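If you do start from the two Vecs, the same entry-based loop can be wrapped into a complete function, roughly like this. This is only a sketch: `combine_with_map` is a made-up name, and the key type is generic (`K`) so it builds with std alone. `Uuid` implements `Eq`, `Hash` and `Ord`, so it should work as `K`.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Combine two lists of `(key, values)` pairs through a HashMap.
/// `K` stands in for `Uuid` (an assumption made to keep this std-only).
fn combine_with_map<K: Eq + Hash + Ord, V>(
    a: Vec<(K, Vec<V>)>,
    b: Vec<(K, Vec<V>)>,
) -> Vec<(K, Vec<V>)> {
    let mut map: HashMap<K, Vec<V>> = HashMap::new();

    // Walk both inputs once; entries with the same key share one Vec.
    for (key, list) in a.into_iter().chain(b) {
        map.entry(key).or_default().extend(list);
    }

    // HashMap iteration order is unspecified, so sort by key to get a
    // predictable result. Skip this step if the order does not matter.
    let mut out: Vec<(K, Vec<V>)> = map.into_iter().collect();
    out.sort_by(|x, y| x.0.cmp(&y.0));
    out
}
```

Because `a` is chained before `b`, the values from `a` come first within each merged entry.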

You can use the coalesce() method from the itertools crate.

use itertools::Itertools;
use uuid::Uuid;

fn combine<'a>(
    a: &[(Uuid, Vec<&'a str>)],
    b: &[(Uuid, Vec<&'a str>)],
) -> Vec<(Uuid, Vec<&'a str>)> {
    // clone both inputs into a single Vec
    let mut c = [a, b].concat();

    // need to sort because coalesce works on consecutive elements
    c.sort_unstable();

    c.into_iter()
        .coalesce(|mut x, y| {
            if x.0 == y.0 {
                x.1.extend(y.1);
                // Uncomment the final two lines of this comment if
                // you want to remove duplicates, like "c" from
                // ["a", "b", "c", "c"]
                // x.1.sort_unstable();
                // x.1.dedup();
                Ok(x)
            } else {
                Err((x, y))
            }
        })
        .collect()
}

I'm not sure if this method produces the most performant machine code for your use case, but it is fairly concise.