Iterator of Vec<Vec<String>>: mapping columns and aggregates?

Hello,

I am learning iterators.

I have an iterator of Vec (the variable res below),
i.e. each next() on my res iterator returns a Vec<String>.
This "simulates" a Vec<Vec<String>>,
i.e. one vector of many vectors of String.

I want to select column 0, i.e. get(0) of each Vec, and determine the min() value of that column.

Below is my first attempt.

The following code works. The commented-out block is an alternative version that also works:

    let res2 = res // Iter, each next() returns Vec<String>, iter res is built earlier in my code.
        .map(|c: Vec<String>| {
            let t = &"".to_string();
            let cc = c.get(0).unwrap_or(t);
            cc.clone()
        })
        // this also works
        // .map(|c: Vec<String>| {
        //     if let Some(cc) = c.get(0) {
        //         cc.clone()
        //     } else {
        //         String::from("")
        //     }
        // })
        .min();

I would like your comments on whether the above is a good solution and, if not, how to improve it. I had to use cc.clone() because otherwise the code would not compile.
As I have read, frequent use of .clone() is not recommended for performance reasons.

many thanks!

Assuming consuming the values is what you want to do (given that you have an iterator returning owned Vecs):

    let res2 = res.flat_map(|v| v.into_iter().next()).min();

Here, v.into_iter().next() returns Option<String>, which will be None if the vector is empty and otherwise will be Some(the_value_that_was_at_index_0). If you just used map, each call to next() on the resulting iterator would return Option<Option<String>>, but flat_map flattens this into Option<String> and effectively ignores the None values that result from empty inner vectors.
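To see the one-liner in a complete program, here is a minimal sketch. The function name min_first_column and the sample rows are made up for illustration; only the flat_map(...).min() chain comes from the answer above.

```rust
/// Smallest first-column value across all rows; empty rows are skipped.
/// (Hypothetical helper name, used only for this illustration.)
fn min_first_column<I>(rows: I) -> Option<String>
where
    I: Iterator<Item = Vec<String>>,
{
    // `row.into_iter().next()` moves the first String out of the row
    // (no clone); `flat_map` drops the `None` produced by an empty row.
    rows.flat_map(|row| row.into_iter().next()).min()
}

fn main() {
    // Made-up sample data standing in for the `res` iterator.
    let rows = vec![
        vec!["pear".to_string(), "x".to_string()],
        vec![], // empty row: contributes nothing
        vec!["apple".to_string(), "y".to_string()],
    ];
    let smallest = min_first_column(rows.into_iter());
    println!("{:?}", smallest); // Some("apple")
}
```

Note one behavioural difference from the original code: there, an empty row contributes an empty string, which then sorts as the minimum, whereas here an empty row is simply skipped. Since the closure returns an Option, filter_map could be used in place of flat_map with the same result.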

thank you, it works.

What is the significance of using .next() in your flat_map() solution? I know it is required; without it, .min() incorrectly returns a value from the last column of the vector.

Interesting that your solution is shorter and cleaner, yet the execution time is close: 17 sec for yours versus 18 sec for my code.

One more question for my learning: what if I wanted to select/map several columns from each Vec? For example, say I wanted to map col[0] and also col[1] and perform something on them, like a concat or a change of upper/lower case?

thank you!

If you don't use .next(), you return an iterator over all of the values in the inner Vec, and flat_map turns this into one big iterator over all the values. It's not looking at the last column, it's looking at each column. .into_iter().next() is sort of like .get(0), except it consumes the Vec entirely. You could also use into_iter().take(1).
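A small sketch of that difference, using made-up data. The function names and the row helper are hypothetical and exist only for this example.

```rust
/// Hypothetical helper: build a row of owned Strings from string literals.
fn row(cells: &[&str]) -> Vec<String> {
    cells.iter().map(|s| s.to_string()).collect()
}

/// Without `.next()`: every cell of every row ends up in one big iterator.
fn min_of_all_cells(rows: Vec<Vec<String>>) -> Option<String> {
    rows.into_iter().flat_map(|r| r.into_iter()).min()
}

/// With `.take(1)` (equivalent here to `.next()`): only column 0 of each row.
fn min_of_first_cells(rows: Vec<Vec<String>>) -> Option<String> {
    rows.into_iter().flat_map(|r| r.into_iter().take(1)).min()
}

fn main() {
    let rows = vec![row(&["m", "a"]), row(&["k", "z"])];
    // "a" lives in column 1, so it only wins when all cells are compared.
    println!("{:?}", min_of_all_cells(rows.clone())); // Some("a")
    println!("{:?}", min_of_first_cells(rows)); // Some("k")
}
```

If the smallest values in the real data happen to sit in the last column, the first variant would look as if it were "selecting the last column", which matches what was observed.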

If the clones make little difference, most of your program's running time is probably spent elsewhere. Or the clones may have been optimized away in your original version.

If you wanted to look at columns 0 and 1 individually, you could use v.into_iter().take(2). Or more generally, you could do something like

    .flat_map(|v| v.into_iter().filter(|elem| /* keep or not */))
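As an illustration of both variants, here is a sketch with the placeholder filled in by an arbitrary predicate (keep non-empty cells). The predicate, the function names and the row helper are assumptions made for the example, not part of the suggestion above.

```rust
/// Hypothetical helper: build a row of owned Strings from string literals.
fn row(cells: &[&str]) -> Vec<String> {
    cells.iter().map(|s| s.to_string()).collect()
}

/// Columns 0 and 1 of every row, as individual values in one flat list.
fn first_two_columns(rows: Vec<Vec<String>>) -> Vec<String> {
    rows.into_iter()
        .flat_map(|v| v.into_iter().take(2))
        .collect()
}

/// Every cell that passes an example predicate (here: the cell is not empty).
fn non_empty_cells(rows: Vec<Vec<String>>) -> Vec<String> {
    rows.into_iter()
        .flat_map(|v| v.into_iter().filter(|elem| !elem.is_empty()))
        .collect()
}

fn main() {
    let rows = vec![row(&["a", "", "c"]), row(&["d"])];
    println!("{:?}", first_two_columns(rows.clone())); // ["a", "", "d"]
    println!("{:?}", non_empty_cells(rows)); // ["a", "c", "d"]
}
```

A row shorter than two cells simply contributes fewer values to take(2); nothing panics.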

If you wanted to combine multiple fields, you could do something like

    .flat_map(|v| {
        let mut iter = v.into_iter();
        let mut concat = iter.next()?;
        concat += &iter.next()?;
        Some(concat)
    })

Or

    .flat_map(|mut v| {
        v.truncate(2);
        let right = v.pop()?;
        Some(v.pop()? + &right)
    })

Or many other approaches...
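For the upper/lower case part of the question, one of those other approaches might look like the sketch below. The function name combine_first_two, the "-" separator and the row helper are invented for the example; rows with fewer than two columns are skipped, as in the snippets above.

```rust
/// Hypothetical helper: build a row of owned Strings from string literals.
fn row(cells: &[&str]) -> Vec<String> {
    cells.iter().map(|s| s.to_string()).collect()
}

/// For each row with at least two columns, upper-case column 0 and join it
/// to column 1 with a dash. Shorter rows are dropped.
fn combine_first_two(rows: Vec<Vec<String>>) -> Vec<String> {
    rows.into_iter()
        .flat_map(|v| {
            let mut cells = v.into_iter();
            // `?` returns None from the closure if the column is missing,
            // and flat_map then skips that row.
            let first = cells.next()?;
            let second = cells.next()?;
            Some(format!("{}-{}", first.to_uppercase(), second))
        })
        .collect()
}

fn main() {
    let rows = vec![row(&["ab", "cd", "ef"]), row(&["only"]), row(&["x", "y"])];
    println!("{:?}", combine_first_two(rows)); // ["AB-cd", "X-y"]
}
```

Any aggregate such as .min() or .max() can be chained in place of .collect().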
