Function taking Iterator, quick review on correct use of iter(), reference, and generics

Hello,

I would much appreciate your feedback on this simple function that takes a generic Iterator and returns both min and max in a single pass.

I have read that, when possible, calling .iter() should be preferred to a direct for loop or .into_iter() because it borrows the elements to yield. And this is what I'm trying, am I correct? Is the distinction relevant in this case?

Then, the function signature that takes the obtained iterator (from calling .iter()) should specify that the associated item type is a reference to a generic type, correct?

Finally, does the fact that I'm getting and iterator and passing it to the function cause some copy/clone of the data? i.e., the function doesn't take a reference to the iterator, but the iterator itself.

Am I correct in assuming that the only clone will happen at the end of the function to return the owned min and max (rather than references)? I guess f64 may be copied during the min-max loop, but that is cheap and, thus, in line with Rust principles, correct?

How could I make it more idiomatic?

thank you very much in advance,

Luca

fn main() {
    let v = [f64::NAN, f64::NAN, 1.1, f64::NAN, 2.5, 3.2, f64::NAN];
    let iterv = v.iter().filter(|x| !x.is_nan());
    let (min, max) = min_and_max(iterv);
    assert_eq!(min, 1.1, "min {}", min);
    assert_eq!(max, 3.2, "max {}", max);
    assert_eq!(v[2], 1.1, "v[2] {}", v[2]);

    let v = [String::from("c"), String::from("a"), String::from("b")];
    let iterv = v.iter();
    let (min, max) = min_and_max(iterv);
    assert_eq!(min, String::from("a"), "min {}", min);
    assert_eq!(max, String::from("c"), "max {}", max);
    assert_eq!(v[2], String::from("b"), "v[2] {}", v[2]);
}

fn min_and_max<'a, I, T>(mut s: I) -> (T, T)
    where
        I: Iterator<Item=&'a T>,
        T:
            'a +
            std::cmp::PartialOrd +
            Clone, 
    {
    let (mut min, mut max) = match s.next() {
        Some(v) => (v, v),
        None => panic!("could not get initial values"),
    };
    for es in s {
        if es > max {
            max = es
        }
        if es < min {
            min = es
        }
    }
    return (min.clone(), max.clone());
}

You might be interested in looking at how Itertools::minmax does it.

Yours appears to essentially be .minmax().into_option().unwrap(), and seeing why it ends up having those two differences may be illuminating.

1 Like

You seem to have a solid grasp in the statements you’re making

The code seems overly complex. There may be value in thinking about iteration separately from the min, max search. Iteration is generic, the search is specific.

When I think of min or max I think “reduction”.

Iteration essentially boils down to a choice between reduction and map, where map applies a function to each value but leaves the structure of the collection unchanged.

So given that’s it’s a reduction, I next think “accumulator”. In your case, you need a tuple. That is the only memory you need above what is required to instantiate the Iterator (as an aside: filter is also a reduction where the accumulator starts as an empty version of the collection used to collect()).

Finally, you need your reducing function plus a “dummy” starting point (aka neutral value of T).

// T, T -> (T,T)

let (min, max) = collection
  .iter() // only need to read the data
  .fold( 
    (startMin, startMax), 
    |(min, max), itemTryMinOrMax| { 
       ... set value of acc by deref’ing the borrow (copy) when it makes sense; 
       (min,max) // the accumulator 
     })

A “neutral” starting value for the tuple might be the first value in your collection.

1 Like

Thank you for sharing your reasoning in terms of reduction and map very helpful, thanks to it I went back to read the documentation for fold and map and understood more of their general motivations, implementations, and terminology.
Following your suggestions, I wrote this for when I am ok/want to dereference the iterator item (e.g., f64).

    let (min, max) = v
      .iter()
      .fold( 
        (0, 0), // just for the example 
        |(mut min, mut max), x| { 
            if x > &max {
                max = *x
            } if x < &min {
                min = *x
            }
           (min,max) // the accumulator 
         });

However, am I correct in assuming that this approach would actually dereference (copy or clone) the item every time a new max or min is found?
Would the following solution make sense then, avoiding the dereferencing?

    let v = [String::from("c"), String::from("a"), String::from("b")];
    let start = String::from("b");
    let (min, max) = v
        .iter() // only need to read the data, iter will yield &T, which is x
        .fold( 
        (&start, &start), 
        |(mut min, mut max), x| { 
            if x > &max { // compare &String with &String
                max = x
            } else if x < &min {
                min = x
            }
           (min, max) // the accumulator, holding references 
         });
    let (min, max) = (min.clone(), max.clone()); // clone if needed for convenience

Thank you very much again.
Best,
Luca

Thank you very much, it was a good idea to see some advanced solution, I didn't know it was there...

Yes. However, you are reusing the same memory allocated for (min, max) over and over (onetime allocation). Also, the copy is a bit-wise copy so is cheap/fast.

The solution you proposed avoids a copy but does so by taking ownership of the value. What are you using to access each value of your collection? iter() is a constructor of an iterator that borrows a view of the value Would it be cool to take ownership of something borrowed? :))

When I use into_iter() I do so when I don’t have any further use for the collection. That could include using the data to instantiate another collection. I don’t normally use into_iter() for a reduction.