Is there some approach to generalize over Iterator<Item = T> and Iterator<Item = &T>?


#1

I’m trying to write a generic function to calculate the average string length of an iterator of strings. For the most common case, i.e. slice.iter(), I wrote this:

fn avg_len<'a, I: Iterator<Item = &'a String>>(iter: I) -> f32 {
    let (total_chars, size) = iter.fold((0, 0), |c, b| (c.0 + b.len(), c.1 + 1));
    total_chars as f32 / size as f32
}

This works fine in this case:

let words = ["hello world".to_string(), "rust language 1.25".to_string()];
let avg = avg_len(words.iter());

However, if I try to apply it to a mapped iterator, I get a compiler error:

let whitespaces = words
  .iter()
  .map(|s| s
    .chars()
    .filter(char::is_ascii_whitespace)
    .collect::<String>()
  );

// Compiler complains here:
let whitespace_avg = avg_len(whitespaces);
//                           ^^^^^^^^^^^ expected struct `std::string::String`, found reference

This can be fixed if I write avg_len_2() with the same body as avg_len() but with a small signature change:

fn avg_len_2<I: Iterator<Item = String>>(iter: I) -> f32 {
    let (total_chars, size) = iter.fold((0, 0), |c, b| (c.0 + b.len(), c.1 + 1));
    total_chars as f32 / size as f32
}

Since both functions have the same body, I think they could be generalized, but I can’t find the right signature to do it.
Is there some approach to solve this with generics/static dispatch?
Thanks in advance.


#2

For any T in general, you can use Borrow<T>:

use std::borrow::Borrow;

pub fn avg_len<I, T>(iter: I) -> f32
where
    I: Iterator<Item = T>,
    T: Borrow<String>,
{
    let (total_chars, size) = iter.fold((0, 0), |c, b| (c.0 + b.borrow().len(), c.1 + 1));
    total_chars as f32 / size as f32
}
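
As a quick check, here is a minimal sketch of that Borrow version applied to both iterators from the question. The names avg_len_borrow and borrow_demo are made up for this example, and the explicit &String annotation is just my way of spelling out which Borrow impl is meant:

```rust
use std::borrow::Borrow;

// Hypothetical name; same idea as the Borrow<String> signature above.
pub fn avg_len_borrow<I, T>(iter: I) -> f32
where
    I: Iterator<Item = T>,
    T: Borrow<String>,
{
    let (total_len, count) = iter.fold((0usize, 0usize), |acc, item| {
        // Works for both String and &String items.
        let s: &String = item.borrow();
        // Note: len() is the length in bytes, not in chars.
        (acc.0 + s.len(), acc.1 + 1)
    });
    total_len as f32 / count as f32
}

// Hypothetical helper that runs both calls from the question.
pub fn borrow_demo() -> (f32, f32) {
    let words = ["hello world".to_string(), "rust language 1.25".to_string()];

    // Item = &String
    let by_ref = avg_len_borrow(words.iter());

    // Item = String
    let whitespaces = words
        .iter()
        .map(|s| s.chars().filter(char::is_ascii_whitespace).collect::<String>());
    let owned = avg_len_borrow(whitespaces);

    (by_ref, owned)
}
```

With the two words from the question, borrow_demo() should return (14.5, 1.5).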

For strings in particular, I would use AsRef<str>, and then String, &String and &str all work.

pub fn avg_len<I, T>(iter: I) -> f32
where
    I: Iterator<Item = T>,
    T: AsRef<str>,
{
    let (total_chars, size) = iter.fold((0, 0), |c, b| (c.0 + b.as_ref().len(), c.1 + 1));
    total_chars as f32 / size as f32
}
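
To illustrate the "String, &String and &str all work" part, a small sketch follows. avg_len_str and as_ref_demo are hypothetical names chosen so the example stands alone, and the input strings are made up:

```rust
// Hypothetical name; same signature idea as the AsRef<str> version above.
pub fn avg_len_str<I, T>(iter: I) -> f32
where
    I: Iterator<Item = T>,
    T: AsRef<str>,
{
    let (total_len, count) = iter.fold((0usize, 0usize), |acc, item| {
        (acc.0 + item.as_ref().len(), acc.1 + 1)
    });
    total_len as f32 / count as f32
}

// Hypothetical helper calling the function with three different item types.
pub fn as_ref_demo() -> (f32, f32, f32) {
    let owned = vec!["ab".to_string(), "abcd".to_string()];

    // Item = &String
    let from_refs = avg_len_str(owned.iter());

    // Item = &str
    let from_strs = avg_len_str("a bb ccc".split(' '));

    // Item = String (consumes the vector, so it goes last)
    let from_owned = avg_len_str(owned.into_iter());

    (from_refs, from_strs, from_owned)
}
```

For these inputs the three averages should be 3.0, 2.0 and 3.0.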

#3

One thing about this: while the two bodies look the same, b.len() was behaving slightly differently in each because of automatic ref/deref in method dispatch. In your version where the items are &String, it’s calling String::len(b), but when the items are String, it’s calling String::len(&b). I don’t think generic code can automatically distinguish that.
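
Spelled out by hand, the two calls look roughly like the sketch below. The function names are invented for illustration; the generic one shows how the explicit as_ref() call replaces the automatic adjustment:

```rust
// Items are &String: `b.len()` resolves to String::len(b), no extra reference.
pub fn len_of_ref(b: &String) -> usize {
    String::len(b)
}

// Items are String: `b.len()` auto-refs, i.e. String::len(&b).
pub fn len_of_owned(b: String) -> usize {
    String::len(&b)
}

// Generic code has to make the conversion explicit instead,
// here through AsRef<str>, which ends up calling str::len.
pub fn len_generic<T: AsRef<str>>(b: T) -> usize {
    b.as_ref().len()
}
```

All three return the same number for the same text, which is why the result of avg_len does not change.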


#4

Thank you very much, your suggested signature is working fine!
I will keep that dispatch difference in mind for this kind of generic code. I think the result will be the same in both cases, but I’m now aware that different things are happening behind the scenes.


#5

I realized I didn’t really need to introduce that T. With where bounds, it can also look like this:

pub fn avg_len<I>(iter: I) -> f32
where
    I: Iterator,
    I::Item: AsRef<str>,
{ ... }

And with impl Trait in Rust 1.26, you can do it all implicitly:

pub fn avg_len(iter: impl Iterator<Item = impl AsRef<str>>) -> f32 { ... }
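
For completeness, here is a sketch of both signatures with the body filled in from the earlier AsRef<str> version. The names avg_len_where and avg_len_impl are only there so both can live in one file; this is one way to write it, not the only one:

```rust
// Where-bound form: the bound sits directly on I::Item.
pub fn avg_len_where<I>(iter: I) -> f32
where
    I: Iterator,
    I::Item: AsRef<str>,
{
    let (total_len, count) = iter.fold((0usize, 0usize), |acc, item| {
        (acc.0 + item.as_ref().len(), acc.1 + 1)
    });
    total_len as f32 / count as f32
}

// impl Trait form: no named type parameters at all.
pub fn avg_len_impl(iter: impl Iterator<Item = impl AsRef<str>>) -> f32 {
    let (total_len, count) = iter.fold((0usize, 0usize), |acc, item| {
        (acc.0 + item.as_ref().len(), acc.1 + 1)
    });
    total_len as f32 / count as f32
}
```

One practical difference: with impl Trait in argument position the caller can no longer name the type parameters explicitly (no turbofish), which rarely matters for a function like this.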