Idiomatic alternative to writing a function that returns an iterator

In doing the Advent of Code puzzles for 2022, I'm using a lot of iterator chains. That feels natural to me, coming from a Python background where generators are ubiquitous, and it seems like it's common practice in Rust too.

But in Python, I'd typically write functions that capture my generator logic, and re-use those functions. But that seems awkward in Rust. To give a concrete example,

fn main() {
    let evens_squared = (1..100).filter(|x| x % 2 == 0).map(|x| x * x);
    let v: Vec<_> = evens_squared.collect();
    dbg!(&v);
}

Ideally, I'd like to write that with evens_squared as a function taking the maximum value, so something like this:

fn evens_squared(n: usize) -> SomeType {
    (1..n).filter(|x| x % 2 == 0).map(|x| x * x)
}

But I don't know what to put as the return type. By checking the actual type of the return expression, I can see that it's Map<Filter<...>> but I don't want to put that - first, because it's annoyingly complicated, and second because it exposes implementation details of the function that I might want to change later. I assumed that the right way of doing this would be with some sort of generic return value, but I tried Box<dyn Iterator> and got bogged down in lifetime issues (because of the closures?)

I'm guessing that "idiomatic Rust" would handle this in a different way. What would be the natural approach for something like this?

fn evens_squared(n: usize) -> impl Iterator<Item = usize> {
    (1..n).filter(|x| x % 2 == 0).map(|x| x * x)
}

(Playground)

8 Likes

First, Box<dyn Iterator> is not a way to go generic - it's a way to go dynamic, that is, to move the dispatch from compile-time to runtime. It's a good thing sometimes (generics can be quite unwieldy if they're too complex, and compile time sometimes unexpectedly explode), but it's a thing one should understand.

Second, I'm curious what kind of issues did you have? With your example of evens_squared boxed iterator seems to work out of the box (pun unintended):

fn evens_squared(n: usize) -> Box<dyn Iterator<Item = usize>> {
    Box::new((1..n).filter(|x| x % 2 == 0).map(|x| x * x))
}

fn main() {
    let v: Vec<_> = evens_squared(100).collect();
    dbg!(&v);
}

Playground
Probably your real use-case is more complicated then that?

3 Likes

Bah. I missed out the Box::new call. Because (as you say) I wasn't thinking in terms of creating a dynamic return type, as that's not what I need here, just a way of naming the static type.

And as @jbe pointed out, impl Iterator<Item = usize> does exactly that. I don't know how I missed that, I think I got confused between Iterator and IntoIterator and tried the latter rather than the former. I was also trying to write functions that take general "iterable things" and had been experimenting with IntoIterator for that, and it all got a bit of a muddle :slightly_smiling_face:

Thanks for the extremely fast and informative responses. I'm glad the answer turned out to be straightforward, and I hope my dumb question didn't waste too much of anyone's time.

Note that you might also run into the common confusion with impl Trait - namely, the fact that impl Trait in argument position and in return position have different semantics. As an argument, it's a syntax sugar for generic parameter, i.e. it's caller-chosen. As a return type, like here, it's an "opaque type", which is function-chosen.

3 Likes

No problem at all. For me it's a nice practice to browse through the forum and look for problems and figure out the solution. It helps to memorize language constructs better.

As an exercise, I tried to see if I can name the type explicitly. Note that the following is not idiomatic, and it likely comes with some runtime overhead because the closures (which are in fact functions because they don't capture any environment) must be cast to function pointers (see also: What is the exact type of a function?):

use std::iter::{Filter, Map};
use std::ops::Range;

fn evens_squared(n: usize) ->
    Map<
        Filter<
            Range<usize>,
            for<'a> fn(&'a usize) -> bool
        >,
        fn(usize) -> usize
    >
{
    (1..n).filter((|x| x % 2 == 0) as fn(&usize) -> _).map((|x| x * x) as fn(_) -> _)
}

fn main() {
    for x in evens_squared(8) {
        println!("{x}");
    }
}

(Playground)

Type inference is a bit stubborn for the filter closure, I had to specify the argument type explicitly on the right side of "as". In particular, it doesn't work to write (|x: &usize| x % 2 == 0) as fn(_) -> _, as shown in the following:

-    (1..n).filter((|x| x % 2 == 0) as fn(&usize) -> _).map((|x| x * x) as fn(_) -> _)
+    (1..n).filter((|x: &usize| x % 2 == 0) as fn(_) -> _).map((|x| x * x) as fn(_) -> _)

(Playground)

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0599]: the method `map` exists for struct `Filter<std::ops::Range<usize>, fn(&usize) -> bool>`, but its trait bounds were not satisfied
  --> src/main.rs:13:59
   |
13 |     (1..n).filter((|x: &usize| x % 2 == 0) as fn(_) -> _).map((|x| x * x) as fn(_) -> _)
   |                                                           ^^^ method cannot be called on `Filter<std::ops::Range<usize>, fn(&usize) -> bool>` due to unsatisfied trait bounds
   |
   = note: the following trait bounds were not satisfied:
           `<fn(&usize) -> bool as FnOnce<(&usize,)>>::Output = bool`
           which is required by `Filter<std::ops::Range<usize>, fn(&usize) -> bool>: Iterator`
           `fn(&usize) -> bool: FnMut<(&usize,)>`
           which is required by `Filter<std::ops::Range<usize>, fn(&usize) -> bool>: Iterator`
           `Filter<std::ops::Range<usize>, fn(&usize) -> bool>: Iterator`
           which is required by `&mut Filter<std::ops::Range<usize>, fn(&usize) -> bool>: Iterator`

For more information about this error, try `rustc --explain E0599`.
error: could not compile `playground` due to previous error

What's wrong here? I don't really understand the problem here.

It inferred the cast to be a fn(&'local) -> bool instead of a for<'any> fn(&'any usize) -> bool, similar too the typical closure problem. I think your as cancelled out the usual calming effects of being explicit in the closure argument, because Rust took the RHS of the cast as the source of truth. E.g. this works:

(1..n).filter((|x| x % 2 == 0) as fn(&_) -> _).map((|x| x * x) as fn(_) -> _)

...or (moving away from casting considerations) this also works...

fn evens_squared(n: usize) -> FilterMap<Range<usize>, fn(usize) -> Option<usize>> {
    (1..n).filter_map(|x| (x % 2 == 0).then(|| x * x))
}

...but in case anyone reading doesn't know, using fn pointers incurs an indirection and may inhibit inlining optimizations, similar to Box<dyn Fn> (albeit without an allocation).

1 Like

I was about to say "just write -> _ and the error message will tell you!", but sadly that doesn't work for this case. (Often there's a help message with what to write.)

I opened an issue :slight_smile: Consider suggesting RPIT for `-> _` functions returning iterators · Issue #106096 · rust-lang/rust · GitHub

3 Likes

Shouldn't this then work on Nightly?

-    (1..n).filter((|x: &usize| x % 2 == 0) as fn(_) -> _).map((|x| x * x) as fn(_) -> _)
+    (1..n).filter((for<'a> |x: &'a usize| x % 2 == 0) as fn(_) -> _).map((|x| x * x) as fn(_) -> _)

(Playground)

But it doesn't. And not sure what the error means:

Errors:

   Compiling playground v0.0.1 (/playground)
error: implicit types in closure signatures are forbidden when `for<...>` is present
  --> src/main.rs:15:43
   |
15 |     (1..n).filter((for<'a> |x: &'a usize| x % 2 == 0) as fn(_) -> _).map((|x| x * x) as fn(_) -> _)
   |                    -------                ^
   |                    |
   |                    `for<...>` is here

error: could not compile `playground` due to previous error

I think it's asking you to also specify the return value type.

As specified in the RFC, the feature is as minimal as plausible and you can't elide the return type.

But, even with for<'a>, you still need this part it seems:

.filter((for<'a> |x: &'a usize| -> bool { x % 2 == 0 }) as fn(&_) -> _)
//                                                            ^

So really it's the same as your "what's wrong here" I orginally replied to; the interpretation of the cast is still overriding the LHS being higher-ranked, whether that's via a fully explicit for<'a> or just a |x: &_|.

1 Like

Perhaps just call .collect in you function and use a Vec as your return type. Call iter or into_iter on the Vec return value when you want to iterate over the result.

Update: @ekuber is amazing and this is already available in nightly! :tada:

error[E0121]: the placeholder `_` is not allowed within types on item signatures for return types
 --> src/lib.rs:1:31
  |
1 | fn evens_squared(n: usize) -> _ {
  |                               ^
  |                               |
  |                               not allowed in type signatures
  |                               help: replace with an appropriate return type: `impl Iterator<Item = usize>`

https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=c36b48aedbf2651c7b2cfc4fc42e1070

4 Likes

Isn't this a bug? I don't see why a coercion to a function pointer should remove the higher-rankedness again? I feel like maybe the inferrence mechanism to determine that a local life time is meant happens in the wrong place (and should only apply to the definition of the closure and not during the coercion).

(Sorry, I can't really explain what I mean, and to be honest, I barely understand what's happening there.)

I mean the inference that is explained here:

I understand the compiler needs to treat something like |x: &str| vec.push(x) as not higher-ranked, but why does the compiler need to apply such assumptions when I do a type coercion?

I don't think it's a bug per se. The point of a coercion is to change a type. It'd be nice if inference did what we want all the time, but apparently that's an undecidable problem. If rustc defaulted to the higher-ranked direction, all the posts in this forum about closures not ending up higher-ranked would probably be replaced by posts about closures not borrowing conservatively.

I think my point is: Why does coercion need to imply either higher-rankedness or a particular lifetime? Couldn't this be dependent on the original value being coerced? Afterall, I wrote "_" for the type of the function argument in as fn(_) -> _.

Or should the resulting type of a coercion never depend on the input type (i.e. left hand side of as)?

I didn't mean it had to be one or the other. If inference could be smarter about these cases without regressing a bunch of others or significantly slowing down compilation, I imagine everyone would be happy.

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.