Rewriting loops containing `?` as chains of iterators

The following function uses guinea-pig functions which fall into two categories:

  • risky(n) takes Success and returns `Result<Success, Failure>.
  • safe_{a,b,c} take Success and return Success

The function is implemented with a loop containing a number of steps made up of the above functions, and uses ? to ensure early exit as soon as Failure appears.

fn doit_loop<I>(i: I) -> Result<Vec<Success>, Failure>
where
    I: Iterator<Item = Success>,
{
    let mut v = Vec::new();
    for x in i {
        let r = risky(1)(x)?;
        let a = safe_a(r);
        let b = safe_b(a);
        let r = risky(2)(b)?;
        let c = safe_c(r);
        v.push(c);
    }
    Ok(v)
}

Can this loop be replaced by an iterator chain?

In other words, is it possible to write something in the spirit of this

i.map(risky(1))?
    .map(safe_a)
    .map(safe_b)
    .map(risky(2))?
    .map(safe_c)
    .collect::<Vec<_>>()

?

Obviously, it cannot work as it stands: this is merely meant to show the approximate shape of desired code; the details are obviously wrong.

Rust's (?, Result, Option) are related to Haskell's (traverse, Either, Maybe). As the latter two are monads, and traverse returns a superclass of Monad, the full power of Haskell's monadic composition utilities is available, which makes it easy to create early-failing chains of functions such as risky(n) and safe_*. Is it possible to write these kinds of chains in Rust without making the code significantly uglier than the loop version shown above?

playground and for comparison, a haskell version.

I don't have the answer but share the issue. I found in rust writing one way or another is more about functionality than taste. If you are chaining iterators you need to forward the error in the end and finally call ?

Remember that you can convert a Vec of Result into a result of Vec, could be useful:

let l1: Vec<Result<(),()>> = vec![];
let l2: Result<Vec<()>, ()> = l1.into_iter().collect();

You can use fallible-iterator:

use fallible_iterator::FallibleIterator;

fn doit<I>(i: I) -> Result<Vec<Success>, Failure>
where
    I: Iterator<Item = Success>,
{
    let iter = fallible_iterator::convert(i.map(Ok));

    iter.map(risky(1))
        .map(|x| Ok(safe_a(x)))
        .map(|x| Ok(safe_b(x)))
        .map(risky(2))
        .map(|x| Ok(safe_c(x)))
        .collect()
}

Playground

5 Likes

If you wanted conditional early break for iterator, Iterator::scan() is your friend.

fn parse_iter<'a, I>(iter: I) -> Result<Vec<i32>, Box<dyn Error>>
where
    I: IntoIterator<Item = &'a str>,
{
    iter.into_iter()
        .map(|elem| -> Result<i32, Box<dyn Error>> {
            let v = elem.parse::<i32>()?;
            Ok(v)
        })
        .scan(false, |prev_was_error, elem| {
            if *prev_was_error {
                // If the previous element is an error,
                // emit no more elements.
                return None;
            }
            *prev_was_error = elem.is_err();
            Some(elem)
        })
        .collect::<Result<Vec<i32>, _>>()
}
  • .map(): Put your loop body here. Return Result<_, _> from the closure for each element.
  • .scan(): This does two job: Modifying elements using the state from previous iteration (the first parameter of the closure), and control when to terminate iteration (by returning None).
1 Like

This looks pretty good, thanks.

Inside the arguments to map when calling the safe functions, the signal is now obscured by some boilerplate noise:

|x| Ok(SIGNAL(x))

that's 10 characters of non-whitespace noise per call.

Any suggestions on how to eliminate this noise?

  • A macro seems problematic at this point
  • An iterator adapter (equivalent to the Haskell one-liner mymap f = map (\x -> Ok (f k)) would do the job, but in Rust this requires a new type and its implementation for Iterator which seems like a lot of work for so little gain.

Am I missing some alternatives?

Or is it simply not worth the hassle in Rust?

fn chain_ok<T, Ok, Err, F : FnMut(T) -> Ok> (mut f: F)
  -> impl FnMut(T) -> Result<Ok, Err>
{
    move |it| Ok(f(it))
}

and then you can do chain_ok(signal)

IIUC, this allows me to replace .map(|x| Ok(signal(x))) with .map(chain_ok(signal)).

I was trying to get .okmap(signal).

Edit: And the Haskell mymap does the job of the hypothetical Rust .okmap.

It's a valid point. Ideally, the FallibleIterator trait should have provided two separate methods for fallible and infallible mappers (e.g. map and map_ok, akin to TryStreamExt::map_ok from the async world).

A free-function version of the missing adapter can be done like this:

fn map_ok<I, F, B>(iter: I, mut f: F) 
    -> impl FallibleIterator<Item = B, Error = I::Error>
where
    I: FallibleIterator,
    F: FnMut(I::Item) -> B,
{
    iter.map(move |x| Ok(f(x)))
}

And then it could be used like this:

fn doit_loop<I>(i: I) -> Result<Vec<Success>, Failure>
where
    I: Iterator<Item = Success>,
{
    let iter = fallible_iterator::convert(i.map(Ok));
    let i = iter.map(risky(1));
    let i = map_ok(i, safe_a);
    let i = map_ok(i, safe_b);
    let i = i.map(risky(2));
    let i = map_ok(i, safe_c);
    i.collect()
}

But it's clearly underwhelming because you can't chain the calls. Usually you would be able to easily add an extra method using an extension trait, but in this case the return type is generic, so it's harder than it should be because you would need to create a struct similar to fallible_iterator::Map that does what you want. Basically you would need to copy and tweak the fallible_iterator::Map and its trait implementations.

@jacg Here's a slightly uglier version using an extension crate that does allow you to chain the calls, while also getting rid of the boilerplate.

1 Like

My bad, I misread the haskell one-liner.

There are two things going on:

  1. a custom iterator adaptor as a function:

    fn ok_map<T, Ok, Err> (
        mut f: impl FnMut(T) -> Ok,
        iter: impl Iterator<Item = T>,
    ) -> impl Iterator<Item = Result<Ok, Err>>
    {
        iter.map(|it| Ok(f(it)))
    }
    
  2. Being able to use that as a method, for easier chaining: .ok_map()

    That requires an extension method:

    trait OkMap<'lt, T, Ok, Err> : 'lt {
        fn ok_map (self, f: impl 'lt + FnMut(T) -> Ok)
          -> Box<dyn 'lt + Iterator<Item = Result<Ok, Err>>>
        ;
    }
    impl<'lt, T, Ok, Err, I : 'lt + Iterator<Item = T>>
        OkMap<'lt, T, Ok, Err> for I
    {
        fn ok_map (self, mut f: impl 'lt + FnMut(T) -> Ok)
          -> Box<dyn 'lt + Iterator<Item = Result<Ok, Err>>>
        {
            Box::new(self.map(move |it| Ok(f(it))))
        }
    }
    

    or, if you wish to write many of these ad-hoc helper methods, a macro can come in handy:

    extension_method! {
        fn ok_map<'lt, T, Ok, Err> (
            self,
            f: impl 'lt + FnMut(T) -> Ok,
        ) -> Box<dyn 'lt + Iterator<Item = Result<Ok, Err>>>
        where {
            Self : Sized,
            Self : 'lt + Iterator<Item = T>,
        }
        {
            let mut f = f; // a limitation of my basic demo macro
            Box::new(self.map(move |it| Ok(f(it))))
        }
    }
    
1 Like

Web search turns up nothing for an extension_method macro. Is it documented?

I manually implemented a naΓ―ve version of it in the linked playground:

#[macro_export]
macro_rules! extension_method {(
    $( #[$attr:meta] )*
    $pub:vis
    fn $fname:ident
        $(< $( $($gl:lifetime),+ $(,)? )? $($T:ident),* $(,)? >)?
    (
        $(
            &
            $($lifetime:lifetime)?
            $(
                mut $(@$mut:tt)? // <- sadly this one doesn't work, so to get it working it would require a copy-paste of the rule (with, and without `mut`)
            )?
        )?
        $self:ident
        $(, $arg_name:ident : $ArgTy: ty)*
        $(,)?
    ) $(-> $RetTy:ty)?
    $(
        where { $($where_clauses:tt)* }
    )?
    $body:block
) => (
    #[allow(nonstandard_style)]
    $pub
    trait $fname $(< $($($gl ,)+)? $($T ,)* >)?
    $(
        where
            $($where_clauses)*
    )?
    {
        fn $fname (
            $(
                & $($lifetime)?
                $(mut $($mut)?)?
            )?
            $self,
            $( $arg_name: $ArgTy ,)*
        ) $(-> $RetTy)?
        $body
    }
    
    impl<$($($($gl ,)+)? $($T ,)*)? __Self : ?Sized>
        $fname
        $(< $($($gl ,)+)? $($T ,)* >)?
    for
        __Self
    $(
        where
            $($where_clauses)*
    )?
    {}
)}

Thank you all for your suggestions.

I have a few questions. I hope that sticking them all in one post doesn't make it too heavy.

1.

Both @H2CO3's

(where I assume "crate" is a typo for "trait")

and @Yandros'

variations on the theme of an extension method, use Box in the return type.

Why is this? Is it needed to get sufficient polymorphism?

2.

FnMut appears in the return type. Is this because the closure needs to be called more than once, and beyond that we don't (in general) care whether it captures mutably or immutably, so we just pick the least restrictive option that works?

Should I be able to do some a priori reasoning to select between Fn, FnMut and FnOnce or just go down the list until the compiler is happy?

3.

Similar question about lifetimes: could someone please walk me through the reasoning about why the lifetimes appear where they do? I think I get it, but I'm not sure I'd be able to synthesize it myself, or that my reasoning is the clearest or completely correct.

I'd like to be able to do better than throw gunk at the compiler until something sticks.

4.

I'm stumped by @Yandros' use of copied in the main in the playground. More generally, I don't see how this solution integrates smoothly with the desire to write something like

    fallible_iterator::convert(i.map(Ok))
        .map(risky(1))
        .ok_map(safe_a)
        .ok_map(safe_b)
        .map(risky(2))
        .ok_map(safe_c)
        .collect()

5.

I wrote some criterion benchmarks for comparing

  1. a:loop: the original loop version
  2. b:fallible: the fallible_iterator version with its noisy map
  3. c:sub: @H2CO3's extension
  4. d:free: @Yandros' extension with return type modified to match 3.

The benchmarks use the chain shown just above in question 4 (adapted, as appropriate, to each implementation), with three different inputs:

  1. -1..10 which fails on risky(1)
  2. 1..10 which fails on risky(2)
  3. 10..20 which completes sucessfully.

The results look approximately like this

This suggests that

  • The loop is faster than the iterators in the cases that fail (the two violins at top-left (85 & 105 ns), as compared to the column of violins around 130ns. Seems reasonable: The iterator needs to be converted to a fallible one, and then we terminate almost immediately. The loop hits the failure without wasting time on converting iterator. But I was kinda hoping that Rust might compile this away.

  • When the process completes successfully, the plain fallible iterator (200ns) is faster than the loop (230ns), but the extended versions (250ns) are slower?

    • I'ts pleasing to see that the iterator is no slower than the loop, but it's surprising that it's faster. Why might this be?
    • What is slowing down the extended versions? Could it be indirection via the Box?

(I know, I know, I should profile. I've not had to profile (certainly at this low level) for a while, so it will take me a while to get up to speed on the available toolchains.)

6.

Criterion frequently reports speedups or slowdowns approaching 20% on subsequent runs of the same functions. For example, here are the changes for the above 12 combinations, on my last run:

change: [+3.1392% +3.6718% +4.2185%] (p = 0.00 < 0.05)
change: [-8.9813% -7.7446% -6.6396%] (p = 0.00 < 0.05)
change: [+2.2231% +2.9847% +3.9550%] (p = 0.00 < 0.05)
change: [+18.578% +19.488% +20.453%] (p = 0.00 < 0.05)
change: [+1.8321% +2.4816% +3.0972%] (p = 0.00 < 0.05)
change: [+20.519% +21.060% +21.606%] (p = 0.00 < 0.05)
change: [-1.9653% -0.7159% +0.4839%] (p = 0.26 > 0.05)
change: [+16.121% +16.767% +17.352%] (p = 0.00 < 0.05)
change: [-4.4787% -3.7702% -2.9555%] (p = 0.00 < 0.05)
change: [-0.4449% +0.2129% +0.8429%] (p = 0.54 > 0.05)
change: [-13.690% -13.237% -12.731%] (p = 0.00 < 0.05)
change: [-0.1720% +0.4150% +1.0309%] (p = 0.20 > 0.05)

NB. the implementations have not been modified since the previous run, so anything other than 0% change is spurious statistical fluctuation.

I'm not going out of my way to shut down all other programs and services on my machine for the purpose of running these benchmarks, but, even so, given how much sampling criterion does, I find these fluctuations to be absurdly high: criterion assigns a p-value of 0.00 for regression or improvement in 9 out of 12 implementations which HAVE NOT CHANGED AT ALL!

What am I missing?

[The conclusions in question 5 are not affected by these fluctuations.]

Apologies

Sorry about this colossus of a post. There are so many things that emerged from this, that I'm not sure how best to deal with them.

No, it's because you can't use impl Trait in the return type of a trait method. (I don't know why that restriction exist, I'd definitely like to be able to use impl Trait in trait methods, but I'm not sure if it's a technical limitation and whether it can be overcome.) So, ideally, that ugly return type should just be another -> impl Iterator<Item=…>, but alas, it currently can't.

Correct. :sweat_smile:

Correct.

Just pick the least restrictive option your implementation can support. So try FnOnce, FnMut, and Fn in this order.

The explicit lifetimes in my implementation are there because in Box<dyn Trait>, the default lifetime is Box<dyn Trait + 'static>. I didn't want to restrict the callback to 'static functions/closures, hence the explicit lifetimes.

.copied() is used there because [T; N].iter() is Iterator<Item=&T>, and for the sake of example, it was easier to get rid of the level of indirection, since the array elements are Copy.

TBH, I wouldn't consider 200 vs 230 vs 250 ns statistically significant in such a microbenchmark. Although you should be able to look at the generated assembly (e.g. https://rust.godbolt.org/) and see if there's any difference.

I'd say that you're missing the fact that a P-value is not a useful measure of effect size. :slight_smile: And/or these P-values are approximations, and might not be very good approximations.

And/or that there can be legitimate fluctuation in the execution of the very same code, depending on the mood of the CPU – as Jim Keller, the prominent chip designer once said: some given assembly is never executed in the same way twice.

1 Like

@H2CO3 has addressed most of your questions :ok_hand: , but I'll nevertheless try to further clarify a few points :slightly_smiling_face:

  • The first remark is that Rust makes these kind of things explicit, which is great. We should always keep in mind that in functional languages such as Haskell, things are heap-allocated almost systematically, and in practice that's ok, so that's also the case for Rust :slightly_smiling_face:

    • Their return type would be something like Rc<dyn 'static + Fn(...) -> _>, or Arc<dyn 'static + Send + Sync + Fn(...) -> _>.
  • That being said, yes, it irks a bit to suddenly not be able to return our closure directly.

    The reason for that is that, type-level-wise, we need to be strict about such existential type:

    Ok, the return type of that function for each impl block must meet that trait bound (in this case, be callable yadda yadda). But does that mean that all the impl blocks need to use the same return type? Or can each impl block feature an independent opaque return type, provided the trait bound is met?

    In other words, the order of quantifiers is ambiguous:

    1. We have a property that must be met for all implementors of our (extension) trait:

      βˆ€ Implementor : Trait

    2. Within an impl block, there exists some return type that meets the required trait bound:

      βˆƒ Ret : FnMut...

    3. So the question is:

      • can Ret depend on Implementor?

      • i.e., is it βˆ€ Implementor βˆƒ Ret or βˆƒ Ret βˆ€ Implementor ?

    And the thing is, that "just" writing -> impl ... is ambiguous in that regard. And depending on the people / use cases, the requirements may differ: some people may need independent impls (R "after" / depends on T), others, such as ourselves in this instance, can get away with the same type (R "before" T), which is always pleasant. So there was no "good" choice to make there, and Rust has made the cautious call not to favor any of those two scenarios (imho, they've done the right thing).

    In a likely future, we will finally get the language support to express this kind of existential types in a more fine-grained manner (allowing us to lift this requirement):

    #![feature(type_alias_impl_trait)]
    
    trait AddCurryPlease {
        type Ret : Fn(i32) -> i32; // each implementor gets to provide their own
    
        fn add (x: i32) -> Self::Ret;
    }
    
    enum Foo {}
    impl AddCurryPlease for Foo {
        // _one_ existential type for the `Foo` implementor
        type Ret = impl Fn(i32) -> i32;    // <--+
                                           //    |
        fn add (x: i32) -> Self::Ret       //    | This instance
        {                                  //    | sets the
            move |y: i32| -> i32 { x + y } // ---+ actual type
        }
    }
    

    vs.

    #![feature(type_alias_impl_trait)]
    
    /// Define a **single** existential type.
    #[allow(nonstandard_style)]
    type impl_Fn_i32_i32 = impl Fn(i32) -> i32;
    
    /// Provide its defining instance:
    fn add (x: i32) -> impl_Fn_i32_i32 // <-+
    {                       //              |
        move |y: i32| x + y // -------------+
    }
    
    pub
    trait AddCurryPlease {
        fn add (x: i32) -> impl_Fn_i32_i32
        {
            add(x)
        }
    }
    
    pub
    enum Foo {}
    impl AddCurryPlease for Foo {}
    

    As you can see, it requires quite a verbose incantation, but at least we get to express the right semantics without any ambiguity whatsoever :upside_down_face:

Being callable multiple times concurrently (Fn) implies being callable multiple times non-concurrently / exclusively (FnMut), which in turn implies being callable once (FnOnce):

  • Fn β‡’ FnMut β‡’ FnOnce, also written as Fn : FnMut : FnOnce.

So Fn is both the most powerful trait bound, and the most constraining one.

  • When returning stuff, you want to return a maximally usable thing, so you should strive to return Fn if possible, and fallback to FnMut() if not, and so on.

  • When taking stuff, you want to take a minimally constrained thing, so you should strive to take an FnOnce if possible, and fallback to a FnMut() if not, and so on.

In our case, the closure was in input position, hence FnOnce() was the priority, but of course we need for it to be callable multiple times, hence falling back to an FnMut().

  • Tip: requiring Fn() alone in input position is uncommon, since requiring it means we are dealing with concurrent / shared accesses, and yet we are not requiring the extra traits that also enable the parallelism version of concurrency (e.g., Sync). That is, it would be required for single-threaded concurrency, such as re-entrancy, which is not that common.

TBD :nerd_face:

4 Likes

Crystal clear, thanks!

So, with #![feature(type_alias_impl_trait)] we have

type T =      Constraint; // Each impl can satisfy Constraint in its own way
type T = impl Constraint; // All impls must agree on single T

? [Edit: Nope! I seem to have missed the point in the syntax comparison.]

Is there some reasoning behind why it's this way around and not the other? Ah, OK, I think I've figured it out, below ... hmm, maybe not ...

That is to say, even though the closures are identical, the compiler generates a different type to implement the closure, each time you write one out somewhere in the source code?

Summarizing: a return type of impl FnWhatever in a trait would require all implementors to agree on a specific type that meets that constraint, which is impossible given the multitude of closure types. Consequently,

  • my suggestion in my parent question that Box was needed for "sufficient polymorphism" was essentially correct,
  • the answer to my question about the rationale of "which way around", just above, is: because impl Constraint is what we already have, so the new behaviour goes with the new syntax.

Hmm ...but this interpretation doesn't seem to agree with your statement that

... which implies that my second bullet point in the summary is wrong. So, today, how can we write that all impls must agree on a single type?

Aside: I appreciate that the type of each and every closure is different, even if they have the same signature, but how about functions? Do all functions with the same signature share the same type? I guess they should. [Edit: OK, I think that your use of the free add function in the single-implementation-type example, answers this in the affirmative.]

Let me have another stab: if the type alias is inside a trait, each implementor can choose its own version; if the alias is outside, then it's a single alias and everyone must agree.

If so, why does impl appear in the free version ... and how is it different from a plain type alias?

Clearly I'm missing something.

1 Like

With the impl syntax, the compiler ensures that users only access trait methods, and not inherent ones, for that type. This frees library authors to change their implementation to use a different type without breaking user code, as long as the new type implements all of the traits that were declared in the first place.

2 Likes

Read up about internal vs external iteration. I think this is the canonical post:

TL/DR: Sometimes iterator methods can take advantage of the structure of what the iterator is doing to be easier on LLVM that the corresponding for loop over the same thing, and sometimes that means LLVM makes the code better than it could for the for loop.

I'll add one thing to this: I think it should be common when you're not promising the order of the calls.

FnMut is implicit encouragement to use a mutable closure, and thus an implicit statement that doing something like "append the argument to a container every time it's called" is a reasonable thing to do. And for things with an obvious-and-predictable order in which it's called, that's fine. (See Vec::retain, for example.) But I wouldn't recommend FnMut for something like a predicate used for a binary search, because the exact order or number of elements that are looked at really isn't something that should be relied upon. (People can always use internal mutability in their Fn if they really want, but hopefully that will make it obvious to them that if it breaks they shouldn't be surprised.)

I guess another way to say that is that sometimes your use of the closure is conceptually concurrent, even if it isn't technically concurrent. In the same vein as the classic

My intuition is that code far away from my code might as well be in another thread , for all I can reason about what it will do to shared mutable state.

(quoted in https://manishearth.github.io/blog/2015/05/17/the-problem-with-shared-mutability/#its-effectively-threaded)

2 Likes

I've never seen this stated before, and I have to say, I don't agree with it.

But it can also make sense even if order is not guaranteed. There are iterators for which order isn't even really a meaningful concept because they're unordered by nature (e.g. HashMap), yet you might want to accumulate state by traversing them. For instance, you might want to write the elements to a stream. Or you may want to fill up some preallocated space efficiently from several iterators (in which case collect() wouldn't work). Or you might want to update a GUI step-by-step.

In these cases, requiring internal mutability only makes the life of programmers unnecessarily painful with no real benefits. Because, as it seems to me, it's actually more dangerous to encourage internal mutability than to accept an FnMut in the first place. Especially if this leads the programmer to reach for RefCell, and if s/he subsequently changes the underlying iterator to one that really shouldn't accept an FnMut, then this situation will result in a runtime panic instead of a compiler error.

1 Like