Reproduce iterator all and any

I'm facing a difficult issue and I would appreciate any help.
I'm designing a code generator from a language to Rust, and let's assume I need a design like the one below

What I want to achieve is a kind of "pool" from which we can quantify and return a boolean, but the quantifier all/any is given as a boolean. So for the same range, I am able to achieve this using my own quantifier method and the iter().all( in the test.

Since this trait is local, I'd love to take any iterable and be able to use my quantifier function on it. In this example, I'm trying hard to emulate that behavior, but the third assertion has two problems:

  • The type of the variables in the lambda has a double indirection (how to get rid of one indirection?)
  • I get a strange issue "temporary value dropped while borrowed" which makes no sense to me, given that the whole expression is returning a boolean and quantifier and all have the same self signature, except for the function to iterate.

My wish is to modify the design so that the last assertion compiles, and there is no double indirection.,

Link to the playground of the code below:

use std::{collections::HashSet, rc::Rc};

pub trait BoundedPool<T: 'static> {
    // Takes ownership of the function because otherwise it's hard to
    // generate code to create a function and borrow it immediately
    // without assigning it to a variable first
    fn quantifier(self: &mut Self, is_forall: bool, f: Rc<dyn Fn(&T) -> bool>) -> bool;
    fn forall(self: &mut Self, f: Rc<dyn Fn(&T) -> bool>) -> bool {
        self.quantifier(true, f)
    }
    fn exists(self: &mut Self, f: Rc<dyn Fn(&T) -> bool>) -> bool {
        self.quantifier(false, f)
    }
}

pub struct Range(pub HashSet<i32>);

impl Range {
    pub fn new(start: i32, end: i32) -> Self {
        let mut set = HashSet::new();
        for i in start..end {
            set.insert(i);
        }
        Self(set)
    }
    pub fn iter(&self) -> std::collections::hash_set::Iter<'_, i32> {
        self.0.iter()
    }
}

impl BoundedPool<i32> for Range {
    fn quantifier(self: &mut Self, is_forall: bool, f: Rc<dyn Fn(&i32) -> bool>) -> bool {
        for i in self.iter(){
            if !f(i) == is_forall {
                return !is_forall;
            }
        }
        is_forall
    }
}

impl <T:'static, W: 'static> BoundedPool<T> for W
    where W: Iterator<Item = T> {
    fn quantifier(mut self: &mut Self, is_forall: bool, f: Rc<dyn Fn(&T) -> bool>) -> bool {
        if is_forall {
            self.all(|x| f(&x))
        } else {
            self.any(|x| f(&x))
        }
    }
}

#[test]
fn test_difference() {
    assert!(Range::new(1, 7).quantifier(true, Rc::new(|i: &i32| *i < 8)));
    assert!(Range::new(1, 7).iter().all(Rc::new(|i: &i32| *i < 8).as_ref()));
    assert!(Range::new(1, 7).iter().quantifier(true, Rc::new(|i: &&i32| **i < 8)));
}

You are doing that to yourself: this is because you require the iterator and its item type to be 'static for some reason. That makes no sense.

The whole dance with dyn Fn smells, to be honest. There's no reason for dynamic dispatch; it's all very unusual-looking code.

Again, you are doing that explicitly, by forcing the predicate function to take a reference to the item type. Since HashSet::Iter already yields references, if you take a reference to those references, you quite obviously end up with double indirection.

You should just not force references at all, because:

  1. they are useless: the iterator is exhausted anyway, and the return value doesn't need the items to stick around (no wonder Iterator::{any, all} don't pass a reference to the callback, either); and
  2. the interface is generic, the T may stand for any type (so you can pass a reference if needed).

For the same reason, you should just get rid of the Rc<dyn Fn> and be generic over the callback function.

Finally, the code is overly complicated for no reason. It can be made much simpler, like this:

pub trait BoundedPool<T> {
    fn quantifier(&mut self, is_forall: bool, f: impl Fn(T) -> bool) -> bool;
    
    fn forall(&mut self, f: impl Fn(T) -> bool) -> bool {
        self.quantifier(true, f)
    }
    
    fn exists(&mut self, f: impl Fn(T) -> bool) -> bool {
        self.quantifier(false, f)
    }
}

pub struct Range(pub HashSet<i32>);

impl Range {
    pub fn new(start: i32, end: i32) -> Self {
        Self((start..end).collect())
    }
    
    pub fn iter(&self) -> std::collections::hash_set::Iter<'_, i32> {
        self.0.iter()
    }
}

impl BoundedPool<i32> for Range {
    fn quantifier(&mut self, is_forall: bool, f: impl Fn(i32) -> bool) -> bool {
        self.iter().copied().quantifier(is_forall, f)
    }
}

impl<I, T> BoundedPool<T> for I
where
    I: Iterator<Item = T>
{
    fn quantifier(&mut self, is_forall: bool, f: impl Fn(T) -> bool) -> bool {
        if is_forall {
            self.all(f)
        } else {
            self.any(f)
        }
    }
}

Oh, one more thing – it is very dubious why you are collecting into a HashSet just to iterate over it. You loose the ordering, you are storing a big consecutive chunks of numbers unnecessarily in memory, and you are slowing down iteration. Range does everything you need here, and more (e.g. O(1) containment test).

1 Like

Thanks for your response. I might not have emphasized it enough, but this is a simplified example of a bigger problem, and I want to assume it has to be solved that way. When you see something "unnecessarily complicated", for the sake of this question, just assume it's needed. It's for a code generator.anyway, no one is going to read that code.
Same for the fact I'm wrapping around a HashSet. If it were not for a more complex code generator, I would just emit a plain for loop with an integer range than doing all of this fuzz. The real structures are more complicated than the hashset presented here.
I'll look into your reply more deeply on Monday, thanks for the tip

TBF, that makes me very unmotivated to try to help.

2 Likes

Taken to the extreme this means we cannot change anything of the code that you posted, and thus the answer we can give is "no, what you want cannot be achieved, as you already have in front of you everything that you can possibly do with that code".

In practice there are probably some parts of that code that can be changed, but you're pretty much asking people to guess what those are! This is terrible, people won't know whether their solution will be acceptable to you (meaning it will be a time waste for them), they will have to work assuming they can change the minimum amount of code possible (meaning it will be harder to find a solution, which may still not be accepted!), and this will likely also result in a worse solution for you (as a better solution might exist which changes some part of the code that people assume you want to keep like that).

So do everyone a favour (you too!) and specify exactly what are the constraints of your problem.

8 Likes

Removing the 'static bounds without changing the rest allows the code to compile, whatever that's worth. Since nothing else changed, it doesn't get rid of the &&i32 parameter.

2 Likes

Hello everyone,
I want to start by apologizing if my initial post or follow-ups caused any confusion or frustration. I truly appreciate the time and effort you all put into reviewing and responding to my queries.

Thank you for the valuable feedback and insights. As suggested, I want to clarify a few points and hopefully make it easier to assist with the issues I'm encountering.

  1. The use of Rc<dyn Fn> is necessary because the functions in the generated code could be defined in various places, not directly within the same scope. Additionally, these function objects might need to be cloned, and since Rust functions themselves aren't cloneable, wrapping them in Rc allows this.

  2. The 'static bound is required to ensure that the types do not hold references, avoiding potential lifetime errors in more complex scenarios beyond this simplified example.

  3. I can definitely consider changing &T to T in the argument of the lambda/closure.

  4. I appreciated learning about the .copied() and .cloned() functions (thanks @paramagnetic). Though, @quinedot’s addition of '_ + next to the Fn keyword puzzles me, and I would love to understand the necessity and implications of this modification.

  5. With 'static, I'm still grappling with the error "creates a temporary value which is freed while still in use." Clarification on why this occurs even when the expression returns a boolean would be immensely helpful.

  6. Definition of Success: Here's a clearer definition of what I aim to achieve with the traits and test setup, under the form of a test that I would like to compile and run (and .iter() should return a built-in iterable that can be used in a for loop statement)

    #[test]
    fn test_difference() {
        assert!(Range::new(1, 7).quantifier(true, Rc::new(|i: &i32| *i < 8)));
        assert!(Range::new(1, 7).iter().quantifier(true, Rc::new(|i: &i32| *i < 8)));
    }
    

Despite the complexities, the constraints above are essential for the broader context of a code generator I am working on. I understand the concerns about potential inefficiencies, and I truly value your expertise and assistance in refining this approach.

Thank you all once again for your patience and help.

The relevant part of the error message is:

error[E0716]: temporary value dropped while borrowed
  --> src/lib.rs:57:13
   |
57 |     assert!(Range::new(1, 7).iter().quantifier(true, Rc::new(|i: &&i32| **i < 8)));
   |             ^^^^^^^^^^^^^^^^------------------------------------------------------ temporary value is freed at the end of this statement
   |             |
   |             creates a temporary value which is freed while still in use
   |             argument requires that borrow lasts for `'static`

Breaking down what happens in this line:

  • Range::new(1, 7) returns a new Range, which is stored in an unnamed position on the stack. Let's call it 𝒳.
  • .iter() takes a reference with lifetime '1 to 𝒳 and returns hash_set::Iter<'1, i32>. At this point, the exact duration of '1 is not known, just that the lifetime contained within Iter is the same as the one for the borrow of 𝒳.
  • .quantifier(...) is defined by the BoundedPool trait, and the relevant implementation contains the bound W: 'static.
  • There's only one choice of lifetime '1 that satisfies a :'static bound, and that is 'static itself. Therefore, '1'static.
  • Because '1 is also how long 𝒳 has been borrowed for, 𝒳 must remain alive and unmoved until the end of the program for this code to be valid.
  • But this doesn't happen because 𝒳 is a temporary local variable which is dropped at the end of the statement. Therefore, the compiler rejects this code and issues an error message instead.

My instincts say that the best course of action is to relax some of these 'static constraints and then fix the more complex scenarios. But if that isn't an option and you really do need all of these 'static bounds, you'll need to use HashSet::into_iter() instead of HashSet::iter() to create an iterator which satisfies 'static. This takes ownership of the set, though, so you'll need to clone() it first if you don't already own it:

    pub fn iter(&self) -> impl 'static + Iterator<Item=i32> {
        self.0.clone().into_iter()
    }

This is only necessary to create a 'static iterator; implementing BoundedPool for Range doesn't need this, so a more efficient reference-based approach can be used:

    fn quantifier(self: &mut Self, is_forall: bool, f: Rc<dyn Fn(&i32) -> bool>) -> bool {
        for i in self.0.iter() {
            if !f(i) == is_forall {
                return !is_forall;
            }
        }
        is_forall
    }
4 Likes

dyn Trait without any further context to imply the lifetime from is implicitly assumed to be dyn Trait + 'static (because there is simply no other concrete, nameable lifetime literal in the language). Explicitly adding the '_ lifetime introduces an (unnamed but) fresh lifetime parameter that allows the lifetime to be shorter than 'static.

That 'static bound has nothing to do with "returning a boolean". The lifetime is on the function type, which means that in the case of closures, it's the lifetime of the captures. Thus, if you want to support closures that capture arbitrary, short-living data, you have to get rid of the 'static bound; if you want the 'static bound, then you can't support arbitrary closures. You can't have both.

2 Likes

I'm not sure I actually understand what you're getting at, but Rust function items (and function pointers) implement Copy and Clone. Closures conditionally implement them (i.e. when possible).

They allowed the type-erased closures to be non-'static -- to capture references.

Your sample code doesn't require that, but it's more general. Hard to say which is more appropriate. If you're the only consumer of your trait, you can adjust as needed.

I can't think of a good reason to keep the 'static bound on the generics in your blanket implementation though (see next stanza).

It's due to these 'static bounds:

impl <T:'static, W: 'static> BoundedPool<T> for W

The only way for that to work with the current Range::iter would be if you could get a &'static Range in the call to Range::iter. (The only way to do that would be to leak the Range.)

"[Ensuring] the types do not hold references" is what's causing the borrow error.

I'm not sure why you thought the return type mattered.

4 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.