Why do these behave so differently? Is this unsound?

Given the following 2 examples and supporting code:

/// A predicate.
pub type Predicate = dyn (Fn(
    &mut dyn erased_serde::Deserializer<>
) -> bool) + Send + Sync;

/// Helper to build predicates because closure inference is the worst.
///
/// # Examples
///
/// This doesn't work:
///
/// ```rust compile_fail
/// use serde::Deserialize;
/// use datafu::Predicate;
///
/// let x = Box::new(|v| String::deserialize(v).is_ok()) as Box<Predicate>;
/// ```
///
/// But this does:
///
/// ```rust
/// use serde::Deserialize;
///
/// let x = datafu::pred(|v| String::deserialize(v).is_ok());
/// ```
pub fn pred<F>(f: F) -> Box<Predicate>
where
    F: (Fn(
        &mut dyn erased_serde::Deserializer<>
    ) -> bool) +  Send + Sync + 'static,
{
    Box::new(f)
}

Why are these so different?

How does it fail?


Runnable playground version gives:

error: implementation of `FnOnce` is not general enough
  --> src/lib.rs:49:13
   |
49 |     let x = Box::new(|v| String::deserialize(v).is_ok()) as Box<Predicate>;
   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ implementation of `FnOnce` is not general enough
   |
   = note: closure with signature `fn(&'3 mut dyn erased_serde::Deserializer<'_>) -> bool` must implement `FnOnce<(&'1 mut (dyn erased_serde::Deserializer<'_> + '1),)>`, for any two lifetimes `'1` and `'2`...
   = note: ...but it actually implements `FnOnce<(&'3 mut dyn erased_serde::Deserializer<'_>,)>`, for some specific lifetime `'3`

Note that type annotating x works too (fn quz in the playground). That gives an explicit target type similar to a function parameter. My take is that when you ask the compiler to cast from type ?Unspecified to type Box<Predicate>, it first infers its own type for ?Unspecified in the usual way. That is, the operand of an explicit cast isn't influenced by the target type the way expressions at coercion sites, like annotated let statements and function arguments, are.

That's my guess anyway.
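A minimal sketch of that difference, using a hypothetical std-only alias `BoxedStrPredicate` in place of the `erased_serde`-based `Predicate` (the alias and the function name are assumptions made for illustration, not the OP's code). With the annotated `let`, the expected type reaches the closure before its body is checked:

```rust
/// Hypothetical stand-in for `Box<Predicate>`. The elided lifetime in
/// `&str` makes the inner bound `for<'a> Fn(&'a str) -> bool`.
type BoxedStrPredicate = Box<dyn Fn(&str) -> bool + Send + Sync>;

/// The `let` annotation is a coercion site, so the closure is checked
/// against the higher-ranked signature and `v` is known to be a `&str`.
fn annotated() -> BoxedStrPredicate {
    let x: BoxedStrPredicate = Box::new(|v| v.is_empty());
    x
}

// The cast form from the question checks the closure without that
// expected signature. It is left as a comment because it is expected
// not to compile:
//
//     let x = Box::new(|v| v.is_empty()) as BoxedStrPredicate;
```

The returned box can then be called with borrows of any lifetime, for example a borrow of a local `String` and a `'static` literal.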

There's no unsafe here so there's no unsoundness.


so basically we'd want type ascription?

the "real" code (test code) that gave us trouble was:

let preds = vec![("dict", datafu::pred(|v| { todo!(); false }))].into_iter().collect();

I couldn't get ascription to influence through the IntoIter indirection, but this works:

    let preds: [(&'static str, Box<Predicate>); 1] = [ // or Vec<...>
        ("dict", Box::new(|v| { todo!(); false }))
    ];
    let preds: HashMap<_, _> = preds.into_iter().collect();

The "type ascription everywhere" RFC (perhaps in combination with #78248) is probably what it wishes existed, but as far as I know that feature is stuck waiting for a champion.


yes, we wish we had that. tho it'd be nice if the syntax were :<Type> instead of : Type.

(unrelated: we kinda/really wish ppl referred to us as "it" instead of "you" even here on urlo...)


Explanation of the error

Before a |v| … kind of closure gets to become a Box<dyn Fn…>, it first has to be an impl Fn. And there is some nuance regarding which kind of Fn we have.

Given some inferred / arbitrary-but-fixed lifetime 'inferred, it's important to distinguish:

impl Fn(&'inferred Thing)

from

impl Fn(&Thing)
// =
impl Fn(&'_ Thing)
// =
impl for<'any> Fn(&'any Thing)

That is, the former (impl Fn(&'inferred …)) features a single "non-generic" trait bound, whereas the latter examples feature an infinity of trait bounds (≈ a trait bound which is, itself, generic).

For what it's worth, the latter encompasses the former:

  • T : for<'any> Fn(&'any Thing)
    

    is the same as:

    for<'any>
        T : Fn(&'any Thing)
    ,
    

    is kind of the same as:

    T : Fn(&'static Thing),
    …
    T : Fn(&'inferred Thing), // 👈
    …
    // and so on for all possible lifetimes
    

And the inverse is not true: it is not sufficient to be able to handle some arbitrary but fixed lifetime 'inferred to be able to handle all of them.

  • Demo:

    // this generates an `impl Fn(&'inferred str)` (see remainder of the post to know why)
    let f = |s| {
        let _: &str = s;
    };
    f(&String::from("…"));
    f(&String::from("…")); // Error, lifetime does not live long enough!
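To make the for<'any> reading concrete, here is a small sketch with hypothetical names (`count_nonempty`, `hrtb_demo`): the bound lets the callee pick a fresh, arbitrarily short lifetime at every call, which a closure fixed to one 'inferred lifetime could not satisfy.

```rust
/// Requires a callable that works for *every* lifetime `'any`,
/// not just for one lifetime chosen at the call site.
fn count_nonempty<F>(is_empty: F, words: &[&str]) -> usize
where
    F: for<'any> Fn(&'any str) -> bool,
{
    let mut count = 0;
    for word in words {
        // A fresh `String` per iteration, dropped at the end of the loop
        // body: each call to `is_empty` sees a different, short lifetime.
        let owned = String::from(*word);
        if !is_empty(&owned) {
            count += 1;
        }
    }
    count
}

fn hrtb_demo() -> usize {
    // The closure is written inline as an argument, so it picks up the
    // higher-order signature from the bound on `count_nonempty`.
    count_nonempty(|s| s.is_empty(), &["a", "", "bc"])
}
```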
    

So, depending on certain things (which I'll try to detail below),

  • a closure can have this simpler and more limited signature involving an 'inferred-and-fixed lifetime[1],

  • or, in some fortunate cases, the closure can end up having the richer / more flexible for<'any>/HRTB/universal/higher-order lifetime[2] signatures.

    I personally call this situation / phenomenon imbuing the closure with a properly higher-order signature.

In the OP's case, the |v| … closure in the Box::new(|v| …) as Box<dyn Fn…> expression did not manage to hit a "fortunate case", so it ended up falling back to a signature with 'inferred-but-fixed / early-bound lifetimes, which did not meet / match the requirements of the OP's Predicate alias:

type Predicate = dyn Fn(&'_ mut Deserializer<'_>) -> bool;
// i.e.
type Predicate = dyn for<'a, 'b> Fn(&'a mut Deserializer<'b>) -> bool;

hence the error.

When using the pred syntax, however, the OP did hit a "fortunate" case where the closure was imbued with a properly higher-order signature, thence managing to meet / match the higher-order trait bounds of the Predicate dyn trait.
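The same shape can be sketched without `erased_serde`. Everything below is a hypothetical stand-in (a `&mut String` argument instead of `&mut dyn Deserializer`, and the names `BoxedPredicate` and `pred_demo`), meant only to show where the closure gets its signature from:

```rust
/// Hypothetical std-only stand-in for `Box<Predicate>`.
type BoxedPredicate = Box<dyn Fn(&mut String) -> bool + Send + Sync>;

/// Same shape as the OP's `pred`: the `Fn(&mut String)` bound on `F`
/// is what hands the closure a higher-order signature.
fn pred<F>(f: F) -> BoxedPredicate
where
    F: Fn(&mut String) -> bool + Send + Sync + 'static,
{
    Box::new(f)
}

fn pred_demo() -> (bool, bool) {
    // `v` is never annotated; its type comes from the bound on `pred`.
    let p = pred(|v| {
        v.push('!');
        v.len() > 3
    });
    // Two locals with unrelated borrows, both accepted by the same box.
    let mut short = String::from("a");
    let mut long = String::from("abc");
    (p(&mut short), p(&mut long))
}
```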


I guess at this point the question then is: when do we get "higher-order signature promotion" and when don't we? So that we can apply it to dodge this kind of error more systematically.

Higher-order promotion of closures: when and when not

First of all, what follows will apply to literal closure expressions: |…| ….

  • That is, in Rust, technically, a closure is (an instance of) a type which implements some FnOnce trait.

    We thus have:

    1. functions (or rather, function items), as in fn fname<…>(…), wherein fname[3] is an instance of a type which implements FnOnce / is a closure (with no captured environment);
    2. dynamic types implementing the Fn… traits: fn… pointers and dyn Fn… trait objects.
    3. literal closure expressions: |…| …
    4. [nightly-only] custom types for which we have hand-rolled implementations of the Fn… traits.

    Regarding whether the impl FnOnce is "generic" / involves late-bound / higher-order / for<…>-hrtb-quantified / universal lifetimes:

    1. functions with generic lifetime parameters will "often"[4] have their lifetimes be late-bound / higher-order;

    2. the signature of a fn… pointer / dyn Fn… trait object involves higher-order lifetimes if for<'lifetime> appears inside it, or if it features elided lifetimes ("lifetime elision rules for functions" apply, here).

    3. for literal closures, see what follows;

    4. a manual / hand-rolled implementation of FnOnce will be "higher-order" over the lifetime parameters occurring in Args which neither occur in Self / the implementor, nor are otherwise connected to Self through where clauses.

      To illustrate:
      /// This is a higher-order / "late-bound" signature:
      impl<'a> FnOnce<(&'a str, )> for MyType { … }
      //                ^^               ^
      //              does not appear in /

      /// This is *not* a higher-order / "late-bound" signature:
      impl<'a> FnOnce<(&'a str, )> for MyOtherType<'a> { … }

      /// Nor is this one:
      impl<'a, 'b> FnOnce<(&'a str, )> for Foo<'b>
      where
          'a : 'b,
      { … }
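Since hand-written Fn… implementations are nightly-only, here is a sketch of the same distinction on stable, using a hypothetical trait `Call<Arg>` as a stand-in for `FnOnce<(Arg,)>`. All type and function names are made up for the example.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for `FnOnce<(Arg,)>`, usable on stable.
trait Call<Arg> {
    fn call(&self, arg: Arg) -> usize;
}

/// `'a` appears only in the argument: one impl covers every lifetime.
struct AnyLifetime;
impl<'a> Call<&'a str> for AnyLifetime {
    fn call(&self, arg: &'a str) -> usize {
        arg.len()
    }
}

/// `'a` also appears in `Self`: a given `Pinned<'a>` only takes `&'a str`.
struct Pinned<'a>(PhantomData<&'a str>);
impl<'a> Call<&'a str> for Pinned<'a> {
    fn call(&self, arg: &'a str) -> usize {
        arg.len()
    }
}

/// Accepts only implementors that are higher-order over the lifetime.
fn needs_higher_order<C>(c: C) -> usize
where
    C: for<'any> Call<&'any str>,
{
    let local = String::from("local");
    c.call(&local)
}

fn impl_demo() -> usize {
    // `needs_higher_order(Pinned(PhantomData))` is expected to be
    // rejected, since no single `Pinned<'a>` covers all lifetimes.
    let pinned: Pinned<'static> = Pinned(PhantomData);
    needs_higher_order(AnyLifetime) + pinned.call("static")
}
```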
      

So, a |x| … is a literal closure expression; the compiler will generate its own dedicated / ad-hoc type definition (with the captured environment as its fields), and automagically implement the Fn… traits on it, based on certain heuristics.

And in our case, as we saw with the "hand-rolled FnOnce implementations" case, it's a matter of considering whether:

|s| { let _: &str = s; }

for instance, generates:

struct Closure;
impl<'any> FnOnce<(&'any str, )> for Closure { … }

  • higher-order / late-bound case;

or if it generates:

struct Closure<'inferred>;
impl<'inferred> FnOnce<(&'inferred str, )> for Closure<'inferred> { … }

  • early-bound / fallback case.

The rule

The rule driving this, to the best of my knowledge, is the following:

  • by default, everything is not higher-order;

  • but there are two heuristics thanks to which a closure may escape the shackles of 'inferred-ness, and reach higher-order emancipation:

    • explicit/visible lifetime placeholders in the closure args become higher-order

      That is, in something such as |s: &str|, that explicit &, which in turn carries an "explicit" '_ placeholder behind it (and similarly for |x: Cow<'_, str>|, etc.), will yield closure signatures à la:

      impl for<'any> FnOnce(&'any str) -> …
      
      • But the return type does not benefit from this rule!

        That's why |s: &str| s almost always fails (the only time it doesn't is in the following case).

      An important counter-example is type inference: in Rust, we are mostly used to type inference making the exact location where we write the type annotation unimportant. This is not the case for closures and higher-order promotion:

      let any = |s: &str| { … }; // <- any lifetime in input
      // is different than
      let fix1 = |s| { let _: &str = s; … }; // <- fixed lifetime in input!
      let fix2 = |s| { … }; fix2(some_str); // <- fixed lifetime in input!
      
      • Back to the OP, the Box::new(|v| …) as … case matched the fix2 situation, here.
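A short sketch of the first heuristic (the function name `annotated_arg_demo` is made up for the example): writing the `&` in the argument list is enough for the input lifetime to become higher-order, so the closure can be called with unrelated temporaries.

```rust
fn annotated_arg_demo() -> (usize, usize) {
    // The `&` written in the argument list carries an elided lifetime,
    // so this closure is `for<'any> Fn(&'any str) -> usize`.
    let any = |s: &str| s.len();

    // Each temporary `String` is dropped at the end of its statement.
    // Both calls are accepted because no single lifetime was fixed.
    let first = any(&String::from("ab"));
    let second = any(&String::from("abcd"));
    (first, second)
}
```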

    • a literal closure expression inlined in function argument position (it has to be inlined in there!): when that function has a higher-order Fn… bound on the type of that argument (or a bound involving a trait alias thereof), the compiler will attempt to fully promote the closure's signature to said higher-order Fn…. This includes the return type.

      • Back to the OP, the pred function matched this heuristic.
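The second heuristic can be sketched with a hypothetical identity helper, here called `funnel`. It does nothing at runtime and exists only so that the closure is checked against a bound that also ties the returned reference to the input lifetime.

```rust
/// Hypothetical "funnelling" helper: returns its argument unchanged,
/// but forces the closure to be checked against this higher-order bound.
fn funnel<F>(f: F) -> F
where
    F: for<'any> Fn(&'any str) -> &'any str,
{
    f
}

fn funnel_demo() -> (String, usize) {
    // Written inline in argument position, so both the input and the
    // returned reference get the higher-order lifetime `'any`.
    let trim = funnel(|s| s.trim());

    // Two unrelated temporaries; each result is used before its
    // temporary is dropped at the end of the statement.
    let first = trim(&String::from("  ab  ")).to_owned();
    let second = trim(&String::from(" xyz")).len();
    (first, second)
}
```

Writing `let trim = |s: &str| s.trim();` without the helper would rely on the first heuristic only, which does not cover the return type.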

  • On nightly, there is a feature to be able to write for<'lifetimes…> |…| -> … to force a closure signature to be higher-order. In practice, it's currently quite limited, since it then wants all the types to be fully spelled out (no type inference whatsoever).

    On stable, or if these constraints are too limiting, there is the following crate (of mine):

  • It's based on this "full promotion" rule for function arguments with Fn… bounds, and the pattern that stems from it: a "funnelling function". In order to generalize over all possible signatures, the API it features is a macro.

  1. for reference, since this vocabulary shows up in some places: this is called an early-bound lifetime parameter

  2. for reference, since this vocabulary shows up in some places: this is called a late-bound lifetime parameter

  3. once its early-bound generic parameters have been fed to it

  4. it's complicated: as long as the generic lifetime parameter is not involved in certain trait bounds, it should be "unconstrained" and thus free to be higher-order. Otherwise it almost always becomes "constrained" and then early-bound.
