Strange type inference around FnOnce & Fn

The following is a reduced reproducer: Rust Playground

use std::io::Read;

pub trait HttpClient {
    fn get(&self, url: &str) -> Result<Box<dyn Read>, std::io::Error>;
}

struct DummyHttpClient<T> {
    responder: T,
}

impl<T> HttpClient for DummyHttpClient<T>
where
    T: Fn(&str) -> Result<Box<dyn Read>, std::io::Error>,
{
    fn get(&self, url: &str) -> Result<Box<dyn Read>, std::io::Error> {
        (self.responder)(url)
    }
}

struct UserBuilder {
    client: Option<Box<dyn HttpClient>>,
}

impl UserBuilder {
    pub fn new() -> Self {
        Self { client: None }
    }

    /// Set the HTTP client to use for requests.
    pub fn http_client(mut self, client: Box<dyn HttpClient>) -> Self {
        self.client = Some(client);
        self
    }
}

fn test_custom_http_client() {
    let http_client = Box::new(DummyHttpClient {
        responder: |url| {
            Err(std::io::Error::new(
                std::io::ErrorKind::Other,
                format!("DummyHttpClient cannot fetch {url}"),
            ))
        },
    });
    let client = UserBuilder::new().http_client(http_client);
}

There are build errors that FnOnce and Fn are not general enough:

error: implementation of `FnOnce` is not general enough
  --> src/lib.rs:45:49
   |
45 |     let client = UserBuilder::new().http_client(http_client);
   |                                                 ^^^^^^^^^^^ implementation of `FnOnce` is not general enough
   |
   = note: closure with signature `fn(&'2 str) -> Result<Box<dyn std::io::Read>, std::io::Error>` must implement `FnOnce<(&'1 str,)>`, for any lifetime `'1`...
   = note: ...but it actually implements `FnOnce<(&'2 str,)>`, for some specific lifetime `'2`

error: implementation of `Fn` is not general enough
  --> src/lib.rs:45:49
   |
45 |     let client = UserBuilder::new().http_client(http_client);
   |                                                 ^^^^^^^^^^^ implementation of `Fn` is not general enough
   |
   = note: closure with signature `fn(&'2 str) -> Result<Box<dyn std::io::Read>, std::io::Error>` must implement `Fn<(&'1 str,)>`, for any lifetime `'1`...
   = note: ...but it actually implements `Fn<(&'2 str,)>`, for some specific lifetime `'2`

Adding

  where
    T: Fn(&str) -> Result<Box<dyn Read>, std::io::Error>,

to the struct makes the issue go away: Rust Playground

Why is this? As I understand it, best practice in Rust is to add bounds only to the impls that need them. Here only the impl of HttpClient should need the bound, yet that does not seem to be enough?

This seems to be another case of a well-known problem of type inference for closure arguments. In the reduced case, it's actually enough to annotate the closure's argument type for the error to go away:

fn test_custom_http_client() {
    let http_client = Box::new(DummyHttpClient {
        responder: |url: &str| { // added `: &str`
            Err(std::io::Error::new(
                std::io::ErrorKind::Other,
                format!("DummyHttpClient cannot fetch {url}"),
            ))
        },
    });
    let client = UserBuilder::new().http_client(http_client);
}

As far as I understand, the problem is that when the closure's argument type is inferred, it is inferred to be one specific type (a reference with a single lifetime '2, as the error message says), not a higher-ranked type...

...but when the closure is explicitly written as |url: &str| {...}, the annotation is treated the same as an elided lifetime on a function, i.e. as a generic lifetime, which makes the closure implement the higher-ranked for<'a> Fn<(&'a str,)>.
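A minimal sketch of the annotation fix outside the reproducer. The names apply_to_local and annotated_closure_len are made up for illustration. The helper calls the closure with a string that only lives inside the helper, so the closure has to accept any lifetime, not one fixed lifetime chosen at the definition site:

```rust
// Hypothetical helper: requires a closure that works for every lifetime 'a.
fn apply_to_local<F>(f: F) -> usize
where
    F: for<'a> Fn(&'a str) -> usize,
{
    // `owned` lives only inside this function, so `f` must accept
    // a borrow with a lifetime the caller could never have named.
    let owned = String::from("local");
    f(&owned)
}

fn annotated_closure_len() -> usize {
    // The `: &str` annotation is what makes the closure higher-ranked,
    // even though it is defined before it reaches the bounded function.
    let count = |s: &str| s.len();
    apply_to_local(count)
}
```

Here the closure is bound to a variable first, so there is no expected signature at the definition site and the annotation does the work.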

Thanks! I wasn't aware of that (supposedly) well known issue.

I believe this could use a much better error message, with a suggestion for how to fix it (assuming it is difficult to fix the actual inference issue). I'll file a diagnostics issue in the rustc bug tracker.

EDIT: Confusing diagnostic for closures · Issue #147264 · rust-lang/rust · GitHub

Interestingly, the issue isn't just about closures: fn pointers also cause issues. The inference algorithm seems to always pick a specific lifetime when assigning a type to a type parameter unless you explicitly override/assist/defer it:

struct Foo<T>(T);
impl Foo<for<'a> fn(&'a str)> {
    fn get(&self, url: &str) {
        (self.0)(url)
    }
    fn new(f: for<'b> fn(&'b str)) -> Self {
        Self(f)
    }
}
fn bar(_: &str) {}
fn test_foo() {
    // If you uncomment below and comment out the other
    // `let foo` line, the code will not compile.
    // let foo = Foo(bar);
    let foo = Foo::new(bar);
    foo.get("");
}

Addendum

Even adding the bound on the type's definition doesn't necessarily solve the problem:

struct Foo<T: for<'a> Fn(&'a str)>(T);
impl<T: for<'a> Fn(&'a str)> Foo<T> {
    fn get(&self, url: &str) {
        (self.0)(url)
    }
}
impl Foo<for<'a> fn(&'a str)> {
    fn get2(&self, url: &str) {
        (self.0)(url)
    }
}
fn bar(_: &str) {}
fn test_foo() {
    Foo(bar).get("");
    // Uncommenting below will cause a compilation error.
    //Foo(bar).get2("");
}

While I still think having fewer bounds at the type definition site is "best practice", like any practice there will be exceptions. A benefit of having bounds at the type definition site is that the constructor gets more information to assist the inference algorithm; however, as we see above, the compiler won't assume more than necessary. Thus our impl for the concrete type Foo<for<'a> fn(&'a str)> won't be "found", since our constructor only assumes that the type of bar implements for<'a> Fn(&'a str).

Just adding a note that, unfortunately, there is no inline syntax to get elided lifetimes in the return position to be treated the same as those on a function, which causes the use of "funnels" when possible...

...and when not possible due to unnameable types, you end up with forced type erasure or other workarounds (prevalent in the async ecosystem).
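A sketch of what such a "funnel" can look like, assuming a closure that returns a borrow of its argument. The names funnel and first_word_via_funnel are invented for this example. The funnel is an identity function whose only job is to carry the higher-ranked bound, so the closure passed through it is checked against that signature:

```rust
// Hypothetical funnel: returns the closure unchanged, but its bound tells
// inference that the output borrows from the input for every lifetime 'a.
fn funnel<F>(f: F) -> F
where
    F: for<'a> Fn(&'a str) -> &'a str,
{
    f
}

fn first_word_via_funnel(input: &str) -> &str {
    // No annotations on the closure; the signature comes from `funnel`.
    let first_word = funnel(|s| s.split(' ').next().unwrap_or(""));
    first_word(input)
}
```

This only works when the closure's signature can be written down in a bound, which is the limitation the post above mentions for unnameable types.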

And yes it's a well known issue.

I'm not sure if that will ever be stabilized. I suspect some form of impl Trait in bindings will take its place. (Relevant PR.) Ironically this mirrors some pre-1.0 ascription syntax which was thrown out, IIRC...

Neither the RFC nor trait ascription will solve the problems with unnameable types though, so some way to fix or override inference without naming types is still desired.

That's not what happened in the examples. It chose the unnameable type of the function item bar, instead of the function pointer type for<'a> fn(&'a str).

error[E0599]: no method named `get` found for struct `Foo<for<'a> fn(&'a str) {bar}>` in the current scope

The function item bar type does meet the bound, but you only implemented get2 for the function pointer type, not everything that meets the bound.

The relevant PR link is broken, not clickable.

Fixing the underlying issue would be great, but it sounds like that will take time. Until then, the diagnostics should really be improved, which should hopefully be easier? Rust generally has great diagnostics, so it is extra jarring when you run into bad ones. (We aren't coding C++ here, standards are higher.)

I didn't know bar had an unnameable type. I thought the type was for<'a> fn(&'a str). Interesting.

Of course bar meets the bound—the compiler would have rejected the impl if not. As you pointed out above, my mental model was simply wrong that bar had a single type for<'a> fn(&'a str).

Thanks, fixed.

I agree, but apparently it isn't easy... the shortcoming has been around forever.

Every function has its own type, yes. They can be coerced to function pointers, but since coercions occur before inference (and partially guide this inference), they happen only when the target type is somehow explicitly required.
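To make the "explicitly required" part concrete, here is a small sketch based on the Foo example from earlier in the thread. The return type and the function names via_cast and via_annotation are my additions. Both variants give the compiler a target type, so the function item is coerced to a function pointer and the impl for the pointer type is found:

```rust
struct Foo<T>(T);

// Implemented only for the function pointer type, as in the earlier example
// (a `usize` return value is added here so the result can be observed).
impl Foo<for<'a> fn(&'a str) -> usize> {
    fn get(&self, url: &str) -> usize {
        (self.0)(url)
    }
}

fn bar(s: &str) -> usize {
    s.len()
}

fn via_cast() -> usize {
    // An explicit `as` cast forces the function item into a pointer.
    Foo(bar as fn(&str) -> usize).get("ab")
}

fn via_annotation() -> usize {
    // A type annotation on the binding supplies the target type instead.
    let foo: Foo<fn(&str) -> usize> = Foo(bar);
    foo.get("abc")
}
```

Without the cast or the annotation, Foo(bar) keeps the function item type and the method lookup fails as described above.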

Yeah, I always thought the type of a function was a function pointer and that coercion or subtyping happened after. I'm quite blown away by this information, lol. I guess it's similar to how something like 0 doesn't have a type until it's used in a way that determines its type, and if it never is, falls back to i32. Is that the case for functions? Do they have a default fallback of a function pointer?

They have a unique type (or type constructor) from the start. Generic parameters may be ambiguous or inferred, but the "outside type" isn't, so it's not like integers. They can coerce to function pointers, but won't without a reason to, like most coercions. It's not a fallback.

Function items being unique types allows them to be zero-sized and easier to optimize -- e.g. inlining through a function pointer is a form of devirtualization.
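The zero-sized claim is easy to check with std::mem::size_of_val. In this sketch the names item_size and pointer_size are invented for illustration; the first measures a value of the function item type, the second measures the same function after coercion to a pointer:

```rust
use std::mem::size_of_val;

fn bar(_: &str) {}

fn item_size() -> usize {
    // `item` has the unique function item type of `bar`, which carries no data.
    let item = bar;
    size_of_val(&item)
}

fn pointer_size() -> usize {
    // The annotation requests a coercion to a function pointer,
    // which has to store an address at runtime.
    let ptr: fn(&str) = bar;
    size_of_val(&ptr)
}
```

On common targets I would expect item_size() to be 0 and pointer_size() to match the size of usize.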

So what are the types of f, g, and h below?

fn main() {
    let f = foo;
    let g = bar;
    let h = fizz::<()>;
}
fn foo<'a>() {}
fn bar() {}
fn fizz<T>() {}

I assumed they all had types for<'a> fn(), but that would seem to require function pointer fallback.

They have types fn() {foo}, fn() {bar}, fn() {fizz::<()>}, respectively, as reported by the error here. To have a for<'a>, you must have an actual lifetime-generic argument, such as a reference:

fn main() {
    let f = foo;
    let () = f; // error: expected fn item `for<'a> fn(&'a ()) {foo}`
}
fn foo<'a>(_: &'a ()) {}

...but there is still this {foo}, as a marker of unique function type.

The types aren't nameable, à la closures. I.e. you can't use the syntax shown in errors in your own code.

It is at least sometimes unsound to have unconstrained lifetimes in for<..> binders. And some signatures have more than one (or two or ten) lifetimes, so there's no binder that would cover all possible lifetimes anyway. So it wouldn't work for everything to be for<'a> ....

Let's consider fizz for a moment. The function item type is parameterized by the generic T. We might have for<T> ... higher ranked types some day, but no time soon I imagine.

Do they ever get parameterized by lifetimes? Yes, the distinction is "early bound" (parameterized) or "late bound" (not parameterized). Late bound corresponds to for<..> types and the late bound lifetimes cannot be turbofished. The lifetime in foo is unconstrained and thus early bound.

Lifetimes that appear in where bounds are also early bound. Other unconstrained mentions, like lifetimes only in GAT parameters or the return type, are also early bound.

The early/late split can lead to confusing errors.


Sorry if sloppy, written on mobile.

I knew about "early-bound"/"late-bound" to an extent, but those links are insightful as there is quite a bit I wasn't aware of. Are you sure the lifetime in foo is early bound though? If I try to pass in a lifetime argument (e.g., 'static), the compiler complains and says the lifetime is late bound. Adding a where bound like fn foo<'a: 'a>() {} does make it early bound as expected, as does changing the signature so that the return type depends on 'a: fn foo<'a>() -> &'a str { "" }.

I guess it is late bound, even though it's not used in any implementations either. And you can throw as many unused lifetimes in front of your for<..> higher-ranked types as you want, but they don't change the type (fn(), for<'a> fn(), and for<'a, 'b, 'c> fn() are all the same type).
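A small sketch of the two early-bound cases described in the previous post. The function names are invented for the example. Both functions accept an explicit lifetime argument, which is what distinguishes an early-bound lifetime from a late-bound one:

```rust
// `'a: 'a` is a where-style bound on the lifetime, which makes it early bound.
fn early_by_bound<'a: 'a>() {}

// A lifetime that appears only in the return type is also early bound.
fn early_by_return<'a>() -> &'a str {
    "early"
}

fn explicit_lifetimes() -> &'static str {
    // Both calls pass the lifetime explicitly via turbofish.
    early_by_bound::<'static>();
    early_by_return::<'static>()
}
```

Per the posts above, writing foo::<'static>() for the original fn foo<'a>() {} is rejected instead, because there the compiler treats 'a as late bound.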
