Rust fn types are unique

Some time ago I came across an article that explains how every Rust fn actually has its own, unique type.

(Coming from FP, this was surprising and unintuitive and would seem to make certain simple things difficult.)

I now cannot find that article. Google has failed me. Can someone point me to a decent treatise on this?

I know of this video which covers the topic:

https://youtu.be/SqT5YglW3qU

From the Reference:

the item types of different functions - different items, or the same item with different generics - are distinct
...
However, there is a coercion from function items to function pointers with the same signature, which is triggered ...

Yes, Rust exposes the machinery of functions and closures in ways that other languages typically conceal, at least in the type system. It's the Rust way, for better or worse.

Think of it this way—different closures capture different stuff from the environment, so they all contain different data, and are different sizes; some are cloneable and some are not, some have lifetimes and some don't. All these concepts are part of the type system in Rust. Different closures had to be different types.

This is not quite as true for fns, but for functional programming in Rust you want to support both fns and closures. It makes sense for them to behave the same way.

And it's worth noting what Rust gains from these types.

  • Users can choose between writing generic code <F: Fn(i32) -> i32> which will be compiled and optimized separately for every function type; or taking a function pointer argument (f: fn(i32) -> i32) or (f: &dyn Fn(i32) -> i32) which will do dynamic dispatch when it's called.
  • All the familiar higher-order functions from FP, like map and filter, are generic, so the compiler generates optimized code at every call site. Function call overhead is frequently optimized away.
  • Each unique fn type has only a single value, so these types are zero-size. When generic code creates, say, an iterator that contains a particular fn, no function pointer is actually stored in memory at run time.
  • Closures that capture closures don't result in a chain of heap allocations -- they're like nested structs in Rust.

The way structs are in Rust mostly explains the way closures are... and much else about the language.

4 Likes

Well, yes, because Rust is not a functional language, and trying to use it as one usually ends in misery.

But it's at the core of some of the Rust's zero-cost abstractions. For example, each closure has a unique type, even if two closures have an identical definition and captures. playground

let mut f = || ();
f = || ();  // error[E0308]: mismatched types
//  ^^^^^ expected closure, found a different closure
// note: expected closure `{closure@src/lib.rs:2:17: 2:19}`
//       found closure `{closure@src/lib.rs:3:13: 3:15}`

But this also means that generic functions which accept closures (e.g. various iterator adapters) can be freely monomorphized, the closures can be inlined, and entirely optimized away. If closures shared a type (e.g. if all non-capturing closures with the same body had the same type), then it would be much harder to explain to the optimizer why this often-used closure function should be inlined. There would be at least some possibility that instead a call to a shared function should be emitted, and then it likely wouldn't be inlined in many cases. At the very least it would make rustc's and the optimizer's job harder, because they'd have to figure out which functions should be inlined despite the failure of usual inlining heuristics. Can it cause code bloat? Yes. But in this case the functions are commonly small, and even for large closures the expected behaviour is zero-cost abstraction, where you can use the pretty iterator adapter syntax, while getting a fully optimized tight loop.

Similar considerations apply to proper function types. Another benefit is that having a unique type for each function means that function types are zero-sized. You don't actually need to store the pointer to the function at runtime, since the compiler already knows it at all times. This means that structures generic over function type (e.g. struct Foo<F: Fn(u32) -> u32>(F); ) can be reliably optimized based on the knowledge of the exact function used, and their size is smaller (Foo above is also zero-sized if F is). That's not true for function pointer types (fn(u32) -> u32). Since this type may accept any function with the given signature, there is no way but to store the actual function pointer at runtime, and to emit actual function calls whenever that pointer is used. Devirtualization may kick in and remove that overhead, but it's not something which you can rely on, and it is unlikely to work in more complex cases, or for cross-crate function calls.

An important special case of the above is Rust's trait system. A trait impl is basically a table of the relevant functions. Since the precise function types are known at compile time, the compiler has full visibility into the called methods for generic functions with trait bounds, and can apply the usual inlining and cross -function analysis techniques to optimize the monomorphized functions.

3 Likes