Why does Rust use pipes (`|`) to delimit closure parameters?

A "function", mathematically and according to those FP guys, takes parameters and always produces the same output for the same inputs. The output does not depend on any global (or not-so-global) state outside of itself, nor on the time of day or any other I/O.

Perhaps I'm misunderstanding you, but surely by this definition Rust's fn is also not a function? It can depend on global state, I/O, etc., even in safe Rust.
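To make that concrete, here is a minimal sketch of a safe Rust fn that is not a function in the mathematical sense. The names CALLS and next_id are made up for illustration:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Global state that a plain `fn` can read and modify in safe Rust.
static CALLS: AtomicU32 = AtomicU32::new(0);

// Not a mathematical function: the result depends on how many
// times it has been called before, not on its (empty) argument list.
fn next_id() -> u32 {
    CALLS.fetch_add(1, Ordering::SeqCst)
}

fn main() {
    let a = next_id();
    let b = next_id();
    // Same arguments, different results.
    assert_ne!(a, b);
    println!("{a} {b}");
}
```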

6 Likes

But even Rust fns aren't – so why call them fn then? Something doesn't add up.


By the way, let's observe some Domain-Driven Design. In the bounded context of math, the word "function" means pure. But in the bounded context of programming, it doesn't, and this has been so for a couple of decades now.

2 Likes

Strict application of mathematical definitions to programming languages isn't going to make much sense. For example, computers can't have integer types, because computer storage is finite.

8 Likes

What I meant was: can you tell the difference between a function and a closure if I give you two callables and tell you that one is a function and the other is a closure? Can you tell me which is which based on their behavior in a program?

I know that you can do

let f: () = the_function_name_I_gave_you;

and look at the compiler error :slight_smile: That is not my point — my point is that closures and "real" functions behave the same (I think?) and as such I don't see why we distinguish between them.
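A small sketch of that observation, assuming a hypothetical helper type_name_of (it only wraps std::any::type_name). The two callables give the same results when called, and only the type name gives them away. The exact strings printed are not guaranteed by the standard library, so treat them as diagnostic output only:

```rust
use std::any::type_name;

// Hypothetical helper: returns the compiler's name for the type of `_value`.
fn type_name_of<T>(_value: &T) -> &'static str {
    type_name::<T>()
}

fn add_one(x: i32) -> i32 {
    x + 1
}

fn main() {
    let closure = |x: i32| x + 1;

    // At the call site the two are indistinguishable.
    assert_eq!(add_one(1), closure(1));

    // Their types differ, though. The printed names are a debugging aid
    // and may change between compiler versions.
    println!("{}", type_name_of(&add_one));
    println!("{}", type_name_of(&closure));
}
```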

Yeah, that makes sense!

However, even with code like this:

  let mut funcs = Vec::new();
  for i in 0..3 {
      funcs.push(move |x| x+i);
  }
  
  for f in funcs {
      println!("f(10): {}", f(10));
  }

am I correct in thinking that the x + i code is compiled once? At runtime, three environments are created: small structs which hold the i integer so that f(10) can be evaluated later.

If true, the body of the closure feels very much like the body of a normal function. Calling a closure is different since it passes an implicit and hidden environment to the code in the body — but it's the same underlying mechanism at play, no? Also, seen through this lens, a real function is nothing but a closure with an empty environment.
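Here is a rough sketch of that mental model, written by hand. The struct AddI and its call method are invented for illustration; this is not what the compiler literally generates, and closure layout is not formally guaranteed:

```rust
use std::mem::size_of_val;

// Hand-written stand-in for the closure `move |x| x + i`:
// the environment is a struct holding the captured `i`.
struct AddI {
    i: i32,
}

impl AddI {
    // One body shared by every instance; the environment arrives as `&self`.
    fn call(&self, x: i32) -> i32 {
        x + self.i
    }
}

fn main() {
    let closures: Vec<_> = (0..3).map(|i| move |x: i32| x + i).collect();
    let structs: Vec<AddI> = (0..3).map(|i| AddI { i }).collect();

    for (c, s) in closures.iter().zip(&structs) {
        // Same behavior...
        assert_eq!(c(10), s.call(10));
        // ...and, on current compilers, the same size: just the captured `i`.
        assert_eq!(size_of_val(c), size_of_val(s));
    }
}
```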

Functions can be represented by a bare function pointer and thus can be passed across an FFI boundary in relative safety. If that function is extern, it can even be called safely by non-Rust code. Closures, on the other hand, can be arbitrarily sized, need to be boxed to cross an FFI boundary, and require a trampoline to be called from foreign code.
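A minimal sketch of the trampoline idea. To keep it self-contained, the "C side" (call_twice) is written in Rust with the C ABI; all names here are hypothetical. Because the callback is only used during the call, this sketch passes a pointer to the closure on the stack; a callback that outlives the call would need to be boxed instead:

```rust
use std::os::raw::c_void;

// Shape of a typical C callback: opaque user data plus the real argument.
type Callback = extern "C" fn(*mut c_void, i32) -> i32;

// Stand-in for a C API; a real library would declare this in a header.
extern "C" fn call_twice(cb: Callback, data: *mut c_void) -> i32 {
    cb(data, 1) + cb(data, 2)
}

// A plain function has no environment, so it can be passed directly.
extern "C" fn double(_data: *mut c_void, x: i32) -> i32 {
    x * 2
}

// Trampoline: recovers the closure from the user-data pointer and calls it.
extern "C" fn trampoline<F: FnMut(i32) -> i32>(data: *mut c_void, x: i32) -> i32 {
    // SAFETY: `data` must point to a live `F`; `call_closure` ensures that.
    let closure = unsafe { &mut *(data as *mut F) };
    closure(x)
}

fn call_closure<F: FnMut(i32) -> i32>(mut f: F) -> i32 {
    let data = &mut f as *mut F as *mut c_void;
    call_twice(trampoline::<F>, data)
}

fn main() {
    assert_eq!(call_twice(double, std::ptr::null_mut()), 6);

    let offset = 10;
    // (1 + 10) + (2 + 10)
    assert_eq!(call_closure(move |x| x + offset), 23);
}
```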

3 Likes

In some sense, closures are the more fundamental building block: Church’s lambda calculus, for example, relies on them heavily. In the context of building a programming language, though, functions are almost certainly the more primitive construct:

Before just about anything else, you need a way to define a named sequence of instructions that can be invoked from several places. You also need a way to lay out a set of data fields so that they can be used as a unit.

Once you have those two basic pieces, you can start worrying about how to package together related instructions and data. There are two main approaches:

  • Objects, where the data is primary and methods are semantically attached, or
  • Closures, where the operation is primary and the related data is encapsulated

Either one of these is capable of simulating the other, but most modern languages provide both because they’re useful in different situations.
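As a small illustration of the two approaches, here is the same counter written both ways. Counter, bump, and make_counter are names made up for this sketch:

```rust
// Object style: the data is primary and the method is attached to it.
struct Counter {
    count: u32,
}

impl Counter {
    fn bump(&mut self) -> u32 {
        self.count += 1;
        self.count
    }
}

// Closure style: the operation is primary and the data is captured.
fn make_counter() -> impl FnMut() -> u32 {
    let mut count = 0;
    move || {
        count += 1;
        count
    }
}

fn main() {
    let mut object = Counter { count: 0 };
    let mut closure = make_counter();

    // Both produce 1, 2, 3 in lockstep.
    for _ in 0..3 {
        assert_eq!(object.bump(), closure());
    }
}
```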

7 Likes

Thanks, those are good points! Even if you could pass the underlying function from a closure to C, the C code would then have to provide the state for the closure and then you've basically lost the point of a closure.

One of them dynamically captures its environment and the other doesn't. This is a difference in behavior.

The error given by the compiler isn't just a type mismatch error like let f: () = .... It's also not an artificial limitation that functions can't capture their environment. If closures and real functions did behave the same, both examples I have would compile.
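The original examples are not reproduced here, but a minimal sketch of the kind of difference I mean looks like this (add and add_y are illustrative names):

```rust
// The `fn` version has to receive its "environment" as a parameter.
fn add(x: i32, y: i32) -> i32 {
    x + y
}

fn main() {
    let y = 10;

    // A closure may refer to the local `y` directly.
    let add_y = |x: i32| x + y;
    assert_eq!(add_y(1), 11);

    // A nested `fn` may not. Uncommenting the next line gives
    // error[E0434]: can't capture dynamic environment in a fn item
    // fn add_y_fn(x: i32) -> i32 { x + y }

    assert_eq!(add(1, y), 11);
}
```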

You mentioned earlier that it seems like functions do capture their environment since you can use const items:

This is because when you use a const item that is in scope, it's as if you copy and paste that item into the body of your function. Each function using that const item receives its own value that it can have a mutable binding to or do whatever with.
static items are similar to const items except the value you get is always a reference.
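A short sketch of that difference; LIMITS, SHARED, and the helper functions are names invented for illustration:

```rust
const LIMITS: [u32; 2] = [1, 2];
static SHARED: [u32; 2] = [1, 2];

fn first_limit_after_edit() -> u32 {
    // `LIMITS` is effectively pasted in here as a fresh value, so this
    // binding can be mutated without affecting any other use of the const.
    let mut local = LIMITS;
    local[0] = 99;
    local[0]
}

// Every use of a static refers to the same single location in memory.
fn shared_addr_a() -> *const [u32; 2] {
    &SHARED
}

fn shared_addr_b() -> *const [u32; 2] {
    &SHARED
}

fn main() {
    assert_eq!(first_limit_after_edit(), 99);
    // The const itself is untouched.
    assert_eq!(LIMITS[0], 1);
    // Both functions see the same address for the static.
    assert_eq!(shared_addr_a(), shared_addr_b());
}
```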

1 Like

There are some devils in the details, and their very different declaration styles lead to a different set of use cases. But in the general sense, especially in the "can I tell after it was created" sense, this seems like a fine way to think of things.

In particular, it fits well with the nit I mentioned in passing earlier: A closure which captures nothing (in your terms, a closure with an empty environment) can be coerced into a function pointer.
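For example (apply and double are illustrative names), a function taking a plain fn pointer accepts both a fn item and a closure that captures nothing:

```rust
// Takes a bare function pointer, not a generic `impl Fn`.
fn apply(f: fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

fn double(x: i32) -> i32 {
    x * 2
}

fn main() {
    // A fn item coerces to a fn pointer.
    assert_eq!(apply(double, 4), 8);

    // So does a closure with an empty environment.
    assert_eq!(apply(|x| x + 1, 4), 5);

    // A capturing closure does not; uncommenting this is a type error.
    // let k = 3;
    // apply(|x| x + k, 4);
}
```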

(To answer your actual question, while I would certainly expect the closure function body to only be compiled once, I'd also expect the compiler team to tell me that was an implementation detail that I couldn't count on to be true or not. But I also don't think it's germane to what you were actually getting at.)

1 Like

That is currently not the case, but fn definitions could desugar to const closures with no environment captured, see this playground.

But to be honest, that doesn't change the fact that a closure is an ad-hoc "object" that implements some callable interface / the Fn… traits, which means it has methods, and a method is nothing else but sugar for a(n associated) function. So functions, at least language-wise, need to be a more primitive construct on top of which one can get methods, traits/interfaces, and thus, closures.

But granted, at that point there is indeed a bijection between a const closure and a function:

  • you can convert a function into a closure as showcased in that playground: dummy empty struct, whose callable interface simply calls the function;

  • you can convert a const closure into a function, by inlining the const definition inside the function's body, and calling it.

  • Capture-less / environment-less closures can be trivially made const thanks to a dummy empty struct.

So, in practice, the real difference between a closure and a function is:

A const can only refer to other const items (and within a function's body / method, to statics), whereas a let local binding can refer to all that as well as to other local bindings (c.f. a closure's "captures" / environment).

And I will reiterate that in a systems programming language, distinctions between compile-time and runtime are paramount, as well as any associated layout guarantees (such as fn pointers having the layout of a pointer, and fn items being zero-sized).
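Those layout properties can be checked directly. A minimal sketch, with triple as an illustrative function:

```rust
use std::mem::{size_of, size_of_val};

fn triple(x: i32) -> i32 {
    x * 3
}

fn main() {
    // A fn item has its own zero-sized type: the callee is known statically.
    assert_eq!(size_of_val(&triple), 0);

    // Coercing to a fn pointer erases that and stores an address at runtime.
    let ptr: fn(i32) -> i32 = triple;
    assert_eq!(size_of_val(&ptr), size_of::<*const ()>());

    assert_eq!(ptr(2), triple(2));
}
```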

So, despite the theoretical interest of looking at the conceptual level, or as you put it:

if there can be implementation differences between two conceptually similar entities, then it is a systems programming language's job to offer a way to pick among those different implementations, as with an fn item and a closure.

4 Likes

This isn't really true. All you need is for the language to know how to 'call' something, and that can be expressed in a completely independent model (e.g., jmp instructions.) There's no need for functions or closures to be more or less conceptually primitive than the other. You do not need one concept to understand the other, you just need to distinguish between code that is executing and code that is being treated as data, because then you will have the tools to understand/implement both independently.

Just as a historical note, I believe this syntax originates with Smalltalk, where it is used for all local variable declarations. Quoting the "Smalltalk fits on a postcard" syntax example from Wikipedia:

exampleWithNumber: x
    | y |
    true & false not & (nil isNil) ifFalse: [self halt].
    y := self size + super size.
    #($a #a 'a' 1 1.0)
        do: [ :each |
            Transcript show: (each class name);
                       show: ' '].
    ^x < y

This defines a method of some object (there are no named free functions in Smalltalk) that takes one argument named x and has a local variable named y. Square brackets enclose "blocks", which are not entirely unlike lambdas or closures, and are used for all control flow constructs; the block passed to the do: method takes one argument named each. I don't remember how you define variables local to a block, that aren't arguments, or even whether you can.

2 Likes

That part of the Rust syntax is probably the one I find a bit inconvenient. Reading the syntax is OK, but writing it is... uhh. I almost always automatically fall back to writing the following syntax (having done it many times in other languages):

functionAcceptingCallbackWithNoArguments(() => {  /* do something */ });

functionAcceptingCallbackWithOneArgumentUsed((a) => {  /* do something with a */ });
// OR simply
functionAcceptingCallbackWithOneArgumentUsed(a => {  /* do something with a */ });

functionAcceptingCallbackWithOneArgumentUnused(_ => {  /* do something, but ignore the arg */ });
// OR
functionAcceptingCallbackWithOneArgumentUnused(() => {  /* do something, but ignore the arg */ });

functionAcceptingCallbackWithManyArgumentsUsed((a, b, c) => {  /* do something */ });

functionAcceptingCallbackWithManyArgumentsSomeUnused((a, b, _) => {  /* do something */ });
// OR simply:
functionAcceptingCallbackWithManyArgumentsSomeUnused((a) => {  /* do something */ });

functionAcceptingCallbackWithManyArgumentsAllUnused(_ => {  /* do something */ });
// OR simply:
functionAcceptingCallbackWithManyArgumentsAllUnused(() => {  /* do something */ });

Am I the only one who likes the above syntax?

Yeah, that's JavaScript's arrow function notation. It's not too bad, normally, but it does require arbitrary lookahead.

With Rust's current syntax, a pipe at the beginning of an expression or a statement indicates a closure expression. The only other case where a syntactic construct can start with a pipe, as far as I can remember, is the pattern of a match arm, where a leading pipe is allowed before the first alternative of an OR pattern. Closures can't appear in pattern position, so it's not a problem.
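A small sketch of both uses of the pipe; classify and the closure names are made up for illustration. The closures also show the Rust spellings of the JavaScript forms above:

```rust
fn classify(n: u32) -> &'static str {
    match n {
        // In pattern position, a leading `|` before the first
        // alternative is accepted.
        | 0
        | 1 => "small",
        _ => "large",
    }
}

fn main() {
    // In expression position, a leading pipe always starts a closure.
    let no_args = || 42;
    let one_arg = |a: u32| a + 1;
    let ignores_arg = |_: u32| 0;
    let some_unused = |a: u32, b: u32, _c: u32| a + b;

    assert_eq!(no_args(), 42);
    assert_eq!(one_arg(1), 2);
    assert_eq!(ignores_arg(5), 0);
    assert_eq!(some_unused(1, 2, 3), 3);

    assert_eq!(classify(1), "small");
    assert_eq!(classify(7), "large");
}
```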

For the Rust compiler team itself, I don't think this is actually all that relevant, because token trees are already powerful enough to parse this without taking quadratic time (you need one token tree of lookahead), and rustc already needs this amount of lookahead to do diagnostics and macro expansion. But other tooling, like Sublime Text, will probably appreciate being able to recognize a closure without having to parse everything.

2 Likes

In the Coq language a function is written fun (x:T) => expr. I think it would have been nice if Rust's were fun pattern => expr, as it would reuse syntax from the match blocks. But there is no point now in trying to change it.

The one thing I kind of miss is mixing closure introduction and a match pattern in one, like functional languages do:

iterator.map {
    Thing::A(a) => { … },
    Thing::B(b) => { … },
}

but that's just nitpicking, since a |it| match it { … } already achieves that:

iterator.map(|it| match it {
    Thing::A(a) => { … },
    Thing::B(b) => { … },
})

Well, FWIW the latter version does not allow for non-local control flow - i.e., no early exiting or returning.
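To illustrate with a sketch (the function names are invented): inside a for loop, return leaves the enclosing function, while inside a closure it only leaves that one closure call, so an early exit has to go through the iterator adapter instead:

```rust
// With a `for` loop, `return` leaves the enclosing function early.
fn first_negative_loop(items: &[i32]) -> Option<i32> {
    for &item in items {
        if item < 0 {
            return Some(item);
        }
    }
    None
}

// With closures, the early exit is expressed by the adapter (`find` here).
fn first_negative_closure(items: &[i32]) -> Option<i32> {
    items.iter().copied().find(|&item| item < 0)
}

// Inside a closure, `return` behaves like `continue`: it ends only
// the current closure call, not the surrounding function.
fn count_non_negative(items: &[i32]) -> usize {
    let mut count = 0;
    items.iter().for_each(|&item| {
        if item < 0 {
            return;
        }
        count += 1;
    });
    count
}

fn main() {
    let items = [3, -1, 4, -5];
    assert_eq!(first_negative_loop(&items), Some(-1));
    assert_eq!(first_negative_closure(&items), Some(-1));
    assert_eq!(count_non_negative(&items), 2);
}
```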

There has been talk of such "TCP-preserving" closures, but AFAIK there's no clear way to best implement them in Rust.

This makes me curious: Aside from not running some Drop implementations, would anything bad happen if you made FFI calls to C's setjmp and longjmp for this? Would it even work?

I don't think we need the whole assembly jumps machinery here, the language could unsugar to something like what ::with-locals does: