Why does Rust use pipes (`|`) to delimit closure parameters?

In Rust, the pipe character (|) is used to delimit closure parameters.

Such as:

|x| x + 1

I've always been wondering; why has | been chosen for this, in contrast to ( & )? Would the use of brackets introduce ambiguity in certain situations?

I'm sure there's a clear reason for this, but I could not find the answer to this in the Rust book, nor did I find an RFC describing this.

4 Likes

The syntax is borrowed from Ruby, I think. It has been there for a very long time: it predates the 1.0.0 release, which stabilized many things and removed many others, and it dates from a period when the RFC process wasn't as polished and present as it is now (it was the experimental stage of the language).

Now, if we ignore the exact choice to syntactically mark a closure (e.g., the Ruby pipes, here), the fact that closures and functions use a different syntax is paramount in a low-level / "system" language such as Rust:

  • the implementation, machine-wise, of a free function is quite simple and the building block of many other things: machine instructions are stored in some static part of memory (unless we are dealing with dynamically loaded libraries...), and calling / referring to a function is thus achieved by using an address to the beginning of that chunk of memory (plus some ABI considerations).

  • a stateful closure, on the other hand, is a fully fledged "object": there is a "struct" with some captured state, and it happens to have a method whose parameters are the closure parameters and this object.

    The way this object, and its state, is stored (heap, stack?) and used / referred to (using indirection? borrowed? inlined / by value?) are tweaks that a system programming language should not hide from the programmer (compare the simplicity of fn(), vs. {&,&mut,Box,Rc,Arc} impl/dyn Fn{,Mut,Once}() + 'lifetime [+ Send] [+ Sync] [+ Clone]), and so whether you are trying to define such a closure or a basic function is a very meaningful distinction, one that syntax ought to express.
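
To make the "struct with a method" picture concrete, here is a rough sketch of what a capturing closure amounts to. The names `AddN`, `call` and `make_adder` are made up for illustration; the type the compiler actually generates is anonymous and implements the `Fn` traits rather than an inherent method.

```rust
// A plain function: only code, no state.
fn add_one(x: u32) -> u32 {
    x + 1
}

// Hand-written stand-in for the closure `move |x| x + n`.
// `AddN` and `call` are hypothetical names, not compiler output.
struct AddN {
    n: u32, // the captured state
}

impl AddN {
    // The closure parameters plus "this object".
    fn call(&self, x: u32) -> u32 {
        x + self.n
    }
}

// The real closure; its anonymous type is hidden behind `impl Fn`.
fn make_adder(n: u32) -> impl Fn(u32) -> u32 {
    move |x| x + n
}
```

Here `AddN { n: 5 }.call(1)` and `make_adder(5)(1)` compute the same value, while `add_one` needs no object at all.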

The fact we have this distinction is what allows programmers, especially those using unsafe code, to be confident about the semantics of their program the moment they start doing more subtle things that the compiler may not understand (those who've had to use long-lived callbacks across FFI / native modules with a garbage-collected language know the many pitfalls from the language runtime / GC doing stuff under the rug (C#: pure, unmarshalled, function pointers when? :weary:) ...)

It is also what allows getting very nice error messages when one gets the definition of a function wrong:

// Tweaking your example:
let one = 1;
fn  add_one_v1   (x: u32) -> u32 { x + one }

yields:

error[E0434]: can't capture dynamic environment in a fn item
 --> src/main.rs:4:40
  |
4 | fn  add_one_v1   (x: u32) -> u32 { x + one }
  |                                        ^^^
  |
  = help: use the `|| { ... }` closure form instead

In a way, the fn _ (...) -> _ { ... } vs. move? |_| -> _ { ... } distinction is very similar to the const vs. let one :slightly_smiling_face:
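
A small sketch of that analogy, with illustrative names (`ONE`, `add_one_const`, `add_one_with_closure`): a `fn` item may refer to a `const`, because that is not runtime state, whereas referring to a `let` binding requires a closure.

```rust
const ONE: u32 = 1;

// Fine: `ONE` is a compile-time constant, so nothing is captured.
fn add_one_const(x: u32) -> u32 {
    x + ONE
}

fn add_one_with_closure(x: u32) -> u32 {
    let one = 1;
    // `one` is a local variable, so using it needs a closure.
    let add_one = |x: u32| x + one;
    add_one(x)
}
```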

17 Likes

I've always been wondering; why has | been chosen for this, in
contrast to ( & )? Would the use of brackets introduce ambiguity
in certain situations? Is it because closures provide type inference?

Really good question. I also want to understand this.

In a functional programming language like elisp we can pass around normal
functions. There is no need to create a separate type.

In C, and its derivatives as well, we could pass/assign function
pointers.

ObjC and C++ have their own distinct lambda/block syntax. In these languages the syntax decisions were dictated by backwards compatibility, so it is what it is, because all the good syntax has already been taken :slight_smile:

In Rust's case using () for closures would make syntax ambiguous:

let wat = (arg) *arg;

Is that a closure that dereferences its argument, or is it an expression that multiplies a variable? C has a similar ambiguity with typedefs, and it's not fun to parse.
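
With the pipe syntax the two readings are spelled differently, so there is nothing to disambiguate. A minimal sketch (function and variable names are only illustrative):

```rust
fn multiply(wat: i32, arg: i32) -> i32 {
    // A parenthesized variable multiplied by another variable.
    (wat) * arg
}

fn deref_with_closure(value: &i32) -> i32 {
    // A closure that dereferences its argument.
    let wat = |arg: &i32| *arg;
    wat(value)
}
```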

8 Likes

I guess you can summarize this with these points:

  • || {} is borrowed from Ruby
  • implemented in early stages of Rust
  • () {} in itself introduces ambiguity (example)
  • a closure is fundamentally different than a function
  • || {} gives a clear visual distinction, useful in complex code

@Yandros thanks for your comprehensive response!

@kornel thanks for your example on ambiguity, I couldn't come up with one myself.

13 Likes

Taking those two examples can help clarify why Rust makes a distinction:

  • Regarding C, it does not have closures. It only has (function names, and) function pointers. If you want to have a closure, you need to manually hand-roll one out of these basic building blocks, which is traditionally done with an added untyped data pointer (void * data, paired with <ret> (*)(void * data, <closure_args...>)). Lack of generics thus leads to loss of type safety, and, short of reading documentation that clarifies it, we also don't know how long the void * data pointer needs to stay valid, or how it should be disposed of.

    There are compiler extensions which either do hacky stuff under the rug to bundle a stateful closure inside a function pointer ("GCC's nested functions"), or other extensions which do use a different syntax (one that was available), to define what they call blocks, which is a functionality that is present in Objective C by default.

    So I hope I have managed to illustrate that in low-level / system programming languages, functions ≠ closures (the latter being a more complex entity than the former).

  • In (ergonomic) functional languages, i.e., high-level languages (≠ system programming languages), from the premise of "everything can be done with 'functions' " / "(stateful) functions are values", etc., the choice was made to indeed not make a distinction. In a provocative way, we could say that a functional language does not have functions! It only has closures / lambdas (in practice, however, obviously, if the closure is stateless it will be implemented as a function), hence the lack of syntactic distinction.
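
The C-style "function pointer plus untyped data pointer" pattern from the first point can be sketched in Rust itself. Everything here (`Counter`, `add_to_counter`, `Callback`, `call_with`, `run_counter`) is a hypothetical name for illustration, not an existing API. Note how the type of the state is erased and its validity rests entirely on a convention between caller and callee.

```rust
use std::ffi::c_void;

// The "captured state" of the hand-rolled closure.
struct Counter {
    total: i32,
}

// The "code" half: a plain function that receives the state as `void *`.
extern "C" fn add_to_counter(data: *mut c_void, x: i32) -> i32 {
    // SAFETY: by convention, the caller passes a pointer to a live `Counter`.
    // Nothing in the signature enforces that.
    let counter = unsafe { &mut *(data as *mut Counter) };
    counter.total += x;
    counter.total
}

type Callback = extern "C" fn(*mut c_void, i32) -> i32;

// A C-like API: takes the function pointer and the data pointer separately.
fn call_with(callback: Callback, data: *mut c_void, x: i32) -> i32 {
    callback(data, x)
}

fn run_counter() -> i32 {
    let mut counter = Counter { total: 0 };
    let data = &mut counter as *mut Counter as *mut c_void;
    call_with(add_to_counter, data, 1);
    call_with(add_to_counter, data, 2)
}
```

A Rust closure such as `move |x| { total += x; total }` bundles the same two halves, but keeps the state typed and lets the borrow checker track how long it lives.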

7 Likes

I would actually have expected a closure to be an unnamed function and thus look like a normal function, but without, well, a name:

let wat = fn (arg) { *arg };

This is similar to how anonymous functions look in JavaScript. It is also very similar to how C++ lambda expressions look. In both of those languages, I find the closure syntax better aligned with the normal function syntax.

I find this syntax more consistent with the syntax for declaring normal functions. If it was discussed, I would be curious to hear why it wasn't used.

3 Likes

But closures are not normal functions, whether named or otherwise, because closures have access to their environment. So conflating closures with functions is semantically incorrect.

3 Likes

Functions in JavaScript really are just closures with names. This works:

function generate(a) {
  function frob() {
    return a;
  }
  return frob;
}
assert(generate(1)() === 1);

Analogous code in Rust produces a compiler error.

fn generate(a: i32) -> fn() -> i32 {
  fn frob() -> i32 {
    return a; //~ ERROR: can't capture dynamic environment in a fn item
  }
  return frob;
}
assert!(generate(1)() == 1);
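
For comparison, a sketch of a Rust version that does compile: the inner `fn` is replaced by a closure that captures `a`, and the return type becomes `impl Fn() -> i32`.

```rust
fn generate(a: i32) -> impl Fn() -> i32 {
    // `move` copies `a` into the closure so it can outlive this call.
    move || a
}
```

With this definition, `generate(1)()` evaluates to `1`, like the JavaScript version.
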
2 Likes

If I had been around back pre-1.0 with my current understanding of the language, I would have pushed hard for fn(args) to be the closure syntax, but I wasn't and I didn't. We have |args| as the closure syntax and it won't be changing :slight_smile:.

1 Like

I come from Python and JavaScript, so I'm not used to making this distinction between functions and closures.

I mean, a top-level function also has access to top-level values such as constants, no? So that feels like their environment to me, but there is probably some subtle difference that I miss?

Indeed! This makes me see closures as the more primitive building block from which you can create functions (by assigning a name to your closure). However, I guess this is much easier in a garbage collected language like Python, JavaScript or Go?

On the other hand, Rust does allow me to return a closure, just like Go does:

  • Rust playground:

    fn adder() -> impl FnMut(i32) -> i32 {
        let mut sum = 0;
        return move |x| {
            sum += x;
            sum
        };
    }
    
  • Go tour:

    func adder() func(int) int {
    	sum := 0
    	return func(x int) int {
    		sum += x
    		return sum
    	}
    }
    

Can I tell the difference between a closure and a "real" function in my code? That is, can there be a behavioral difference between code which uses a closure

let mut pos = adder();

// call pos(i) several times.

and code which uses a function:

fn pos(i: i32) -> i32 {
    // ...
}

// call pos(i) several times

I guess the pos function must keep state in a mutable global variable, and I guess this implies that I should be able to change this variable from outside of pos. In the closure case, the state is completely internal to the closure and I have no way of getting access to it. So that might be one difference?

1 Like

Closures carry their captured environment around with them to track state; that data may be created at run time (e.g. create many versions of a capturing closure in a loop), and isn't global. Maybe think of it like a struct you create inside-out by defining a method, which the compiler uses to infer the data structure.
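
A short sketch of that difference, reusing the `adder` function from earlier in the thread (`independent_state` is just an illustrative helper): each call to `adder()` creates a fresh closure value with its own `sum`, which a single `fn` backed by one global variable could not do.

```rust
fn adder() -> impl FnMut(i32) -> i32 {
    let mut sum = 0;
    move |x| {
        sum += x;
        sum
    }
}

fn independent_state() -> (i32, i32) {
    // Two closures created at run time, each carrying its own `sum`.
    let mut a = adder();
    let mut b = adder();
    a(1);
    a(2);
    // `a` has accumulated 1 + 2 + 3; `b` has only seen 10.
    (a(3), b(10))
}
```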

Closures are anonymous types (you can't name the type), and each one has a unique type; if you want to pass them around, you have to rely on traits (nit: unless you captured nothing at all). They may or may not be clone-able, copy-able, etc. More detail can be found in the reference.
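
Since the closure type cannot be written down, it is passed either through a generic parameter bounded by one of the `Fn` traits or as a trait object. A minimal sketch with made-up names (`apply`, `apply_dyn`, `pass_closure_around`):

```rust
// Static dispatch: `F` stands for whatever anonymous type the caller has.
fn apply<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(x)
}

// Dynamic dispatch: the concrete closure type is erased behind `dyn Fn`.
fn apply_dyn(f: &dyn Fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

fn pass_closure_around() -> (i32, i32) {
    let offset = 10;
    let add_offset = |x: i32| x + offset;
    // `&add_offset` also implements `Fn`, so the closure can be reused.
    (apply(&add_offset, 1), apply_dyn(&add_offset, 2))
}
```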

Functions (function pointers) do have concrete, nameable types (e.g. fn(usize) -> f32). It's just a pointer, it's not carrying around state with it. Any global state it references lives in a global part of the program. A chapter in the book discusses them; one difference it points out is that you can pass a function pointer to C, but C isn't going to understand a closure.
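
Because the function pointer type can be named, it can appear directly in signatures, fields, or constants. A small sketch (`half`, `double` and `OPS` are illustrative names):

```rust
fn half(n: usize) -> f32 {
    n as f32 / 2.0
}

fn double(n: usize) -> f32 {
    n as f32 * 2.0
}

// The element type `fn(usize) -> f32` is spelled out explicitly;
// both function items coerce to it.
const OPS: [fn(usize) -> f32; 2] = [half, double];
```

Calling `OPS[0](4)` runs `half` and `OPS[1](4)` runs `double`; each entry is just an address, with no state attached.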

10 Likes

It still could have been let foo = fn (args) -> Ty { retval }, but that's somewhat noisier.

For closures the terseness of the syntax matters. In JS, function() {} syntax was making callback-based APIs verbose, and ES6 added x => x syntax. It was added despite a high cost of requiring unbounded lookahead to parse.

5 Likes

In Rust the fn() type means a thin pointer. It'd be odd if fn() expressions were function types that are incompatible with fn() types.

1 Like

Yeah, but it's also kind of weird that || expressions, which are "normally" closures, can be implicitly coerced to the relevant fn type if they happen to not close over anything, so I can see where @ekuber is coming from.
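
A minimal sketch of that coercion (`plus_one` is an illustrative name): a closure that captures nothing can be returned where a plain `fn` pointer is expected, while a capturing one is rejected.

```rust
fn plus_one() -> fn(i32) -> i32 {
    // Captures nothing, so it coerces to a function pointer.
    |x| x + 1
}

// A capturing closure does not coerce; uncommenting this
// fails with a "mismatched types" error:
//
// fn plus_n(n: i32) -> fn(i32) -> i32 {
//     move |x| x + n
// }
```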

1 Like

Individual functions also have a unique, unnameable, zero-size type, but the value can be coerced to the corresponding function pointer type with a normal pointer size.
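
This can be observed with `std::mem::size_of_val`. In the sketch below (`double` and `sizes` are illustrative names), the function item is zero-sized, and the coerced function pointer is pointer-sized.

```rust
use std::mem::size_of_val;

fn double(x: i32) -> i32 {
    x * 2
}

fn sizes() -> (usize, usize) {
    // `item` has the unique, unnameable function item type of `double`.
    let item = double;
    // `pointer` has the nameable function pointer type.
    let pointer: fn(i32) -> i32 = double;
    (size_of_val(&item), size_of_val(&pointer))
}
```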

2 Likes

Yes, agreed – closures end up being used all the time once the language allows them, so it's nice if they are as lightweight as possible.

1 Like

This compiles

This does not

fn adder2() -> impl FnMut(i32) -> i32 {
    let sum = 0;
    fn inner(x: i32) -> i32 {
        sum + x
    }
    
    inner
}
error[E0434]: can't capture dynamic environment in a fn item
  --> src/lib.rs:12:9
   |
12 |         sum + x
   |         ^^^
   |
   = help: use the `|| { ... }` closure form instead

Perhaps I'm missing a point here but I always thought "fn" was Rust speak for "function".

A "function", mathematically and according to those FP guys, takes parameters and always produces the same output for the same inputs. The output does not depend on any global or not so global state outside of itself, not the time of day or any other I/O.

From this I conclude a "subroutine" (as in BASIC) is not a function. "procedures" (Pascal) and even "functions" in other languages are not really functions at all. "methods" in an OOP like way are not functions.

I further conclude that closures in Rust are not functions and therefore the "fn" syntax would not be appropriate.

If that chain of thought makes any sense then the question is what syntax is appropriate for closures?

The pipe thing seems as good as any to me.

1 Like