Inner functions not closed over their environment?


#1

It is my understanding that functions defined within functions are not closed over their lexical environment (unlike Scheme, for example), whereas anonymous functions (lambdas) so defined are closed. This seems like a very odd inconsistency in the language.

If my understanding is wrong, please set me straight. Otherwise, can someone explain the reason for this?

Thanks –
Don Allen


#2

Items inside functions are (so far as I know) exactly equivalent to the same items written outside the function, except that you can’t name them from the outside. In other words, they behave as though they’re hoisted out of the containing function: for example, inner items (including functions, consts and other type definitions) can’t use generic parameters from the outer function.
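A minimal sketch of that behaviour, with hypothetical names (`outer`, `add`, `add_closure`). The inner `fn` has to be handed `offset` explicitly, while the closure next to it can capture it:

```rust
// Hypothetical names throughout; this only illustrates the rule described above.
fn outer(offset: i32) -> (i32, i32) {
    // An inner `fn` is an item: it sees its own parameters and other items,
    // but not the local variables of `outer`, so `offset` is passed in.
    fn add(x: i32, offset: i32) -> i32 {
        x + offset
    }

    // Uncommenting the next line is rejected by the compiler with E0434
    // ("can't capture dynamic environment in a fn item"):
    // fn add_captured(x: i32) -> i32 { x + offset }

    // A closure, by contrast, may capture `offset` from the enclosing scope.
    let add_closure = |x: i32| x + offset;

    (add(1, offset), add_closure(1))
}
```

Calling `outer(10)` gives the same result through both paths; only the closure got there by capturing.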

As for why, I honestly don’t know. Might be a deliberate design choice, might be implementation convenience, might be just the way it happened. It’s not really a big deal, either way. I think, on balance, it’d be slightly more bothersome to have them capture like closures do.


#3

They’re functions, not closures, so they don’t close over anything. Functions don’t close, closures close. I see this as consistency, not inconsistency.


#4

You are simply repeating the way functions and closures are defined in your book and as they are implemented in Rust. Your point about consistency is that the language is consistent with itself, which is not terribly helpful.

My point is that other languages, such as Scheme, Python and even C (at least GNU C), do not make this distinction. Functions in those languages can reference free variables in their lexical scope. The point of my original post is that I find the Rust distinction odd – I don’t understand why this design choice was made.


#5

The fact that a fn does not close over an environment also allows it to have these properties:

  • A fn has an anonymous type that is zero-sized.
  • A fn can be coerced to a type like fn(T) -> U that is represented as a function pointer.

These representation details tend to matter more for Rust than for languages like Scheme and JavaScript that have fewer distinctions between functions and closures. They affect things like the generated code for functions that are generic over Fn traits.
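A small sketch of those two properties, using hypothetical names (`double`, `apply`, `sizes`). The exact size of the capturing closure is a layout detail, so treat that number as "at least the size of what it captured" rather than a guarantee:

```rust
use std::mem::size_of_val;

// `double`, `apply` and `sizes` are hypothetical names for this sketch.
fn double(x: i32) -> i32 {
    x * 2
}

// Generic over `Fn`: each distinct `F` gets its own monomorphized copy,
// so passing the zero-sized fn item lets the call be resolved statically.
fn apply<F: Fn(i32) -> i32>(f: F, x: i32) -> i32 {
    f(x)
}

// Returns the sizes of (fn item, fn pointer, capturing closure).
fn sizes() -> (usize, usize, usize) {
    let offset: i32 = 5;
    // This closure stores a copy of `offset` inside itself.
    let capturing = move |x: i32| x + offset;
    // Coercing the fn item to a plain function pointer.
    let pointer: fn(i32) -> i32 = double;
    (
        size_of_val(&double),    // the fn item's own anonymous type
        size_of_val(&pointer),   // an ordinary pointer-sized value
        size_of_val(&capturing), // has to hold the captured `offset`
    )
}
```

`apply` accepts both `double` and a closure; the difference only shows up in what the compiler has to store and pass around.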


#6

I’m not sure I completely understand your message, but it suggests to me that perhaps this was done as a result of the Rust approach to memory management, which is different from every other language that does not make the distinction between functions and closures?


#7

Lots of languages box every value, so they can ignore distinctions between functions that have associated data and ones that don’t, since the extra data is hidden behind a pointer. In Rust, there are things that you can do with a non-capturing function that you can’t do with a capturing function, so it makes sense to keep a distinction.
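One example of such a difference, sketched with hypothetical names (`inc`, `neg`, `OPS`, `run_all`, `closure_as_pointer`, `boxed_adder`): anything without an environment fits in a plain `fn` pointer and can sit in a `static` table, while a capturing closure needs generics or a box to carry its data.

```rust
// All names here are hypothetical; this is a sketch, not the only way to do it.
fn inc(x: i32) -> i32 {
    x + 1
}

fn neg(x: i32) -> i32 {
    -x
}

// Function pointers carry no environment, so they can live in a static table.
static OPS: [fn(i32) -> i32; 2] = [inc, neg];

fn run_all(x: i32) -> Vec<i32> {
    OPS.iter().map(|op| op(x)).collect()
}

// A closure that captures nothing also coerces to a function pointer.
fn closure_as_pointer() -> fn(i32) -> i32 {
    |x| x * x
}

// A capturing closure does not: `let p: fn(i32) -> i32 = move |x| x + n;`
// fails to compile. Here the captured `n` is carried behind a Box instead.
fn boxed_adder(n: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |x| x + n)
}
```

The boxed version is roughly what languages that box everything do for every function, which is why they never need to surface the distinction.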


#8

Ok, this is beginning to make some sense and confirms my thought that this was somehow related to Rust’s approach to memory management. I can well believe that things become possible or practical in an environment (pun intended) where everything is in the heap, garbage-collected as necessary, that don’t make sense in Rust.


#9

It’s not just in Rust; for example, look at Wikipedia on closures:

a closure is a record storing a function together with an environment

and

A closure—unlike a plain function—allows the function to access those captured variables through the closure’s copies of their values or references, even when the function is invoked outside their scope.

“Closures are functions + an environment” is the way I’ve seen them explained almost everywhere.

I was not aware C let you do this; the docs describe it as a GNU extension, which makes sense.


#10

I am not talking about academic definitions; I’m talking about real-world language implementations, the vast majority of which do not make the distinction. And especially don’t implement lambdas one way and named functions another.

As for C, even vanilla C, as described in K&R or Harbison and Steele, functions close over their environment, but in a very restricted sense. In vanilla C, the “environment” is the top level, the globals. And, in a way, Rust does the same thing with its top-level functions, from which you can reference “statics”, Rust’s global variables (though unsafe if mutable, as you discuss in the book).
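To make that concrete, here is a small sketch with hypothetical names (`GREETING`, `CALLS`, `greet`, `record_call`). The single-threaded assumption in the safety comment is mine, not something the language checks:

```rust
// Hypothetical statics and functions, for illustration only.
static GREETING: &str = "hello";
static mut CALLS: u32 = 0;

// Reading an immutable static from an ordinary function needs no ceremony.
fn greet() -> &'static str {
    GREETING
}

// Touching a mutable static requires `unsafe`.
fn record_call() -> u32 {
    // SAFETY: assumes this is never called from more than one thread at a
    // time; concurrent calls would be a data race.
    unsafe {
        CALLS += 1;
        CALLS
    }
}
```

In real code an atomic or a lock is usually a better fit than `static mut`; the point here is only that a top-level `fn` can see globals without capturing anything.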

I think part of the answer to my original question is rooted in the fact that Rust is a different class of language than Scheme or Python. I tend to think of Rust as being a modern-day replacement for C, solving a lot of the problems in C without sacrificing much, if any, of C’s performance. Neither Scheme nor Python aspires to that. It’s also statically typed, which is not true of Scheme or Python. So my comparison of what is done in Rust to Scheme was somewhat apples to oranges.


#11

And it has this nice gem:

If you try to call the nested function through its address after the containing function exits, all hell breaks loose. If you try to call it after a containing scope level exits, and if it refers to some of the variables that are no longer in scope, you may be lucky, but it’s not wise to take the risk. If, however, the nested function does not refer to anything that has gone out of scope, you should be safe.

That’s the sort of red flag that Rust explicitly forbids, but closures might safely get away with this if you can satisfy the borrow checker, especially using move ||.
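A sketch of the safe version, with a hypothetical `make_counter`: the closure is returned after the containing function has exited, and `move` makes it own its captured state instead of pointing into a dead stack frame.

```rust
// `make_counter` is a hypothetical name for this sketch.
fn make_counter(start: u32) -> impl FnMut() -> u32 {
    let mut count = start;
    // `move` transfers ownership of `count` into the closure. Without it the
    // closure would borrow a local that is about to go away, and the compiler
    // rejects that with E0373 ("closure may outlive the current function").
    move || {
        count += 1;
        count
    }
}
```

Each call to `make_counter` produces an independent counter, and calling it long after `make_counter` has returned is fine because nothing it uses lives on the old stack frame.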


#12

I’m not really either; even in industry, this is always how I’ve heard them talked about.

And to be clear, I’m not trying to have a fight here; I appreciate you bringing up a point of view that might lead to some confusion. It’s always better to be more clear, and we can always do more to improve the docs to make things more clear. I’m trying to dig in to figure out where the disagreement is here; it appears that we pretty much have had the exact opposite understanding of things, or at least, how they’ve been presented to us in the past.

I think that this might be some of it; for example, in Ruby, we have somewhere between four and seven types of closures depending on how you categorize them (blocks, procs, lambdas, and methods), and they do have different syntax.


#13

I agree with you that our differing perspectives could well come from different experiences. I’ve written code in a lot of languages over 56 years (I wrote my first program in IBM 1620 assembly language in 1960; also some Fortran at that time). Much of the early part of my career was writing OS code when that kind of thing was written in assembly language (I worked on CP for the 360/67 and later ran the Tenex project at BBN, so wrote a lot of PDP-10 assembly code). But among the higher level languages, Lisp, and later, and particularly, Scheme, have always been favorites. I’ve felt for many years that Jerry Sussman and Guy Steele found a minimal basis set in Scheme – a set of principles that once learned, let you pretty much derive a piece of the language that you didn’t have at your fingertips. So I come to this from a very Scheme-y perspective.

You mention Ruby. While I know Python and TCL well, I don’t know Ruby (or Perl; I loved Paul Graham’s comment that Perl programs look like a cartoon character cursing). Apparently Ruby has influenced your thinking as Scheme has influenced mine. I’ve also written a fair amount of Haskell in recent years and admire a lot about that language and its implementation (the GHC compiler is a work of art in my opinion). I’ve also written a great deal of C over many years, and while it certainly has its warts, it is so well implemented these days that it has become eminently usable (and I speak from the perspective of someone who suffered through writing C code on an overloaded Vax 780 running 4.2 BSD 30+ years ago; back then, the C and Unix motto was “it was hard to build; it ought to be hard to use”). I’ve also written a tiny bit of C++ and consider the language an absolute abomination. There’s also some Pascal code in my background, and PL/1 and Fortran, though both are ancient history.

Maybe you didn’t want it, but that gives you an idea of the influences on how I view programming languages.


#14

On the other hand, Rust also distinguishes different types of closures: Fn, FnMut, FnOnce, yet the syntax to define them is the same. When I learnt Rust I thought it was the same thing for functions and closures: I thought if you didn’t close over the environment, it was inferred to be a function, else it was a closure, and that the syntax difference between function and closure definition was just syntactic sugar. And I think it could work this way, actually (though I’m not persuaded it would be a good idea).
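A small sketch of that, with hypothetical names (`call_fn`, `call_fn_mut`, `call_fn_once`, `four`, `demo`). All three closures are written with the same `||` syntax; which traits each one implements is inferred from how its body uses the captures:

```rust
// Hypothetical helpers, each demanding a different closure trait.
fn call_fn<F: Fn() -> usize>(f: F) -> usize {
    f() + f()
}

fn call_fn_mut<F: FnMut() -> usize>(mut f: F) -> usize {
    f();
    f()
}

fn call_fn_once<F: FnOnce() -> String>(f: F) -> String {
    f()
}

// A plain function also implements the Fn traits.
fn four() -> usize {
    4
}

fn demo() -> (usize, usize, usize, String) {
    let name = String::from("rust");

    // Only reads its capture: implements Fn (and FnMut, FnOnce).
    let reads = || name.len();
    let a = call_fn(reads);

    // Mutates its capture: implements FnMut (and FnOnce), but not Fn.
    let mut calls = 0;
    let counts = || {
        calls += 1;
        calls
    };
    let b = call_fn_mut(counts);

    // Gives away its capture: implements FnOnce only.
    let consumes = move || name;
    let c = call_fn_once(consumes);

    (a, b, call_fn(four), c)
}
```

So the inference described above does exist for the three closure traits; what is not inferred is whether something written with `fn` captures, because it never does.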


#15

That honestly seems unhelpful as a way to understand what closures are doing, even if there’s no technical reason not to consider globals to be an “environment”. Scheme and other functional languages seem like the right way to understand closures, but of course Rust is not itself a functional language.

So I think the key point is that if you consider Rust to be primarily a member of C-like low-level imperative languages, then the concept of closures is a special borrowed concept from another language family, so you shouldn’t expect it to work the same way normal functions work.

Perhaps the other way to look at it is to consider normal functions (i.e. the ones that are convertible to function pointers) to be a “borrowed” concept from imperative languages supporting function pointers.