# RFC 2229 Stabilization Report
## Links
* [RFC 2229](https://github.com/r…ust-lang/rfcs/blob/master/text/2229-capture-disjoint-fields.md)
* [Tracking issue](https://github.com/rust-lang/rust/issues/53488) and [project repository](https://github.com/rust-lang/project-rfc-2229/)
* [Reference guide PR](https://github.com/rust-lang/reference/pull/1059)
## Impl blockers
* [x] Fixing various ICEs and small bugs that arose from public testing period
* [x] Fixing behavior of `#[nonexhaustive]` captures across crates
## Summary
This feature adjusts the rules for closure captures starting in Rust 2021. In Rust 2018 and before, closures always captured entire local variables. Under these new rules, they capture more precise paths. These paths are always some prefix of the precise paths that are references within the closure body.
As a simple example, in Rust 2018, `|| println!("{}", a.y)` captures a reference to `a` and not just `a.y`. Capturing `a` in its entirety prevents mutation or moves from other fields of `a`, so that code like this does not compile:
```rust,ignore
let a = SomeStruct::new();
drop(a.x); // Move out of one field of the struct
println!("{}", a.y); // Ok: Still use another field of the struct
let c = || println!("{}", a.y); // Error: Tries to capture all of `a`
c();
```
In Rust 2021, however, this closure would only capture a reference to `a.y`, and hence the code above would compile.
Disjoint capture was proposed as part of [RFC 2229](https://github.com/rust-lang/rfcs/blob/master/text/2229-capture-disjoint-fields.md) and the RFC contains details about the motivation.
### Capture algorithm, guide description
#### Ref closures
Ref closures generally capture the precise paths that you use within the closure, subject to a few limitations:
* *Unsafe* parts of paths, such as the deref of a raw pointer or an access to the field of a union, always occur in the closure. The closure will therefore capture the prefix of the path that stops right before the unsafe operation (e.g., it would capture the raw pointer, not the data the raw pointer points at).
* When *moving* particular paths, we only capture data that was present on the stack frame of the closure creator. So if a closure moves data from `(*b).f`, where `b: Box<SomeType>`, we would capture all of `b`. This ensures that closures never capture deceptively large amounts of data that were intentionally being passed via a box.
#### Move closures
Move closures generally try to take ownership of all their captures. However, in the event that the capture they are taking ownership of is borrowed data, they may borrow the data instead:
* When the closure only uses a path by reference, and the path passes through a reference, then the path is captured by reference.
* e.g., a call like `x.0.split(0)` where `x: &([u32], String)` will capture `x.0` by reference, even though this is a move closure (note that in Rust 2018, we would have captured the reference `x` itself, which is ery similar).
* Otherwise, all paths are **moved**, but as above we only capture data that was present on the stack frame of the closure creator. So if a closure moves data from `(*b).f`, where `b: Box<SomeType>`, we would capture all of `b`. This ensures that closures never capture deceptively large amounts of data that were intentionally being passed via a box.
#### Optimizations
In addition to the rules above, the compiler also performs some optimizations to reduce closure size, but these do not affect which programs compile or do not compile, or the order of destructor execution, so they are not relevant to understanding how the program works.
#### Examples
```rust
struct Foo { x: i32 }
fn box_mut() {
let bx = Box::new([0; 1024]);
let c = move || println!("{:?}", *bx);
// Closure captures `bx`, even though the body uses `*bx`,
// because of the rule that we always capture data that resides
// on the creator's stack frame.
}
```
```rust
struct Foo { x: i32 }
fn box_mut() {
let mut s = Foo { x: 0 } ;
let px = &mut s;
let bx = Box::new(px);
let c = move || bx.x += 10;
// Closure captures `&mut (*(*bx)).x`, even though this is a move
// closure, because of the rule that "moves" of borrowed data
// ultimately cature by mutable reference.
}
```
### Capture algorithm, precise description
The precise capture algorithm is [described here](https://github.com/sexxi-goose/reference/blob/closure-doc/src/types/closure.md#overall-capture-analysis-algorithm). The high-level idea is as follows:
1. Collect the set C of all "captures" that appear in the closure body. A capture consists of a place P that is referenced and a mode M:
* The *place P* is something like `(*a).b.c`
* The *mode M* is `move`, `ref`, or `ref mut`
* For example, a closure like `|| x.0.truncate(0)`, where `x: &mut (String, String)`, would capture the place `(*x).0` with the mode `ref mut`.
* This is because `x.0.truncate(0)` is shorthand for `str::truncate(&mut (*x).0, 0)`
* Note that the mode M is independent of whether this is a `move` closure or not
2. Truncate the "captures" (P, M) in the C by applying various transforms:
* `unsafe_check(P, M) -> (P, M)` truncates the place P so that any unsafe operations occur in the closure, not the creator
* e.g., instead of capturing `(*x.f).m`, if `x.f: *mut T`, we would just capture `x.f`
* `ref_opt(P, M) -> (P, M)` truncates dereferences of `&T` references so that we capture the entire `&T` instead of capturing some subfield. This is an optimization that minimizes the size of closures without affecting the set of things that compile.
* e.g., given `x: &(String, String)` and `|| foo(&x.0, &x.1)`, we would minimize the places `(*x).0` and `(*x).1` to just `x`, thus capturing a single reference instead of two
* For a non-move closure:
* `ref_xform`: If this is a "by-value" mode capture, then truncate to the first pointer dereference
* e.g., given `b: Box<[u8; 1024]>` and a by-value capture of `*b`, truncate at `b` so that we capture the box, not its contents
* For a move closure:
* `move_xform`, which performs 3 truncations:
* If this is a `ref mut` mode capture, and the place contains a deref of an `&mut` reference, then leave it unchanged.
* If this is a `ref` mode capture, and the place contains a deref of an `&` reference, then leave it unchanged.
* Else, change to a a `by-value` mode capture, then truncate to the first pointer dereference
3. Minimization: If the set C contains two places (P1, M1) and (P2, M2) where P1 is a prefix of P2, then just capture (P1, max(M1, M2))
* e.g., given `|| { &x; &mut x.y; }` we capture `x` (the prefix of `x` and `x.y`) but with the mode `ref mut` (the greater of the two modes).
#### What data does a closure use?
Closures are considered to reference data under the following conditions:
* Shared borrows of a place (`&a.b.c`) is a `ref` mode access to the place `a.b.c`; shared borrows are often introduced via calls to methods like `a.b.c.split(0)`
* Mutable borrows of a place (`&mut a.b.c`) is a `ref mut` mode access to the place `a.b.c`; mutable borrows are often introduced via calls to methods like `a.b.c.truncate(0)`
* By-value accesses to a place (`drop(x)`) of `Copy` type are considered mode `ref` of the place `x`
* By-value of a place (`drop(x)`) not of `Copy` type are considered a `move` mode access to the place `x`
* Pattern matching against an enum variant `Enum::Variant` is considered a *read of the enum* if:
* the enum has multiple variants
* or the enum is marked as `#[non_exhaustive]` and is defined in another crate (FIXME)
### Implementation strategy
The implementation strategy is to modify the desugaring of closures. Whereas Rust 2018 closures like `|| foo(&mut a.b, &mut c.d, &mut c.e)` desugars to creating an anonymous struct with two fields, one for `a` and one for `c`:
```rust
ClosureStruct { a: &mut a, c: &mut c }
```
The Rust 2021 struct would contain a field for each unique capture:
```rust
ClosureStruct { a_b: &mut a.b, c_d: &mut c.d, c_e: &mut c.e }
```
This change is focused on MIR construction and does not affect the borrow checker in any significant way (there were changes to the diagnostics code).
### Size measurements
We have measured the size of closures on the compiler and other code bases. A broader crater run is in progress. Preliminary measurements suggest that closures are slightly larger in Rust 2021, but not substantially: the average is that closures grow by less than 1 byte. This table breaks down measurements by the size of the original closure:
| Rust 2018 min size | Rust 2018 max size | Number of closures | Average size in 2018 | Average growth in 2021 in bytes | Percentage growth |
| --- | --- | --- | --- | --- | --- |
| 0 | 16 | 17039 | 3.88 | 0.22 | 5.56%
| 16 | 32 | 4105 | 17.90 | 0.76 | 4.26%
| 32 | 64 | 1700 | 40.79 | 0.86 | 2.11%
| 64 | 9999999 | 850 | 99.40 | -0.26 | -0.27% |
| 0 | 9999999 | 23694 | 12.39 | 0.34 | 2.74% |
## How the RFC's unresolved questions were resolved
The RFC posed the following unresolved questions:
> How to optimize pointers. Can borrows that all reference parts of the same
> object be stored as a single pointer? How should this optimization be
> implemented (e.g. a special `repr`, refinement typing)?
We use a naive algorithm apart from a single optimization which captures an `&` reference instead of creating shared references to the individual fields. Our measurements found that this optimization is sufficient to ensure that closures are generally the same size or smaller in Rust 2018 and avoided any complex changes to the borrow checker.
> How to signal that a function is pure. Is this even needed/wanted? Any other
> places where the language could benefit?
We do not signal that functions are pure, which means that overloaded derefs and deref-mut implementations capture the base pointer. The one exception is `Box`, as described in the next question.
> Should `Box` be special?
Yes, we treat `Box` specially. This is consistent with the borrow checker which genreally treats `Box` as special. This could in the future be generalized with some sort of "deref pure" trait but we left that for future work.
> Drop order can change as a result of this RFC, is this a real stability
> problem? How should this be resolved?
This was resolved via edition migrations. Migratons were also used to address some other cases that arose, such as the fact that closures may not implement different auto traits as a result of capturing different things.
## Test cases and testing
* Public testing of the 2021 Edition and migration
* Unit tests in the [2229_closure_analysis](https://github.com/rust-lang/rust/tree/30a0a9b694cde95cbab863f7ef4d554f0f46b606/src/test/ui/closures/2229_closure_analysis) directory:
* The internal `#[rustc_capture_analysis]` annotation causes rustc to dump debugging output about what is captured and why, allowing us to test the model directly.
* For example, [this test](https://github.com/rust-lang/rust/blob/30a0a9b694cde95cbab863f7ef4d554f0f46b606/src/test/ui/closures/2229_closure_analysis/optimization/edge_case.rs) covers an edge case around the `&T` capture optimization, and the [stderr file](https://github.com/rust-lang/rust/blob/30a0a9b694cde95cbab863f7ef4d554f0f46b606/src/test/ui/closures/2229_closure_analysis/optimization/edge_case.stderr#L28) identifies the precise path that is captured.