Sorry if this is a duplicate, but I like the "keywords" I've showcased in this i…ssue. Other related issues:
- https://github.com/rust-lang/unsafe-code-guidelines/issues/285
- https://github.com/rust-lang/unsafe-code-guidelines/issues/287
- https://github.com/rust-lang/unsafe-code-guidelines/issues/256 <- this is a pervasive instance of that.
- https://github.com/rust-lang/unsafe-code-guidelines/issues/243
- https://github.com/rust-lang/unsafe-code-guidelines/issues/134 <- canonical issue?
- https://github.com/rust-lang/unsafe-code-guidelines/issues/52
Note that the quite loaded term "provenance" is being used here as described mainly in https://github.com/rust-lang/unsafe-code-guidelines/issues/134.
___
## Unleaking
The stdlib libs docs [currently state, regarding `Box::leak`](https://doc.rust-lang.org/1.57.0/std/boxed/struct.Box.html#method.leak):
> Dropping the returned reference \[return value of `Box::leak()`\] will cause a memory leak. If this is not acceptable, the reference should first be wrapped with the `Box::from_raw` function producing a `Box`. This `Box` can then be dropped which will properly destroy `T` and release the allocated memory.
So, even if there is no code snippet, such statement is stating that:
```rust
fn drop_box<T> (boxed: Box<T>)
{
drop(unsafe { Box::from_raw(Box::leak(boxed)) });
}
```
is sound, no matter the `alloc::Global` backing it.
- <details><summary>A far-fetched / contrived generalization to any <code>impl Allocator</code></summary>
```rust
fn drop_box_in<T, A : Allocator> (boxed: Box<T, A>)
{
let (mut ptr, alloc) = Box::into_raw_with_allocator(x);
unsafe {
ptr = &mut *ptr;
drop(Box::from_raw_in(ptr, alloc));
}
}
```
</details>
Which, given `Box`'s implementation, is assuming that if somebody asks an `impl GlobalAlloc` —or an `impl Alloc` if generalizing— memory for a `Layout::new::<T>()` (through `alloc` or `realloc`), and gets back a non-null pointer `ptr`, then it is then legal to give back `ptr` to that `impl Alloc`'s `dealloc` (or `realloc`), but with `ptr`'s provenance having been "shrunk" down to that `T`'s layout (_e.g._, through `ptr = <*mut _>::cast(&mut *ptr.cast::<MaybeUninit<T>>());`).
This, in practice, can be quite problematic for many (most?) `GlobalAlloc` implementations out there, since they do often perform some bookkeeping and whatnot laid out contiguously to the yielded allocation, and such metadata would thus not be accessible from such a returned pointer alone: the allocator would thus need to keep some extra data / state to be able to get back provenance over the user-facing allocation _and_ the contiguous metadata.
### A simplified example
```rust
use ::std::{
alloc::{self, GlobalAlloc as _, Layout},
mem::drop as stuff,
ptr,
};
unsafe trait AllocI32 {
fn alloc_i32() -> Option<ptr::NonNull<i32>>;
unsafe
fn dealloc(_: ptr::NonNull<i32>);
}
struct MyAllocUsingGlobalAllocUnderTheHood;
#[repr(C)]
struct I32AndMeta {
alloc: i32,
meta: u8,
}
static SYSTEM_ALLOC: alloc::System = alloc::System;
unsafe impl AllocI32 for MyAllocUsingGlobalAllocUnderTheHood {
fn alloc_i32() -> Option<ptr::NonNull<i32>> {
let layout = Layout::new::<I32AndMeta>();
ptr::NonNull::new(unsafe { SYSTEM_ALLOC.alloc_zeroed(layout) })
.map(ptr::NonNull::cast)
}
unsafe fn dealloc(ptr: ptr::NonNull<i32>) {
let layout = Layout::new::<I32AndMeta>();
let meta = ptr.as_ptr().cast::<u8>().add(4).read();
stuff(meta);
SYSTEM_ALLOC.dealloc(ptr.as_ptr().cast(), layout);
}
}
```
The interesting lines here are:
```rust
unsafe fn dealloc(ptr: ptr::NonNull<i32>) {
let layout = Layout::new::<I32AndMeta>();
let meta = ptr.as_ptr().cast::<u8>().add(4).read();
```
if `ptr` were to stem from `&i32` (_e.g._, `let r: &i32 = …; dealloc(r.into());`), even if that `&i32` had originated from an `alloc()`-yielded ptr (`let r: &i32 = alloc().unwrap().as_ref();`), then the operation reading `meta` would not be well-defined: `r.add(4)` would yield an off-by-one pointer, which would not be usable to perform a read-dereference with.
___
### The two possible workarounds
In a world without any abstraction whatsoever, the answer to this problem is easy: keep a pointer with provenance over that allocated `I32AndMeta` around (such as the `ptr` returned by `alloc` itself), and use it to "launder" the received ptr. But since there is this `Alloc` / `GlobalAlloc` boundary, the question remains: who should be responsible for doing this?
- Would it be the `Alloc`ator, as in:
```diff
unsafe fn dealloc(ptr: ptr::NonNull<i32>) {
let layout = Layout::new::<I32AndMeta>();
let ptr = ptr.as_ptr().cast::<u8>();
+ let ptr_with_provenance: *mut u8 = Self::get_ptr_with_provenance(…);
+ // amend `ptr` so that it points to the same place, but with a fixed provenance
+ let ptr = <*mut u8>::wrapping_offset(
+ ptr_with_provenance,
+ isize::wrapping_sub(
+ ptr.as_ptr() as _,
+ ptr_with_provenance as _,
+ ),
+ );
let meta = ptr.add(4).read(); // <- this read is now well-defined!
stuff(meta);
SYSTEM_ALLOC.dealloc(ptr.as_ptr().cast(), layout);
}
```
- or would it be the user of the `Alloc`ator, **by declaring "unleaking" to be a contract violation** / **by requiring that a pointer with the originally-obtained provenance be the only valid input for a `{de,re}alloc` call?**
```rust
pub struct MutRefWithOriginalProvenance<'lt, T> {
ptr: ptr::NonNull<T>,
_phantom: PhantomData<&'lt mut T>,
}
impl<T> Deref{,Mut} for … { type Target = T; … }
impl… MutRefWithOriginalProvenance… {
pub fn into_raw…
}
```
```rust
let mut r: MutRefWithOriginalProvenance<'static, i32> = Box::reversible_leak(Box::new(42));
*r += 27;
unsafe {
drop(Box::from_raw(r.into_raw()));
}
```
This point ought to be clarified, and if going for the latter —or until confirming the former—, then the stdlib docs should be updated to actually disincentivize unleaking.
___
#### My potentially-obvious two cents
It feels like the "legalized unleaking" approach has the drawback of requiring that extra `get_ptr_with_provenance(…)` operation, which could come with a non-negligible cost for _all_ `GlobalAlloc` implementations, only to allow a potentially deemed niche "unleaking" operation.
But it also feels like "forbidden unleaking" approach is quite a footgun, if, for instance, even the stdlib docs have gotten it wrong for such a long time.
So this seems like the classic "let's gauge/measure the performance benefits of 'forbidden unleaking' / the performance cost of 'legalized unleaking'" to compare them against the footguns that forbidding it introduces.
Finally, and this is technically beyond Rust's reach, there is also the question of non-Rust pervasive implementations of `GlobalAlloc`, such as that of `libc` (`malloc`, `calloc`, `realloc`, `free`) and whatnot. Such implementations do use metadata, and according to @chorman0773, the cost of a `get_ptr_with_provenance(…)` operation would be very much non-negligible (and, technically, even more so since Rust cannot go and tweak such an implementation, and would thus have to wrap it in a black-box API kind of fashion).
- So, _from the looks of it_ / IIUC, the only _practical_ approach _w.r.t._ a legalized unleaking would be to ban `malloc` _& friends_ from being used for `GlobalAlloc` implementations! But I may very well be wrong; I'll let @chorman0773 (and others) chime in and clarify this _hypothetical_ point (although if this were to be true, then I guess there is really no other choice than forbidding unleaking).
- <sub><sup>This, in turn, could be deemed inconvenient for FFI users which would have chosen a `malloc`-powered `#[global_allocator]` to allow `free`-ing to occur from the C side, but this is yet another topic…</sup></sub>