Why isn't Box::into_raw unsafe?

This question is just to satisfy my curiosity and help me better understand Rust's way of thinking. There is no programming problem behind it; I think I understand what Box::into_raw does.
But I was wondering why it is not declared to be unsafe. Calling it without later calling Box::from_raw (or doing something similar) creates a memory leak - the data inside the box is never properly dropped.
If someone can explain the rationale, I would greatly appreciate that.

Memory leaks and the creation of raw pointers are not considered unsafe.

Memory leaks are not unsafe because it's virtually impossible to prevent safe code from causing memory leaks. The classic example is a cycle of reference-counted objects.
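To make the reference-counting case concrete, here is a minimal sketch of a leak written entirely in safe code. The Node type and the dropped_nodes helper are made up for illustration; the drop counter is only there so the leak can be observed.

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

// Hypothetical node type: it may point at another node and records
// every destructor run in a shared counter.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
    drops: Rc<Cell<u32>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Builds two nodes, optionally links them into a cycle, lets both
// handles go out of scope, and reports how many destructors ran.
fn dropped_nodes(make_cycle: bool) -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let a = Rc::new(Node {
            next: RefCell::new(None),
            drops: Rc::clone(&drops),
        });
        let b = Rc::new(Node {
            next: RefCell::new(Some(Rc::clone(&a))),
            drops: Rc::clone(&drops),
        });
        if make_cycle {
            // a -> b -> a: neither strong count can reach zero again.
            *a.next.borrow_mut() = Some(Rc::clone(&b));
        }
    }
    drops.get()
}

fn main() {
    println!("without cycle: {} nodes dropped", dropped_nodes(false));
    println!("with cycle:    {} nodes dropped", dropped_nodes(true));
}
```

There is no unsafe anywhere, yet in the cyclic case both nodes stay allocated after their last handle is gone. Using Weak for one of the links is the usual way to avoid this.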

Creation of pointers is not unsafe because the unsafe part is using them.
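A small sketch of where the unsafe boundary actually sits (round_trip is a made-up helper name): producing the raw pointer needs no unsafe block, while dereferencing it or turning it back into a Box does.

```rust
// Hypothetical helper: moves a value through a raw pointer and back.
fn round_trip(value: i32) -> i32 {
    let boxed = Box::new(value);

    // Safe: this only gives up ownership and hands back a raw pointer.
    let raw: *mut i32 = Box::into_raw(boxed);

    // Unsafe: the compiler cannot check that `raw` is still valid.
    unsafe {
        *raw += 1;
    }

    // Unsafe: the caller promises the pointer came from Box::into_raw
    // and is turned back into a Box at most once.
    let restored = unsafe { Box::from_raw(raw) };
    *restored
}

fn main() {
    assert_eq!(round_trip(41), 42);

    // Safe as well: this allocation is simply never freed (a leak).
    let _leaked: *mut i32 = Box::into_raw(Box::new(7));
}
```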


That makes sense - thanks for the explanation!

Also, memory leaks are not considered unsafe because… well, they aren't. What kind of intrinsic unsafety can some allocated memory, that is otherwise used correctly, cause? It won't cause corruption of memory, undefined behavior, data races… (of course it's annoying and might even have security implications, but that is outside the scope of unsafe and the kind of memory safety Rust commits to.)


Well we have in the past wanted to consider it unsafe because no memory leaks is a nice guarantee to have, but it just turned out to be impossible to make that guarantee.

But my point is that "no memory leaks is nice to have" is not the same as "memory leaks are unsafe". Memory leaks aren't unsafe. Granted, they can be logical bugs, but regardless of whether someone wanted to consider them unsafe because of some guarantees, they are not memory-unsafe as defined by Rust.

Sure, but what Rust defines as memory-unsafe had to be decided at some point in the past.


It's only obvious in hindsight because the definition of unsafe has been crystalized and well established, in part because of discussions around memory leaks. The catch_unwind API even had some proponents in favor of declaring it unsafe, even though exception safety is different from memory safety, because they saw unsafe as a roadblock to using the API. Folks generally agreed with the motivation, but even by then, unsafe had crystalized to being for memory safety only. This is how the UnwindSafe traits were born.

(I'm going on my memory here, without carefully reviewing these discussions.)

I want to put a finer point on this: most code does not leak; that is, it is not virtually impossible to write safe code that doesn't leak. What is virtually impossible is stopping someone who wants to create a leak using only safe code, because of a variety of things (like the refcount cycle you mention).

Back before the "leakpocalypse," Rust used to define leaks as UB. During the rush to the Rust 1.0 release, it was discovered how easily reference cycles could be created in safe code. In the aftermath, std::mem::forget was marked safe, and tragically, we had to give up std::thread::scoped, which relied on freedom from leaks for memory safety.
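As a rough illustration of why forget being safe matters for such APIs, here is a sketch with a made-up Guard type, a stand-in for the kind of guard a scoped-thread API might hand back. Its destructor does some cleanup that other code might be tempted to rely on; destructor_ran is likewise a hypothetical helper.

```rust
use std::cell::Cell;
use std::mem;

// Hypothetical guard: its destructor flips a flag, standing in for
// cleanup such as joining a thread.
struct Guard<'a> {
    ran: &'a Cell<bool>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.ran.set(true);
    }
}

// Reports whether the guard's destructor ran.
fn destructor_ran(forget_it: bool) -> bool {
    let ran = Cell::new(false);
    let guard = Guard { ran: &ran };
    if forget_it {
        // Entirely safe: the value is discarded without running Drop.
        mem::forget(guard);
    } else {
        drop(guard);
    }
    ran.get()
}

fn main() {
    println!("dropped normally: {}", destructor_ran(false));
    println!("forgotten:        {}", destructor_ran(true));
}
```

Since safe code can skip the destructor like this, unsafe code cannot treat "the destructor will run" as an invariant that memory safety depends on.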


In this case, I'm very glad leaks have (eventually) been un-undefined. In fact, coming from a background whereby I've implemented smaller and larger safe dynamically-typed languages, where leaks due to reference counting GC are possible, I'd pretty much expect leaks to be a "logical" error (for the lack of a better word) and saying that they are undefined would seem almost nonsensical to me. (On a related note: one doesn't even have to reach into the realm of threads to create a leak, since Rust too has Rc – I'm not sure if it was already there at the time the discussion you cited took place, though.)

By the way, I certainly appreciate that there are APIs whose implementation would rely on leak-freedom for safety; however, in my opinion, leak-freedom is not special in this regard. I suspect thread::scoped() contained other bits of unsafe code as well, didn't it? In this case I view not leaking as just another invariant that needs to be maintained (or its absence accounted for), as is always the case with unsafe code.

To draw a parallel, let's have a look at one of my favorite examples of caveats when working with unsafe: the effectful AsRef impl.

use std::cell::Cell;

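fn print_bytes_unsafely<T: AsRef<[u8]>>(bytes: &T) {
    // Deliberately flawed: as_ref() is called twice, and the code assumes
    // both calls return the same slice.
    let ptr = bytes.as_ref().as_ptr();
    let len = bytes.as_ref().len();

    for i in 0..len as isize {
        // Reads out of bounds if `len` belongs to a longer slice than `ptr`.
        let byte = unsafe { *ptr.offset(i) };
        println!(" {:02x}", byte);
    }
}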

static GLOBAL_ARRAY: [u8; 128] = [0; 128];

struct Evil {
    array: [u8; 1],
    flag: Cell<bool>,
}

impl AsRef<[u8]> for Evil {
    fn as_ref(&self) -> &[u8] {
        match self.flag.replace(true) {
            false => &self.array,
            true => &GLOBAL_ARRAY,
        }
    }
}

fn main() {
    let evil = Evil {
        array: [42],
        flag: Cell::new(false),
    };
    print_bytes_unsafely(&evil);
}

Here, Evil has a non-const impl for AsRef, more precisely, it returns a slice into the contained array on the first invocation but a slice into a global, longer array upon the second and subsequent invocations. It is entirely safe code but it breaks the unsafe code, driving it into undefined behavior, because said unsafe code failed to account for such absurd implementations of AsRef.

To me, this means that the bug is distributed: AsRef is not supposed to be effectful, and if I ever encountered someone writing this implementation in practice, I'd not be nice to them, but the unsafe code is not considerate enough either. However, while the bug might be distributed, the unsafety and the blame aren't; if we are being strict, we could say that only the unsafe code is at fault for not being defensive against potentially evil (or in practice, just tricky/wrong/buggy) safe code.
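For comparison, here is one possible defensive version, as a sketch rather than the only fix. It calls as_ref() exactly once, so whatever slice comes back, the pointer and the length are guaranteed to belong together. The name bytes_as_hex is made up, and it returns the formatted bytes instead of printing them so the result is easy to check; Evil is repeated so the snippet stands alone.

```rust
use std::cell::Cell;

static GLOBAL_ARRAY: [u8; 128] = [0; 128];

struct Evil {
    array: [u8; 1],
    flag: Cell<bool>,
}

impl AsRef<[u8]> for Evil {
    fn as_ref(&self) -> &[u8] {
        match self.flag.replace(true) {
            false => &self.array,
            true => &GLOBAL_ARRAY,
        }
    }
}

// Hypothetical defensive counterpart of print_bytes_unsafely.
fn bytes_as_hex<T: AsRef<[u8]>>(bytes: &T) -> Vec<String> {
    // One call only: pointer and length come from the same slice.
    let slice: &[u8] = bytes.as_ref();
    slice.iter().map(|byte| format!("{:02x}", byte)).collect()
}

fn main() {
    let evil = Evil {
        array: [42],
        flag: Cell::new(false),
    };
    for hex in bytes_as_hex(&evil) {
        println!(" {}", hex);
    }
}
```

The Evil impl is still surprising (a second call to bytes_as_hex sees the 128-byte global array), but that is now only a logic oddity and no longer undefined behavior.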

Just as we wouldn't reasonably write such an AsRef impl, we wouldn't usually write memory leaks on purpose, either… thus, they are comparable from the point of view of the corresponding bits of unsafe code (the unsafe print function and thread::scoped, respectively).

I guess this might be a philosophical question at this point, because definitions of "safety" are arbitrary. However, I think leaks being defined matches general intuition way better.


Since this is pretty philosophical, I think the many possible definitions of "leak" are just as important to bring up as the possible definitions of "safe". The deep yet simple reason why "memory leaks" in the broadest sense of the term are safe is that a "leak" is a matter of intent. It's allocating memory you didn't intend to use, or failing to free it once you no longer intend to use it. While a programming language can do a lot to encourage fewer leaks in practice (like Drop and the unused-* lints), it'll obviously always be possible to create a large object and then do just enough with it to defeat the lints, but not enough to justify its largeness.

But there's also a popular narrower sense, along the lines of "all memory gets freed by the end of the scope it was allocated in". That's what the leakpocalypse talk was all about, and what most of the other posts in this thread seem to be using. We probably could have defined that notion of "scope leak" precisely enough to make it a guaranteed unsafe-only thing, but that would've only been possible by forbidding any reference-counted APIs in safe Rust. So the answer to "why are scope leaks not unsafe?" is in a nutshell "we decided that making APIs like Rc and Arc usable in safe Rust was the bigger win, and you can't have both."

