Is it safe to drop_in_place a pointer to a dangling ref?

It 100% causes undefined behavior every time. They are the same thing. UB == instant undefined behavior. I don't know how to state it any clearer.

"Does what you hoped with the current compilation" is a subset of undefined behavior, because undefined behavior can do anything if compilation succeeds.

What you probably actually mean is "causes a miscompilation from my POV (i.e. doesn't do what you hoped with the current compilation)". That is not the same thing as undefined behavior. You 100% have undefined behavior. UB and undefined behavior are the same thing. Stop calling the "doesn't do what you hoped" thing "undefined behavior". You need a different terminology if you're going to be able to communicate effectively.

[1]

You think you do, as thousands before have as well. Time may prove you wrong.

There are no guarantees about what will "never really happen". That's the point. There's a long history of more UB consequences surfacing as compilers become more aggressive over time.

People who thought they knew better are the cause of untold numbers of bugs and fallout, with more every day. Changing that paradigm is one of the major motivations and benefits of Rust.


Alright, now let's say you don't give a hoot about any of that, and you're just looking for an opinion of what you can "get away with" with the current compiler (on your platform with your flags yadda yadda).

For any personal project or exploration, that's fine, knock yourself out! [2] But you'll have to be very upfront with your stance on forums like this, and probably won't find much help.

I think this is the best I've said it:

I get it, but I think you're going to keep getting friction, for at least two distinct reasons:

  • One is cultural -- preferring "definitely sound" to "happens to work today" is pretty entwined with the general safety guarantees of Rust, which are highly valued. Seems likely that you'll need to get past this part of the conversation every time. (This also leads to people not wanting "quick-and-dirty like C" to be common in Rust, as that would weaken its "safer and more reliable" associations.)

  • Conflation of terminology: you led with "Is it safe", and in a Rust context, this is almost always going to be taken as "is it sound". Similarly, UB is usually not considered a what-it-compiled-to term, as was already hashed out above.

The latter could be mitigated by being more explicit or using different terminology. But you'll also probably never get a very satisfying answer, due to the former cultural considerations -- people are a lot less interested in "getting away with" unsoundness compared to C, and thus don't know or necessarily even care when you can do it, because "you shouldn't".

Now, if you want to expand this attitude to some publicly used crate, you'll likely experience backlash and rejection. You're swimming strongly against the cultural tide.


  1. Incidentally this is also all true in C, there's just a large ignorance/sea of denial about C switching to an abstract machine model long ago, widespread use of compiler extensions to redefine what's UB thus creating different C dialects, a cultural acceptance/conflation between "it compiled ok" and UB, etc. ↩︎

  2. IMO, some will still deride you. ↩︎


Undefined behavior, abbreviated as UB, is all too often misunderstood. When we say that a program with UB may do anything, we are not suggesting that our computers are omnipotent. Most likely, your computer does not have the kind of hardware needed to make nasal demons.

What we do mean by "may do anything" is that in terms of reliability, you don't need rustc or any such compilers to "compile" an invalid program. At that point, you may as well use the one that's installed on most computers by default, usually on /dev/random. Come on, it works on my machine!

…And that's why it's wrong. You probably want a compiler that's more reliable than the infinite monkeys. That's what rustc is, on the condition that you do not feed it UB.


You need to distinguish here. If a call is UB and you never do the call, that's sound.

if false {
    drop_in_place(p);
}

This is always okay to do.

Assuming okay_to_drop_p is true only if p is valid and aligned and okay to be dropped, then the following would be sound too (also assuming you later don't read from the pointer anymore, etc.):

if okay_to_drop_p {
    drop_in_place(p);
}

This doesn't mean p always needs to be valid. If you ensure that drop_in_place is executed only in those cases where it is not UB, it's fine.
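As a minimal sketch of the guarded pattern above: the `Slot` and `Noisy` types and the `initialized` flag below are hypothetical names made up for illustration, not something from this thread. The flag plays the role of `okay_to_drop_p`, and `drop_in_place` is only reached when the slot really holds a live value.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::ptr;

// Hypothetical helper: counts how many times its destructor runs.
struct Noisy<'c>(&'c Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

// Hypothetical slot that may or may not hold an initialized value.
struct Slot<T> {
    value: MaybeUninit<T>,
    initialized: bool, // plays the role of `okay_to_drop_p`
}

impl<T> Slot<T> {
    fn empty() -> Self {
        Slot { value: MaybeUninit::uninit(), initialized: false }
    }

    fn full(v: T) -> Self {
        Slot { value: MaybeUninit::new(v), initialized: true }
    }
}

impl<T> Drop for Slot<T> {
    fn drop(&mut self) {
        if self.initialized {
            // SAFETY: `initialized` is only true when `value` holds a valid `T`
            // that has not been dropped or moved out.
            unsafe { ptr::drop_in_place(self.value.as_mut_ptr()) };
        }
        // When `initialized` is false the call is never made, so no UB occurs.
    }
}

// Drops one empty and one full slot and reports how many destructors ran.
fn count_drops() -> u32 {
    let counter = Cell::new(0);
    {
        let _empty: Slot<Noisy> = Slot::empty();
        let _full = Slot::full(Noisy(&counter));
    }
    counter.get()
}

fn main() {
    println!("destructors run: {}", count_drops());
}
```

Only the full slot runs a destructor, so this prints `destructors run: 1`. The fields are private, so safe code outside the module cannot set `initialized` to a wrong value; that is what would make a type like this sound as a whole.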


Interestingly, if you execute it in cases where you must not execute it (i.e. where the safety requirements are not met), the effects of the caused UB may appear before the if clause or anywhere else in the program. In this sense, the effects of UB may travel back in time.

But it's still sound to have drop_in_place(p) in your code, as long as it's only getting executed when its safety requirements are met.


Also note that if a call is UB and you make this call, it doesn't guarantee anything except that the behavior of your program is "undefined". Nothing else is or can be guaranteed anymore, so it's not even guaranteed you see an error. That's what makes UB so dangerous.


The following is sound too (but not safe, as it requires unsafe to use):

/// # Safety
///
/// Only call when blah and blah are met and everything goes fine.
unsafe fn foo(p: *mut str) {
    drop_in_place(p);
}

That is because the call that may cause UB cannot be triggered by safe Rust.

But this is unsound, no matter what:

fn badfoo(p: *mut str) {
    unsafe {
        drop_in_place(p);
    }
}

If you never call badfoo, your program doesn't cause UB, but badfoo is still unsound.


From the may_dangle RFC:

When used on a type, e.g. #[may_dangle] T, the programmer is asserting the only uses of values of that type will be to move or drop them. Thus, no fields will be accessed nor methods called on values of such a type (apart from any access performed by the destructor for the type when the values are dropped).

Which seems to be what you want, since the only thing you're doing with the &str is dropping it. Except that for lifetimes the RFC is a bit less clear (probably because generic types were the major use case):

When used on a lifetime, e.g. #[may_dangle] 'a, the programmer is asserting that no data behind a reference of lifetime 'a will be accessed by the destructor.

I think this needs a bit of clarification, but I would say it is safe to assume that the intention is for #[may_dangle] 'a to behave as if your &'a str was a T and you used #[may_dangle] T, so it should be fine.

The problem here is that T itself is UB if *T is UB.
I am almost persuaded, well, more thinking is needed.

So, there are two concepts: soundness and UB.
Soundness is a concept at the definition site, and UB is a concept at the call site.

  • A call to an unsound function that does not satisfy the safety requirements will cause undefined behavior, including unexpected behavior that people can nonetheless predict.

    An example of that is this: Rust Playground
    There, one can expect that nothing bad will happen. But drop is still unsound, because it does not prevent dangling from happening and yet declares itself to be a safe function.

  • A call to an unsound function that satisfies the safety requirements will not cause undefined behavior.

    Given there is only one unsound function definition, this means the unsound function will be compiled like a C function; i.e., the mere possibility of an unsatisfied requirement does not free the compiler to do anything it wants. It has to ensure that the cases where the safety requirements are satisfied run as expected, and that gives people the ability to predict; i.e., the unexpected behavior can be anticipated by people even when the requirement is not upheld.

That is too general; consider this particular code:

struct MyBox<'a> {
    ptr: std::ptr::NonNull<&'a str>,
}

unsafe impl<#[may_dangle] 'a> Drop for MyBox<'a> {
    fn drop(&mut self) {
        let p = self.ptr.as_ptr();
        unsafe {
            std::ptr::drop_in_place(p);
        }
        
        let layout = std::alloc::Layout::new::<&str>();
        unsafe {
            std::alloc::dealloc(p as *mut _, layout);
        }
    }
}

A **&&str access does not exist, and there is no way for a user to write such code. The scope is limited to fn drop(&mut self).

If there is something dangerous, it is in the body of the function. Then the question is: is there something dangerous in this function?

Without this context, there are surely unlimited possibilities to cause a bad result. However, I can argue that I obey all the safety rules strictly, except for this particular drop. Then the question comes back again: all the other problems you pointed out are gone, because I obey the safety rules.

The phrase

This is semantically equivalent to calling ptr::read and discarding the result

from the drop_in_place docs is interesting. Creating an invalid &-reference is UB even if you discard it.

@zylthinking1: I think what you wrote above isn't correct. Let me try to elaborate.

Yes, they are not the same.

Soundness is a property of code (regardless of whether you execute it or not). A piece of code is unsound if there exists a way to cause UB by invoking that code from safe Rust. The property of soundness doesn't depend on whether there actually exists some invocation like that or not. The mere possibility that you could trigger UB from safe Rust code makes something unsound.

Consequently, safe Rust code is always sound if it doesn't use any unsound, unsafe code (assuming there is no error in the theoretical foundations of Rust). This is because safe Rust can never trigger UB if there is no unsafe code used or if all unsafe code is sound.

Well, UB is something that can exist for a particular execution path of a program. Note, however, that:

  • It doesn't require a function call. Sometimes already creating a reference may be UB, see Playground and click on "Tools", "Miri".
  • If the execution path of a program contains UB, then the effects of this may be observed before the UB-causing statement is "executed". I.e. UB can travel back in time. For example, the compiler is allowed to remove a statement that happens before the operation that causes UB because the compiler is allowed to assume that UB never happens (or if it happens anything is allowed to happen) and can thus perform certain optimizations, such as dead-code elimination.

A call to an unsafe function may cause UB if the safety requirements are not met (edit: regardless of whether that unsafe function is sound or not). This includes unsound functions but also applies to sound (but unsafe) functions. Edit: A safe function doesn't need safety requirements.

That depends on how/why the particular function is unsound. Either way, it's at least unsound. But I think you are right it might not always trigger UB. Example:

// This function is unsound,
// even if the whole program never causes UB:
fn unsound(arg: *const i32) {
    println!("Got number {}", unsafe { *arg });
}

// This program doesn't cause UB:
fn main() {
    unsound(&17);
    //unsound(std::ptr::null()); // But this would cause UB!
}

(Playground) (You can verify with "Tools", "Miri" that the program won't run into UB, even though the unsound function is unsound.)

Output:

Got number 17

But you can also be wrong:

// SAFETY: Never pass a null pointer.
unsafe fn unsound(arg: *mut i32) {
    // This function causes UB!
    let ref1 = &mut *arg;
    let ref2 = &mut *arg;
    println!("Got number {ref1}.");
    *ref2 += 1;
    println!("Now we have number {ref2}.");
}

fn main() {
    // SAFETY: We don't pass a null pointer.
    unsafe { unsound(&mut 17) };
}

(Playground) (Playground)

Output:

Got number 17.
Now we have number 18.

In the last example, we have an unsound function with some safety requirements. The function is unsound, so it may cause UB (which it does, in this case). Even if we call the function with a pointer that is valid for writing, we end up with UB. And we don't even notice it on Playground. Running the program seems to go fine. But if you go to "Tools", "Miri", it will show you that it's in fact UB. Thus, with future versions of the compiler or in certain scenarios, the program may fail or do weird/dangerous stuff.


P.S.: I'm not sure if it's generally a good idea to speak of "sound"/"unsound" when dealing with unsafe code. I feel like distinguishing between sound and unsound code makes more sense when dealing with safe APIs.

For unsafe code, I would probably rather use the wording "may exhibit UB".

Nonetheless, even if your whole program never exhibits UB, it may still contain unsound functions, which are a bad thing to have and should be avoided (because future editors of your program might unexpectedly run into UB when using these functions wrongly).

This is even more important for libraries. While you could add a "safety requirements" comment to a safe function, this should never be done! Safe functions are not supposed to trigger UB under any circumstances (unless you violate safety requirements in some other unsafe part of your program).
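To illustrate, here is a hedged sketch of two ways the earlier `unsound(arg: *const i32)` example could be made sound. The names `show_ref` and `show_raw` are made up for this illustration: either the signature itself guarantees validity, or the function is marked `unsafe` so that the requirement becomes the caller's documented obligation.

```rust
// Option 1: a safe signature. A `&i32` is always valid to read,
// so no caller in safe Rust can trigger UB through this function.
fn show_ref(arg: &i32) -> String {
    format!("Got number {arg}")
}

// Option 2: keep the raw pointer, but mark the function `unsafe`
// and document what the caller has to guarantee.
/// # Safety
///
/// `arg` must be non-null, properly aligned, and point to an
/// initialized `i32` that is valid for reads.
unsafe fn show_raw(arg: *const i32) -> String {
    // SAFETY: the caller upholds the requirements documented above.
    let value = unsafe { *arg };
    format!("Got number {value}")
}

fn main() {
    let n = 17;
    println!("{}", show_ref(&n));
    // SAFETY: `&n` coerces to a valid, aligned pointer to a live `i32`.
    println!("{}", unsafe { show_raw(&n) });
}
```

In both variants, a null pointer can no longer reach the dereference from safe code: option 1 rules it out by type, and option 2 requires the caller to write `unsafe` and take responsibility.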

My fault: I used my own private definitions for the terms definition and call.

  • The term definition in above post means: something defines how the calculation should be done.
  • The term call in above post means: passes the value and does the calculation.

for

fn main() {
    let p: *const i32 = std::ptr::null();
    let _x: &i32 = unsafe { &*p };
}

The definition is

 &*p 

The call is

 let _x: &i32 = unsafe { &*  std::ptr::null() };

Then, let's consider the definition site: &*p is unsound, while

/// Safety: p should be valid for any means
unsafe { &*p }

is sound.

The above is just me expressing myself confusingly, and it is not what I was focusing on.

That is what I was focusing on.
I agree with what you said above, with the addition that the compiler can't always see enough to know that the code it compiles contains UB.

Take this code, for example:

    fn drop(&mut self) {
        let p = self.ptr.as_ptr();
        unsafe {
            std::ptr::drop_in_place(p);
        }
        
        let layout = std::alloc::Layout::new::<&str>();
        unsafe {
            std::alloc::dealloc(p as *mut _, layout);
        }
    }

If drop is inlined, the compiler may be able to tell that there is UB in the code below (using more complicated logic to check, which is a waste of time and CPU most of the time):

fn main() {
    let _foo;
    let s = String::from("123");
    _foo = MyBox::new(&*s);
}

But if drop is compiled to an independent function, then the compiler can't assume that: it has to compile the code so that, if no dangling pointer is passed to it, it runs as expected.

Given that, what happens when a dangling pointer is passed to the same function? Will the same CPU instructions cause UB this time, or can I predict that nothing bad happens? It depends on how the particular code is written and what CPU instructions the compiler compiles it into.

I prefer the latter, because it is relatively reasonable for this particular code.

A piece of advice: if you stop overloading/misusing established terms, your chance of getting positive feedback will be much higher.

It is in fact my English problem... I want to express myself more clearly too :rofl:

If this is actually the case then Vec and other stdlib types would also be unsound.


What?

I haven't read through all the meta discussion regarding cultural and philosophical aspects of UB, but last time I checked, the question of whether invalid references (and what kind of invalidity) cause immediate UB was still somewhat undecided. That does indicate one is usually better off cautiously assuming stricter rules, but it's maybe a bit oversimplified to claim "this is 100% definite UB" for many things around invalid references.

And on the specific question of calling std::ptr::drop_in_place on a *mut &T, where the *mut points to valid memory containing a &T whose lifetime has expired: given the precedent of many #[may_dangle] usages in the standard library, my take would be a clear "yes, that's fine, you may do that", though off the top of my head I'm not sure what precise source to point to in order to validate this claim.


The Rustonomicon mentions it in a small tangent in the section on PhantomData:

It's a scary feature, though.

Vec's Drop is implemented as:

unsafe impl<#[may_dangle] T, A: Allocator> Drop for Vec<T, A> {
    fn drop(&mut self) {
        unsafe {
            ptr::drop_in_place(ptr::slice_from_raw_parts_mut(self.as_mut_ptr(), self.len))
        }
        // RawVec handles deallocation
    }
}

Notice how the T may dangle, thus the ptr::drop_in_place may drop dangling references.

I was wondering whether this is actually true, and it looks like Miri disagrees with this! It finds UB when ptr::read is used but not with ptr::drop_in_place. I opened "MIRI disagree with the documentation of `ptr::drop_in_place` · Issue #112015 · rust-lang/rust · GitHub" for this.


Also I think that most of this discussion can easily be mooted: I don't think a reference type has drop glue.

            dbg!(core::mem::needs_drop::<T>()); // output: false

fn main() {
...
    let _b = MyBox(MaybeUninit::new(&mut s));

So OP can try this instead:

use std::alloc::{dealloc, Layout};
use std::ptr::drop_in_place;

// `#[may_dangle]` is unstable (nightly feature `dropck_eyepatch`).
unsafe impl<#[may_dangle] 'a> Drop for MyBox<'a> {
    fn drop(&mut self) {
        if std::mem::needs_drop::<&str>() /* always false */ {
            // SAFETY: Never executed.
            unsafe { drop_in_place(self.ptr.as_ptr()); }
            unreachable!("contents must not have drop glue");
        }
        let layout = Layout::new::<&str>();
        // SAFETY: We allocated the memory with a matching layout.
        unsafe { dealloc(self.ptr.as_ptr() as *mut _, layout); }
    }
}

needs_drop is allowed to spuriously return true for all types, and you're not intended to do a needs_drop guard around a single drop_in_place call (drop_in_place internally does the equivalent checks).
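For reference, a small sketch of what `needs_drop` reports for the two types in question. This only shows what current toolchains return; as noted above, the function is a hint and is documented as allowed to be conservative, so it should not be relied on as a soundness guard.

```rust
use std::mem::needs_drop;

fn main() {
    // A shared reference has no drop glue; current toolchains report `false` here.
    println!("needs_drop::<&str>()   = {}", needs_drop::<&str>());
    // `String` owns a heap buffer, so dropping it matters.
    println!("needs_drop::<String>() = {}", needs_drop::<String>());
}
```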

The real answer here is that #[may_dangle] is underdocumented magic which is more specific than and overrides that of drop_in_place when used. #[may_dangle] states that the only thing valid to do for the type parameter it is applied to is to drop_in_place it, and has essentially no documentation on what (if anything) it means when applied to a lifetime parameter.

