Lifetimes of references exposed via FFI: is this UB?

Context: I'm exposing some Rust objects via FFI to other languages, in the form
of opaque pointers and functions that act on pointers to those objects. These
objects are freed by calling an FFI function (my_thing_free(thing_ptr)) on
each of them.

While doing some performance optimization, I found an unnecessary clone in a hot
path, and I'm trying to remove it. In order to do this, I introduced a reference
from one type to another which didn't exist before, and I'm now considering
exactly what happens if client code uses those pointers incorrectly.

In particular, what if the "outer world" code calls the _free() functions in the
wrong order? I.e. what it first frees the referenced object, and then it frees
the object that contains the reference?

Concretely, this is the situation after I replaced the clone with a reference:

struct InternalDetails(String);

struct MyThing<'a> {
    _details: &'a InternalDetails,
}

struct MyCollection {
    details: InternalDetails,
}

impl MyCollection {
    fn create_thing(&self) -> MyThing {
        MyThing {
            _details: &self.details,
        }
    }
}

These are exposed via various FFI functions, but let's only consider the _free()
functions here:

#[no_mangle]
extern "C" fn my_thing_free(thing: *mut MyThing) {
    drop(unsafe { Box::from_raw(thing) })
}

#[no_mangle]
extern "C" fn my_collection_free(collection: *mut MyCollection) {
    drop(unsafe { Box::from_raw(collection) })
}

This is some rust code trying to simulate what the incorrect outer world code
would do using the ffi functions:

fn main() {
    let collection = Box::new(MyCollection {
        details: InternalDetails("foo".to_string()),
    });

    // a pointer is created and handed off to the outer world
    // via FFI
    let collection_ptr = Box::into_raw(collection);

    // the outer world uses other FFI functions to create a thing from the collection
    let thing_ptr = {
        let collection = unsafe { collection_ptr.as_ref().unwrap() };
        Box::into_raw(Box::new(collection.create_thing()))
    };

    // C code does things with ptrs...
    //
    // Question is: what if the client code calls the two free functions
    // in the wrong order? Is it UB?
    my_collection_free(collection_ptr);
    my_thing_free(thing_ptr);
}

If the outer world code calls my_collection_free() first and then my_thing_free(),
the Thing would contain an invalid rust reference, from the moment it is
re-interpreted as a rust value (unsafe { Box::from_raw(thing) }) to the moment
when it's dropped.

At Behavior considered undefined - The Rust Reference I read
that it's UB to "produce" an invalid value, even in private fields and locals,
where "producing" means "a value is assigned to or read from a place, passed to
a function/primitive operation or returned from a function/primitive operation".

If I run the above code through MIRI
(Rust Playground)
it doesn't complain.

Am I right in my understanding that this is not UB, since dropping does not
match the rules for "producing" above?

Of course if I did anything that involves that dandling reference (for example
printing on drop) it would be UB, no doubts about that. But what about just
dropping the value?

Since I control the bindings in the other languages, I could enforce drop order
by keeping references in each object to the object from which it was created, if
there is a lifetime relationship on the rust side. But I would prefer to avoid
this extra (otherwise unused) reference, if this isn't hitting UB.

Any idea? Please do correct me if my reasoning is wrong.

Thanks!

Miri has no false positives, only false negatives. Therefore if Miri complains, that is proof of undefined behavior, but if it doesn't complain, that is not proof of correctness.

That said, your code does result in UB. A reference is always assumed to point to a valid value, therefore the mere existence of a reference that points to a deallocated value is UB.

I believe it does. Dropping a value necessarily conceptually reads it. But even if it weren't the case, the result of your Box::from_raw() still contains an invalid reference, i.e. the call most definitely produces an invalid value.


There are at least two ways you can avoid this:

  1. Introduce lifetime parameters in your types and functions (wherever relevant) that ensure that your wrapped type always outlives the wrapper. This results in slightly annoying code; for instance, this causes a well-known ergonomics "bug" that prevents transactions from being stored along with their originating connection in Rusqlite (the SQLite3 Rust bindings).
  2. Use reference counting to hold on to the wrapped value inside the wrapper; this will automatically ensure that the wrapped value lives at least as long as the wrapper. You'll have to be very careful as to not to expose any inappropriate raw pointers to the wrapped value.
3 Likes

If you want to do something like this, then I would recommend using a raw pointer instead. Those are allowed to be dangling, so it would definitely be okay in that case.

3 Likes

I am curious what kind of the harm an UB like turning a invalid address to a reference.

It is surely in the black box of compiling, while, if I get an idea of the detail, it helps a lot for me to avoid such UBs.

In my understanding, I can't image what the UB leads:

fn main() {
    let a = {
        let n = 1i32;
        unsafe { &n as *const i32 }
    };
    let r = unsafe { &*a };
}

Yes, maybe &*a may access the address which maybe freed, so what?

The a is an well aligned (integer) stack address. It is surely can be loaded safely, even it's content is unexpected. In fact I don't think it will generated MEM LOAD ASM, it is not necessary to load it.

Then why it is a UB even r is not used, or what happens if I keep the UB?

OK, maybe the compiler may notice this and lazily generates unexpected ASM code which does not effectively reflect the rust code, but I don't think the compiler can generate bad ASM for code like this:

fn main() {
    let a = func_in_another_crate_and_dynamic_linked();
    let r = unsafe { &*a };
}

fn func_in_another_crate_and_dynamic_linked() -> *const i32 {
        let n = 1i32;
        unsafe { &n as *const i32 }
}

The compiler does not know value returned from func_in_another_crate_and_dynamic_linked is bad, because the func locates in a dynamic library downloaded from the internet.

Ok, what it can do?

In principle, in func_in_another_crate_and_dynamic_linked(), the compiler could place n in a newly mapped page, then unmap the page right before the function returns. This would result in a segmentation fault. That's why we declare these operations as UB, so that the compiler has the freedom to modify its implementation without breaking unsafe code that depends on its current behavior.

@zylthinking1 Your question seems off-topic for this thread. I recommend starting a new thread instead of hijacking an existing one.

1 Like

Thank you. This is what I thought too, then saw Miri not complaining and I hoped I could avoid the complexity of handling this :slight_smile:

On the Rust side, I will keep the lifetime parameters of course. In the languages where this is actually used (Swift and Java), I will add a reference from "Thing" to the "Collection" it was created from, to ensure that a "Thing" being accidentally retained doesn't cause it to be deallocated (thus calling the FFI free() functions) in the wrong order.

Not quite. Stacked borrows, for example, doesn't necessarily represent what's actually UB today (at least not yet anyway), and miri is perfectly happy to flag memory leaks, which aren't UB at all, and sometimes are even the intended behavior. That being said, anything miri flags is probably a bug and should be fixed if possible.

It's undefined behavior, by definition it could do literally anything. If your program exhibits UB, summoning Cthulu would be perfectly valid behavior.

That being said, what it's likely to do is a different question. In the provided case, probably either a segfault, reading the correct value, or silently reading the wrong value out of some other structure's memory.

Thank you. My hope is that at some point we can "expand" the boundary between this library and the rest of the code base, and then these function will be called from Rust code directly. So I'll keep using Rust references with all their lifetime guarantees :slight_smile:

In the caller code (which is in Swift and Java), I will add a reference from each Thing to the Collection it was created from, to ensure that the deallocation order is the correct one.

Only mem load will cause segment fault.
However, translate a pointer to reference does not in fact involve a mem load

I disassemble the generated binary, showing

> 0x000055555555b960 <+0>:	sub    $0x4,%rsp
   0x000055555555b964 <+4>:	movl   $0x1,(%rsp)
   0x000055555555b96b <+11>:	add    $0x4,%rsp
   0x000055555555b96f <+15>:	retq   

So, there is no memory loading. So

It is a UB even the reference is not used at all

means nothing to me in this example. It is an UB, but it does nothing --- and I feel comfortable, I get to know what it does and what it doesn't, then I am not afraid of the unknown fate.

Surely, it possibly a cause of bad things. But, maybe more explanation is needed, the current saying it is UB and you don't need detail of UB, only keep away from UB, give me nothing except making me afraid of it ---- I don't know whether it may kill me by changing into a bomb.

Just like the old chinese adage: The more unclear the punishment rule is, the more powerful it becomes. The unclear UB is endless powerful.

UB is a language definitional construct. There's no such thing as "not-actually UB". If it's not defined (or explicitly defined to be erroneous), then it is UB by definition; what the compiler happens to do today doesn't influence whether it's defined or undefined in the abstract. The compiler is an implementation, not a formal model; Stacked Borrows is a formal model, and as such, what it defines and doesn't define is crystal clear. Now whether the compiler actually uses all of it (it doesn't, currently) is an entirely different question.

Of course – but I'm pretty sure Miri doesn't say that memory leaks are UB. My point is that what Miri itself flags as UB is UB.

What leads to a memory load or a segmentation fault has nothing to do with what is considered UB by the language. UB isn't a synonym for "crash".

1 Like

Sorry for this.

I in fact just want to know what harmful thing the UB in the code leads:

extern "C" fn my_thing_free(thing: *mut MyThing) {
    drop(unsafe { Box::from_raw(thing) })
}

Yes, I know it is UB, and UB can do anything unexpected, but I don't know what unexpected thing rises in this particular example, which make me, as previous post, frightened.

I don't see any UB in this snippet, unless you're calling it with invalid pointers or something like that.

1 Like

Yep, stack borrows is a formal memory model, but it's not officially the formal model. Currently, rust doesn't have one. What's UB in stacked borrows is only probably UB in rust. To quote the too many lists page on stacked borrows:

NARRATOR: Stacked borrows are still "experimental" as a semantic model for Rust, so breaking these rules may not actually mean your program is "wrong". But unless you literally work on the compiler, you should just fix your program when miri complains. Better safe than sorry when it comes to Undefined Behaviour.

This may change if and when stacked borrows does become rust's official memory model, but for now, a stacked borrows violation that still follows llvm's scoped noalias rules and mutability rules is still a technically correct but absolutely terrible idea.

The point of UB is that you don't know, because you can't know. You should be scared of UB, because when it happens, literally anything could go wrong. It's the cause of a significant portion of the world's security vulnerabilities.

However, the example you've provided is just fine, that's the idiomatic way to handle freeing objects when using FFI. All you have to do is clearly document that your function is freeing the provided pointer, so callers should ensure they pass a valid pointer given by my_thing_new (or whatever). When it comes to creating FFI APIs, all you can do is try to create the least surprising API possible, document it clearly, and then cross your fingers and hope the caller doesn't do anything too stupid.

When it comes to discussing miri errors and stacked borrows, we have to separate things into three categories:

  1. This is definitely ok.
  2. This is definitely not ok.
  3. We are still trying to decide whether this is ok.

Miri never reports UB for anything in the first category (or at least its a bug in miri if it does), but it will sometimes report UB for things in the third category.

That said, using things in the third category is pretty sketchy regardless, since it might become "definitely not ok" in the future.

3 Likes

The main post:

Is there an UB? yes. Can the UB produce anything harmful, seems NO.

Every one talks about there is an UB, but no one tells what the UB leads to. what bad happens if keep the code just as it is.

I really want a detailed document explaining why UB is UB and how it does bad things, and more important, when it can do nothing bad although it is UB in principle.

The key thing is that if you enter a path that leads to UB, you lose the ability to meaningfully reason about your program. It could do anything and everything, no matter how small or inconsequential your UB seems.

In practice, it tends to lead to problems such as heisenbugs, segfaults, memory corruption, ace exploits, your business logic getting optimized away, time travel, and much more. An experienced programmer might be able to tell you what your erroneous code is likely to do, but they can only guess, not guarantee.

2 Likes

Sorry, I misunderstood because the lifetime is missing on the MyThing type.

Why is the snippet UB? If you free MyCollection first, then MyThing has a dangling reference and this is generally said to be UB. That said, references are only required to be valid until their last use, so the question is whether calling the destructor of MyThing counts as a use of the reference. If it does, then it is UB, otherwise it is not. I don't actually know whether it counts as a use, so its possible that it isn't actually UB.

Why are dangling references UB? Because Rust says so.

Why does Rust say so? Because Rust tells LLVM that the pointer is dereferencable to enable more optimizations. This means that the compiler is allow to insert a read from the pointer anywhere even if your original code doesn't have a read. This is sometimes useful for optimizations.

How can this be bad? Well, if the reference is dangling and the compiler inserts a read, then you probably get a segfault or other bad stuff.

What bad stuff happens in this particular example? Probably nothing. The optimizations I know of that make use of the dereferencable marker would not be applied to the particular code snippet we're talking about.

Why should I care about the UB then? Well, if there's a simple change you could make that makes your code be definitely ok, why would you not make that change? I very strongly prefer "definitely ok" over "sketchy but will probably work".

5 Likes

To give an example of how optimizations might introduce a read even if there isn't one, consider this function:

fn example(b: &u32, num: u32) -> i32 {
  let mut acc = 0;
  for _i in 0..num {
    acc += *b;
  }
  acc
}

If unused references being dangling is ok, then calling example(dangling_ptr, 0) would be ok.

However, the compiler probably wants to optimize this into the following:

fn example(b: &u32, num: u32) -> i32 {
  let b_val = *b;
  b_val * num
}

But now example(dangling_ptr, 0) suddenly does perform a read of the pointer, even though it didn't before. This could result in a segfault.

(Note that moving a reference counts as a use, so in Rust terms, the original example does have a use of b, and so it is required to be valid even if it isn't actually used by example.)

This kind of stuff is why Rust requires references to not be dangling even if you don't use them.

8 Likes