Examples of undefined behaviour in Rust


#1

I’m rewriting a large chunk of my FFI guide and at the end I’m including a “fun” chapter on all the ways you can break stuff when interoperating between languages.

The exercise will work as follows: each problem contains the source code for a small program which deliberately does something horribly wrong, incorrect, or dangerous (memory safety, data races, undefined behaviour, that sort of thing). It’s then your job to figure out what the issue is and why it could end up hurting your application.

I’m looking for as many examples of undefined behaviour, memory issues, and other common mistakes as possible so we can pose them as problems and discuss what is happening and why it may end unpleasantly. If you can think of any good examples of this, please post them here :slight_smile:

As an example, one of my “problems” is where a Rust function called by C++ panics. Then in the solutions we’ll go through saying why panicking across the FFI boundary is UB and ways you can prevent it.
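One way to keep a panic from crossing the boundary is to catch it at the exported function and turn it into an error code. The following is only a minimal sketch: `risky_divide` and `safe_divide` are hypothetical names made up for illustration, and the 0 / -1 return convention is an assumption rather than anything from the guide.

```rust
use std::panic;

/// Hypothetical function that may panic (stands in for real application logic).
fn risky_divide(a: i32, b: i32) -> i32 {
    if b == 0 {
        panic!("division by zero");
    }
    a / b
}

/// Hypothetical exported function. Returns 0 on success and -1 on failure
/// (null `out` pointer or a caught panic). `out` must otherwise point to
/// writable memory for one i32.
#[no_mangle]
pub extern "C" fn safe_divide(a: i32, b: i32, out: *mut i32) -> i32 {
    if out.is_null() {
        return -1;
    }
    // catch_unwind stops the panic here instead of letting it unwind
    // into the foreign caller.
    match panic::catch_unwind(|| risky_divide(a, b)) {
        Ok(value) => {
            unsafe { *out = value };
            0
        }
        Err(_) => -1,
    }
}

fn main() {
    let mut result = 0;
    assert_eq!(safe_divide(10, 2, &mut result), 0);
    assert_eq!(result, 5);
    // The panic is caught and reported as an error code.
    assert_eq!(safe_divide(1, 0, &mut result), -1);
}
```

The panic message is still printed to stderr by the default panic hook; only the unwinding is stopped.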

(Rendered Guide)


#2

Using a Rust C-like enum and receiving a value that is not valid for that enum.

For example:

#[repr(C)]
enum Foo { A = 1, B = 2, C = 3 }

If we receive the value 4 in this type through FFI, that is UB. Foo can’t represent values other than Foo::{A, B, C}.
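A common way to sidestep this is to accept a plain integer at the boundary and convert it with a checked match. This is a rough sketch, assuming the foreign side passes the value as an `i32`; the `TryFrom` impl and its error type are illustrative choices, not part of the original example.

```rust
use std::convert::TryFrom;

#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(C)]
enum Foo {
    A = 1,
    B = 2,
    C = 3,
}

impl TryFrom<i32> for Foo {
    // On failure, hand back the raw value that could not be converted.
    type Error = i32;

    fn try_from(raw: i32) -> Result<Self, Self::Error> {
        match raw {
            1 => Ok(Foo::A),
            2 => Ok(Foo::B),
            3 => Ok(Foo::C),
            other => Err(other),
        }
    }
}

fn main() {
    assert_eq!(Foo::try_from(2), Ok(Foo::B));
    // 4 is rejected instead of becoming an invalid Foo.
    assert_eq!(Foo::try_from(4), Err(4));
}
```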


#3

I just tried that on the playground and it looks like rustc translates the match expression generated by Debug into a chain of if ... else if ... else statements, so an invalid discriminant falls through to the last branch.

use std::mem;

#[derive(Debug, Copy, Clone)]
#[repr(C)]
enum Thing {
    B = 1,
    C = 2,
    D = 3,
}

fn main() {
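    // UB: neither 0 nor 4 is a valid discriminant for Thing.
    let zero: Thing = unsafe { mem::transmute(0i32) };
    let four: Thing = unsafe { mem::transmute(4i32) };

    // Observed on the playground: "D, D". Nothing guarantees this output.
    println!("{:?}, {:?}", zero, four);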
}


#4

You could try using https://doc.rust-lang.org/reference/behavior-considered-undefined.html as a checklist


#5

It depends. I’ve also seen cases where an unreachable intrinsic is executed (UD2 on x86). That is of course the point of UB: anything might happen.


#6

For the time being at least, you cannot in any way rely on any behaviour here. UB means UB.


#7

Well, does your foreign API initialize anything by reference that needs to implement Drop?

Then you could use std::mem::uninitialized to “initialize” a variable which the C function is supposed to initialize, and then neglect to std::mem::forget the value when the C initialization function fails, so Rust ends up dropping an uninitialized value. That’s fun.
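For contrast, here is a sketch of the same pattern written with `std::mem::MaybeUninit`, which never runs `Drop` on its contents, so a failed initialization cannot drop garbage. `Handle`, `handle_init`, and `create_handle` are hypothetical names, and `handle_init` is a Rust stand-in for whatever foreign initializer you would really be calling.

```rust
use std::mem::MaybeUninit;

/// Hypothetical resource handle whose Drop impl must only see initialized data.
#[repr(C)]
struct Handle {
    data: *mut i32,
}

impl Drop for Handle {
    fn drop(&mut self) {
        // Freeing an uninitialized pointer here would be UB.
        unsafe { drop(Box::from_raw(self.data)) };
    }
}

/// Stand-in for a foreign initializer: returns 0 and fills `out` on success,
/// returns -1 and leaves `out` untouched on failure.
extern "C" fn handle_init(out: *mut Handle, value: i32) -> i32 {
    if value < 0 {
        return -1;
    }
    unsafe { out.write(Handle { data: Box::into_raw(Box::new(value)) }) };
    0
}

fn create_handle(value: i32) -> Option<Handle> {
    let mut slot = MaybeUninit::<Handle>::uninit();
    if handle_init(slot.as_mut_ptr(), value) == 0 {
        // Only sound because the initializer reported success.
        Some(unsafe { slot.assume_init() })
    } else {
        // `slot` goes out of scope without running Handle's destructor.
        None
    }
}

fn main() {
    let handle = create_handle(7).expect("init should succeed");
    assert_eq!(unsafe { *handle.data }, 7);
    assert!(create_handle(-1).is_none());
}
```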


#8

To be fair, the same problem would happen in C++. In fact, all of these would still be bugs in C++.


#9

That’s part of what I’m looking for. A lot of these kinds of undefined behaviour are going to be UB in any language; it’s just that Rust programmers are used to not having to worry about them because the borrow checker wouldn’t let the code compile. That means we might accidentally do something that is UB, and it’ll still compile because we’re using unsafe or happen to be writing C++ at the time.

I’ve also included examples where you forget #[no_mangle] or use the wrong calling convention. Forgetting #[no_mangle] isn’t UB (you just get a linker error), but calling a function with the wrong calling convention is, and it’s worth showing what each of them looks like in practice.
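For reference, the correct version of such an export might look like the sketch below. `add_numbers` is a hypothetical function name; the point is only that `#[no_mangle]` keeps the symbol name predictable for the linker and `extern "C"` fixes the calling convention.

```rust
/// Hypothetical export. A C caller could declare it as
/// `int32_t add_numbers(int32_t a, int32_t b);`.
/// Without #[no_mangle] the symbol gets a mangled name and C fails to link;
/// without `extern "C"` the function would use Rust's unspecified ABI.
#[no_mangle]
pub extern "C" fn add_numbers(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    // The function pointer type records the calling convention, so the
    // compiler rejects assigning a function with a different ABI.
    let callback: extern "C" fn(i32, i32) -> i32 = add_numbers;
    assert_eq!(callback(2, 3), 5);
}
```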

I also want to show that the “unicode” strings (actually UTF-16) you get from the winapi aren’t compatible with a normal C-style char * or Rust’s &str. The problem is winapi is (obviously) a Windows-only thing, and I want this guide to be as cross-platform as possible.
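The encoding mismatch itself can be shown without touching any Windows API, using only the standard library. This is a small sketch; `to_wide` and `from_wide` are hypothetical helper names, and the null terminator convention is assumed to match what wide-string APIs expect.

```rust
/// Encodes a &str as null-terminated UTF-16 code units.
fn to_wide(s: &str) -> Vec<u16> {
    s.encode_utf16().chain(std::iter::once(0)).collect()
}

/// Decodes UTF-16 up to the first null terminator (or the end of the buffer).
/// Returns None if the data is not valid UTF-16.
fn from_wide(buf: &[u16]) -> Option<String> {
    let len = buf.iter().position(|&c| c == 0).unwrap_or(buf.len());
    String::from_utf16(&buf[..len]).ok()
}

fn main() {
    let wide = to_wide("héllo");
    // Five code units plus the terminator.
    assert_eq!(wide.len(), 6);

    // 'h' is stored as two bytes; the zero byte would end the string early
    // for any code treating the buffer as a C-style char *.
    assert_eq!(wide[0].to_le_bytes(), [0x68, 0x00]);

    assert_eq!(from_wide(&wide), Some("héllo".to_string()));
}
```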


#10

I just created an issue on the FFI guide GitHub repo for brainstorming different FFI footguns. If you have any suggestions, make a comment over there and we can add it to the list :slight_smile: