Undefined behaviour after unsafe enum usage


#1

Hello everyone,

I started learning Rust last week and, in the process, I am writing a few small programs so that I get a better feel for the language. One of the test applications I wrote started crashing with Illegal instruction after changing the input. I managed to isolate the issue as being caused by a pattern match on a enum with an invalid value.

The enum value is the result of some unsafe code in one of the crates I am using.
I was unpleasantly surprised by this crash as it happens in my code, which doesn’t have any unsafe blocks, long after calling the external API, with no other error information than Illegal instruction.
After looking for more information, I saw that generating invalid enum values results in undefined behaviour, so anything can happen at that point.

What I wonder is why isn’t the compiler doing more to prevent this kind of crashes from happening silently and so late after the fact?
After each unsafe block that generates enum values, those values could be check for validity and then a panic can occur with a nice error informing about the cause. Or at least all match blocks without a default handler should panic when an invalid value is supplied.

Is there a way I can check an unsafe enum value is valid manually?

Thanks.


#2

If you aren’t using any unsafe, then there is a bug in the dependency you are using. Could you please show us the code?


#3

Not exactly; at that point it is already too late because the invalid value (for that type) has already been created. In a way, the illegal instruction that was tripped was a “safety” check that did catch this for you, even if it seems violent. It doesn’t prevent other ill effects to have occured due to the invalid value before that point, though.

That it’s hard to use enums correctly in ffi is certainly something that known and there is work ongoing to improve that.


#4

There is a bug in the sense that the crate uses unsafe code to cast an integer from an external source to an enum that isn’t fully defined in Rust code. I already made a PR to add the missing fields.

My question is more towards why isn’t the compiler crashing the code right after the unsafe code finishes executing? That would massively help in finding the error faster and making the program more deterministic.


#5

Its not possible to automaticity generate check code for all data generated in unsafe to offer a guarantee it is safe after leaving the block. Code in most cases would be needless performance killer.

You can cast C style enums with as i32;


#6

I have trouble seeing why that would not be possible. The compiler knows all the possible legal values for the variable so it can check the validity of the result.
Regarding the performance: Rust already prioritizes safety in certain conditions (for ex. array bounds checking). If one wishes to avoid the check, they can produce u32 types instead of an enum.


#7

unsafe exists specifically to tell the compiler “let me do this; trust me”. If they’re unsafe-casting a foreign number to an enum, that’s a bug; either they need to expose the enum and take it directly, or they need to check the number before casting.


#8

It is more the case of the compiler knowing the variables to check, unsafe can easily operate on memory that the compiler has no knowledge of association to a variable.


#9

I understand that and I wasn’t proposing otherwise. One should be able to do anything they want in an unsafe block, but they also need to leave the final program state valid.
My suggestion was that, once that the unsafe block terminates and the execution transitions back into the safe world, some validations should be done, as much as it is possible, in order to limit undefined behaviour.
For a language that promotes safety, that seems reasonable.


#10

It’s conceivable that the compiler could do more UB checks in, say, debug builds (similar to how it traps signed overflow). Maybe once the unsafety/UB story is formalized, some additional tooling can be built (either in the compiler, a UB sanitizer or something else).

But as others have said, you’re on your own when you wander off into unsafe territory today. Why isn’t that code in question using a custom conversion function to translate from an int to an enum variant? As @bluss mentioned, enums in an FFI are a footgun at the moment.


#11

What you were hitting here, “illegal instruction” is possibly something just like that, that you will only see in debug builds, you can reproduce it in the playpen and looking at the generated code for a match: (playground link)

The debug version of the code has a fallback if none of the match arms match (that you can see as unreachable, !dbg !21 in the llvm IR and ud2 in the asm; you’ll see “illegal instruction” if the ud2 instruction is reached). This is normally removed when optimizing.

@bcristi It’s absolutely a good idea to find a pragmatic way to check this better, but the usual constraint is that zero-overhead unsafe code should be possible.


#12

There is no such thing as a safe world. The meaning of unsafe is far more subtle than that.

The only possible take-home message I would have from such an experience is that this crate seriously needs to be audited. Taking an arbitrary integer from an external source and converting it into an enum is something you do. not. do, as is handing out a *mut to an enum. Forget the the compiler, I’d be after these guys with pitchforks!
(uh… constructively, of course!)


Checking invariants upon exit from an unsafe block is nonsense. Behavior considered undefined for unsafe code may describe circumstances that extend arbitrarily far into the past or the future, and into arbitrary safe code (or at least, arbitrary as far as any local analysis is concerned).

Here are two equally legitimate unsafe abstractions. At first glance they may seem equivalent, as both have the exact same methods and implementation. But the difference in placement of the unsafe keyword makes them ultimately very different.

The first is not too different from a raw pointer; here, a Ptr is constructed safely, and dereferenced unsafely. The caller of as_ref must use an unsafe block to declare full responsibility for all Undefined Behavior concerning the interaction between as_ref and new:

pub struct Ptr(*const i8);

impl Ptr {
    pub fn new(p: *const i8) { Ptr(p) }

    /// # Unsafety
    /// Behavior is undefined if the pointer given to `new` did not
    /// point to an `i8`, or that pointer is not valid for the implicit
    /// lifetime in this call.
    pub unsafe fn as_ref(&self) -> &i8 {  p.0.as_ref().unwrap() }
}

In the second, it is instead the caller of new who must assume responsibility for Undefined Behavior concerning this interaction. Clearly, one must be very careful about where this Ptr is stored and how long it lives! Such an object clearly cannot be safely communicated beyond the crate API boundary.

But just because it’s unusual does not mean it is illegitimate. Unsafety comes in all sorts of flavors.

pub struct Ptr(*const i8);

impl Ptr {
    /// # Unsafety
    /// Behavior is undefined if `p` does not point to an `i8`.
    ///
    /// Further, you must guarantee that all possible future calls to
    /// `as_ref()` only produce lifetimes which are valid for that `i8`,
    /// or else the behavior of this method is undefined.
    pub unsafe fn new(p: *const i8) -> Self { Ptr(p) }

    pub fn as_ref(&self) -> &i8 { unsafe { p.0.as_ref() }.unwrap() }
}

You see, unsafe is not a door into a whole new world, nor do you ever leave unsafety.

It’s just a contract. All unsafe says is that you accept full responsibility for your actions.

You know, the usual stuff.


#13

The fact of the matter is that the compiler has enough knowledge to (theoretically) verify bogus discriminant values placed into an enum - I don’t see this much different from trapping on signed overflow, as mentioned upthread. If anything, the enum case is always UB whereas signed overflow may be a sign of a bug but is otherwise defined. Comparing this to raw ptr wrappers is a bit disingenuous since compiler can’t reason about your intentions there (you may want a null) and it can’t know about the lifetime of it. I understand that it’s trying to illustrate a point about what unsafety means but I think the focus of this thread has really been around the enum case and although some statements about unsafety sound more general than that, I think they’re really talking about enums.


#14

The compiler already does as much as it reasonably can for enums.

  • The code in question gives a *mut Enum to a C FFI code, which populates it. There is no point at which Rust has the chance to prevent this value from being written. It has no idea where that *mut Enum is going or what is being done to it.
  • As mentioned by @bluss, when the Enum gets matched in debug mode, it traps on an invalid discriminant.

The author is saying the compiler should check on leaving the unsafe block. That is what I’m calling nonsense.


#15

@vitalyd is right and I was referring to enums only, as that seems doable.
I wasn’t talking about enforcing general memory safety and I wasn’t suggesting that generating the invalid value should be prevented, I was suggesting that at the end of the unsafe block, touched enums should be validated.

I find the trap on invalid value in debug mode has very poor usability: there is no file/line information and no mention on what the invalid value is. And good luck debugging that on the release build.

But I also admit that I don’t know enough about Rust yet, so I won’t push my point further.


#16

Yes, @bluss’s point is taken about the compiler leaving a trap on matches. The discussion is whether better UX can be made. It’s possible that the trap is enough of a breadcrumb for someone to piece things together but it certainly won’t be obvious in a large enough context. As he also noted, the constraint is zero-cost FFI. If that can be relaxed for some modes (debug, a sanitizer, etc) then I suspect something better might be achievable.

And there’s absolutely nothing wrong with asking if the compiler can assist better with detecting this. Picking on a sentence to call it nonsense, when I think it’s quite clear what the real intent is, is uncalled for and doesn’t advance the discussion in any useful manner.


#17

In all fairness I might just be expressing pent-up frustration at the fact that unsafe { } is not required in an unsafe fn, which I feel is directly connected to this misconception that there’s anything special about leaving an unsafe block.

(and it now even triggers a lint! All this does is make it harder to document the conditions for UB…)


#18

Isn’t the primary issue that arose in this thread simply a request for tooling that would help validate an enum value received from FFI, determining whether the received value was or was not the representation of a defined member of the enum set?

That validation check would not be zero-cost; its implementation inherently would require run-time processing. The purpose of that validation check on return from FFI is simply to limit the propagation of UB into the following code.

If the set of valid enum values was a dense subset of uN (i.e., u8, u16, …), that validation could compile into an inline check against the lower and upper bounds of the representation, with LLVM removing any instruction subtracting a zero lower bound. If it was a sparse subset of u8, the preceding bounds check could be augmented with an index into an enum-type-specific bit-packed boolean array representing all the valid enum values.

If the set of valid enum values was a sparse subset of u16 or larger, the check could compile into a call to a non-inline binary search of a sorted array of all the legitimate enum values.