UB due to unreachable branches

Am I correct that this code

fn main() {
    let i: i32 = rand::random();
    if i > 0 {
        let mut x = 10;
        let p1 = &mut x;
        let p2 = unsafe { &mut *(p1 as *mut i32) };

        println!("{p1}, {p2}");
    }
}

has UB because the branch with the possible UB (two mutable refs at the same time) can be reached (i can be larger than 0) while this code

fn main() {
    let i: i32 = -1;
    if i > 0 {
        let mut x = 10;
        let p1 = &mut x;
        let p2 = unsafe { &mut *(p1 as *mut i32) };

        println!("{p1}, {p2}");
    }
}

has no UB because the branch with the possible UB is unreachable (i is always smaller than 0)?

If the compiler threw out the entire if i > 0 {...} block because it detected that it can never be executed, and it did this very early in the compilation process, then I think there might be no UB. But I don't see how this could be guaranteed. I wouldn't count on it, anyway.

But UB doesn't depend on compiler details. It's a specification, a rule.

Just running your snippets on the playground, Miri detects UB in your first example, but not in your second. As to whether we can generalize from this that dead branches can never cause UB or not, I don't know.

No, you cannot, because Miri can only test code that it actually executes. It's just a state machine that checks that the code in a single execution conforms to Stacked Borrows or Tree Borrows. For example, it also doesn't detect UB in the first example when i happens to be zero or negative, but the first example clearly has UB.

I was thinking that since unreachable_unchecked() is only UB when it can actually be reached, this holds true for all UB.
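
For reference, a minimal sketch of that pattern with std::hint::unreachable_unchecked. The function name parity_label is made up for illustration. The unsafe call sits in an arm that no execution can enter, so the function stays well-defined for every input:

```rust
use std::hint::unreachable_unchecked;

/// Hypothetical example: labels a number as "even" or "odd".
fn parity_label(n: u32) -> &'static str {
    match n % 2 {
        0 => "even",
        1 => "odd",
        // SAFETY: `n % 2` is always 0 or 1 for a `u32`, so no execution
        // ever enters this arm. Reaching `unreachable_unchecked()` would
        // be UB; merely having it in the source is not.
        _ => unsafe { unreachable_unchecked() },
    }
}
```

Calling parity_label(4) or parity_label(7) never touches the third arm, which is the whole point: the UB is tied to reaching the call, not to the call appearing in the program text.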

In the second example no references are taken, so the behavior is well defined. UB only happens if a reference is actually shared, not when it only looks like it might.

What? The first and the second example are the same except for the definition of i.

Unreachable branches can't cause UB in reachable code, no.

The question is a little ill-formed (as I understand it), because undefined behavior is precisely that set of things which, if they happen in code, allow the compiler to assume that code is unreachable. So as far as the compiler is concerned, "reachable undefined behavior" is inherently contradictory. It's because UB is unreachable by definition that it causes surprising compiler output, like deleting all your code or running things apparently out of order.

If the code with supposed undefined behavior is indeed unreachable, it has no undefined behavior because it has no behavior at all.

(This does imply that the compiler can assume the branch is never taken, so if there were any additional code in the if i > 0 block it could be removed (even before the println). But if there were additional code after the if, it cannot be removed, because the compiler is obligated to produce an executable with well-defined behavior if random returns a nonpositive number.)

I don't think so, because I called it possible UB, but I think that's a detail.

I don't think that is true. The code either has UB or not. It doesn't depend on the runtime.

My question was only if there is no UB if the offending lines are unreachable and your answer seems to be yes.

In the first example there is no UB in executions where the random number happens to be zero or negative. There is UB when it is positive.

It is true. There is no contradiction in saying "this program has undefined behavior when runtime conditions X, Y, and Z are satisfied", in fact, that's the vast majority of undefined behavior. Programs with conditional undefined behavior are still required to behave well when the conditions are not met.
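
A small sketch of what conditional UB can look like, using slice::get_unchecked from the standard library. The helper names read_unchecked and read_or_default are hypothetical. The first function has UB only when the runtime condition idx >= values.len() holds, and the second checks that condition so that no execution reaches it:

```rust
/// Hypothetical helper: reads `values[idx]` without a bounds check.
///
/// # Safety
/// `idx` must be less than `values.len()`; otherwise the read is UB.
unsafe fn read_unchecked(values: &[i32], idx: usize) -> i32 {
    // SAFETY: the caller promises `idx < values.len()`.
    unsafe { *values.get_unchecked(idx) }
}

/// Hypothetical safe wrapper: tests the runtime condition first and
/// falls back to 0 when the index is out of bounds.
fn read_or_default(values: &[i32], idx: usize) -> i32 {
    if idx < values.len() {
        // SAFETY: `idx < values.len()` was checked on the line above.
        unsafe { read_unchecked(values, idx) }
    } else {
        0
    }
}
```

The compiler still has to produce correct code for every call where the index is in bounds; only the out-of-bounds executions of read_unchecked are left without requirements.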

Does someone have a citation for this? I have never read something like that. But I must admit that I haven't read the exact opposite either, so I am unsure.

I don't think it's contradictory. The first example has reachable UB that will actually happen roughly 50% of the time.

The definition of UB is that when it happens, the behavior of the program is not defined. Anything can happen, no guarantees about what will happen.

What it implies in practice for the compiler is that it can assume it will never happen. Doesn't mean it will really never happen, it just means that when the assumption is wrong, the compiler has no responsibilities, so it doesn't need to worry about it. So might as well make that assumption.

Rust doesn't have a good specification for this in the language reference, but here is what the C++ standard says:

undefined behavior
behavior for which this document imposes no requirements

In other words: the language does not say what will happen when you take some action X that the language says has undefined behavior (e.g. create two aliasing mutable references). The behavior might be that your computer explodes or whatever; there is no requirement.

That just doesn't apply to cases when you don't do that bad thing. So in those cases regular language rules apply.

This is a terminology issue.

UB is a property of an execution. The existence of std::hint::assert_unchecked is proof that it's fine for UB to "exist" so long as it's not "reached". (Scare quotes because that is a fuzzy statement. It's more about the dominator (graph theory) relationships between the CFG nodes than about actually "hitting" the exact UB.)

I think what you want to say here is that the first code is not sound: some executions have the potential to reach UB, even though there are also some executions that are well-defined because they don't hit UB.

It's easier to talk about if you just make it a function instead:

fn this_is_unsound(i: i32) {
    if i > 0 {
        let mut x = 10;
        let p1 = &mut x;
        let p2 = unsafe { &mut *(p1 as *mut i32) };

        println!("{p1}, {p2}");
    }
}

It's possible to call it in ways that do not hit UB, but it's not sound because it's possible for safe code to call it in a way that does hit UB.
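
One way to sketch the difference, assuming you wanted to keep the body exactly as it is: declare the function unsafe and document the precondition, so the obligation moves to the caller. The names aliasing_if_positive and guarded are invented for this illustration:

```rust
/// Same body as above, but declared `unsafe` with a documented contract.
///
/// # Safety
/// Callers must pass `i <= 0`. For `i > 0` the body creates two live
/// `&mut` references to the same value, which is UB.
unsafe fn aliasing_if_positive(i: i32) {
    if i > 0 {
        let mut x = 10;
        let p1 = &mut x;
        // A second `&mut` to `x` while `p1` is still used below:
        // UB if this line is ever executed.
        let p2 = unsafe { &mut *(p1 as *mut i32) };

        println!("{p1}, {p2}");
    }
}

/// Hypothetical safe entry point: returns whether the call was forwarded.
fn guarded(i: i32) -> bool {
    if i <= 0 {
        // SAFETY: `i <= 0` was just checked, so the aliasing branch
        // is not entered.
        unsafe { aliasing_if_positive(i) };
        true
    } else {
        false
    }
}
```

With this shape, safe code can only go through guarded, which never lets an execution reach the aliasing branch, so the safe API is sound even though the offending lines are still in the program.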

Then every program is unsound because a neutrino can change any instruction to an illegal one. That's not a useful definition.

UB is a language-level construct, and the Rust abstract machine doesn't have random bit flips.

Do you know if this Google document on UB and soundness in Rust is generally considered accurate (or as accurate as one can get right now) by the Rust dev teams? Maybe that sort of consensus is too much to ask. I'm just looking for something that is mostly accurate that I can refer to (in addition to the Nomicon, of course).

That page looks essentially correct to me. Nothing here should be surprising to people used to the same things that apply to C++ as well, though of course C++'s specific rules are very different. (And there's lots of "works"-on-my-machine C++ code out there.)
