Not unsafe operation in unsafe block

Can safe operations within an unsafe block that do not need to be in the unsafe block lead to bugs or undefined behavior?

That's a very abstract question, but I would say the answer is yes. The invariants needed to make an unsafe block sound don't necessarily end at the unsafe block. Often they end at some higher privacy boundary, such as the containing module.

4 Likes

Not directly; unsafe blocks have only one effect: to enable the use of unsafe operations. The only downside of including additional safe operations is that it obscures which operations are actually unsafe and therefore require closer scrutiny from reviewers/programmers.

6 Likes

That's an easy one: yes. Demo.

No.

Or at least, if it leads to a bug, then the bug would also be there if you moved the safe operation outside the unsafe block. So the unsafe block cannot be the cause.

4 Likes

That segfaults without the println though.

So I disagree with your conclusion; I think the answer is no... with the caveat that unsafe requires its guarantees to be upheld for it to remain sound (I think I'm using that term appropriately here).

I don't think that's correct. Safe expressions (either inside or outside of unsafe blocks) can lead to UB if there's some other, corresponding, incorrect unsafe code. More precisely, in this case, the immediate manifestation of the error may be located in safe code (as in my example above). One can argue (and I'd agree) that as far as the root cause is concerned, it's still the fault of the unsafe code.

2 Likes

No, you are wrong. That's trivial to "fix": Playground – this doesn't segfault (nor does it get flagged by Miri) if you remove or comment out the println.

1 Like

Well I'll be, fair enough!

I guess something in the drop implementation of Vec leads to a segfault, as well as printing (or doing almost anything else with the Vec)?

set_len(1024) -> end of main() -> SEGFAULT

set_len(1024) -> set_len(0) -> end of main() -> OK

set_len(1024) -> print -> SEGFAULT

We're building a distinction between:

  • violating constraints that safe code relies on (in the playground, that's setting the Vec's len greater than its capacity)
  • triggering undefined behavior

It seems once the violation has happened, safe code can lead to undefined behavior - regardless of whether the safe code is inside or outside an unsafe block.

2 Likes

Yes, dropping uninitialized memory is 100% UB.

Yes, of course, that's the whole point. A crash or the triggering of UB doesn't only happen within a literal unsafe block. Incorrect unsafe can have arbitrarily bad consequences, including action at a distance. If you think about it, this really couldn't work any other way: all safe code ultimately has to rely on something (the compiler) or someone (the programmer writing unsafe) proving some invariants in an unsafe context, because a real machine is much less constrained (more lenient) in what it allows you to do with it, some of which is just plainly incorrect. And the leaf of the tree, so to speak, is always such a low-level, inherently unsafe context.

Similarly to how a compiler bug doesn't necessarily mean that the compiler crashes, a bug in unsafe doesn't mean that the code crashes in an unsafe block.

2 Likes

Extremely nitpickily: moving code in or out of a block (unsafe or not) can change drop order and scoping so as to introduce or remove a bug. So, I think the most pedantically precise answer to the original question is that changing

{
    some_safe_code();
}

to

unsafe {
    some_safe_code();
}

or vice versa will never introduce a change in program behavior, but when you consider moving code, you have to consider the consequences of the block (whether it is an unsafe block or not).

4 Likes

Being even more nitpicky: assuming the program compiles at all, adding or removing unsafe cannot change the behaviour. But if some_safe_code is actually an unsafe fn some_safe_code(), then removing the unsafe can make it fail to compile.

Well, yes; I meant some_safe_code(); to stand in for any statements at all of actually safe code.

1 Like

The thing to understand is that unsafe code can rely on properties of safe code, so you could make a change outside an unsafe block that causes Undefined Behaviour if the unsafe code is relying on it.

That said, it is normal to encapsulate unsafe operations in modules (using a suitable interface) so that whatever a client of the module does, there cannot be Undefined Behaviour. When this is done correctly the module is called "Sound".

4 Likes

…and to be precise, that's a technical "can" (i.e., it is possible to write such code), and not a legal one – i.e., you are not allowed to write such code. Since changes in safe code alone should never cause UB, by definition any unsafe code that doesn't fulfill that criterion is unsound/incorrect.

That's not how Rust is actually used. If it were, unsafe code couldn't use any of std except the unsafe parts, and we'd be marking code unsafe just to be allowed to use it from other unsafe, which would be absurd.

Unsafe code should not rely on safe code that could be supplied outside of its control (callbacks, trait implementations, Drop behaviors in generic code), but practically, it must and should use safe code when it is confident in the correctness (not only soundness) of that safe code.

8 Likes

Yes.

The whole point of having limited-scope unsafe blocks is to minimize the chances of bugs. Conversely, unnecessarily increasing the scope of unsafe blocks increases the chances of bugs.

In the extreme, a C program can be considered one big unsafe block, and that increases the chances of bugs.

Increasing the scope of unsafe blocks increases the number of places where you can make a mistake because more things are allowed in unsafe blocks that aren't prevented by the compiler.

2 Likes

I don't agree there. Here is a practical example: suppose I write an interpreter for some programming language, which parses the target language, does type checking, and generates instructions which are then executed to run the program.

The interpreter could use unsafe code to optimise the execution, on the basis that the instructions are correct, for example it could use match statements with "unreachable_unchecked" for conditions that should be impossible, however if the (safe) code which generates the instructions has a flaw, or it is changed so it is not correct, then you will have UB, due to a flaw in the safe code.

(I actually did all this, although I never triggered the UB, as I didn't run with the unsafe code enabled — but I could have. Here is an example of the unsafe code; "unsafe_panic!" is a macro which potentially calls unreachable_unchecked! if a feature is enabled.)

1 Like

Alas, that's not very important nor useful. The language designers, maintainers, and the wider ecosystem agree that unsafe should never rely on the correctness of safe code for soundness, which is unsurprising: your definition would basically render any and all definitions of "safety" completely useless and devoid of meaning.

UB in unsafe code resulting from safe code not doing as promised is still the fault of the unsafe code for not being vigilant enough. If this weren't the case, then safe code would no longer be safe at all, since modifying arbitrary safe code could cause UB at any time.

No, this is emphatically false.

See e.g. the BTreeMap docs:

You are grossly misrepresenting my argument.

  • First off, std is special. It's supplied by the authors of the language. If it's broken, then everything is broken.
  • Yeah, that's exactly the point. Of course, you can guarantee the behavior of safe code that you've written yourself, but unless you are dtolnay, you haven't written most of the code in the ecosystem, so the overwhelming majority of safe code is not something you can rely on in your unsafe blocks.