How is locking a Mutex not unsafe?

In Rust, Undefined Behavior is a specific term. It means undefined in the mathematical sense: an execution of a Rust program by definition performs no undefined operations. If your Rust source describes an undefined operation, then the execution which includes the undefined operation is also undefined in the mathematical sense.

This is what makes Undefined Behavior so problematic. We don't live in a mathematical model, and the Rust compiler functions by producing some bytes which instruct the processor to do some processing. Something still occurs if you execute the resulting binary in an environment that causes the interpretation of the Rust source to perform the undefined operation.

Unspecified, however, only has its informal meaning. But this doesn't mean that a function whose behavior is left unspecified has completely unbounded possibilities — if the function is unsafe, then fully unspecified behavior includes Undefined Behavior, but a safe function is guaranteed[1] by the language not to cause any Undefined Behavior for any input[2].

The key point of Rust's unsafe system is encapsulated unsafety. Running completely arbitrary and unspecified Rust code in one crate (in the absence of Undefined Behavior) mathematically cannot impact the execution of a different, unrelated crate. This is the thesis, so a pull quote for emphasis:

Unspecified behavior cannot impact the behavior of code in a separate crate.

Unspecified behavior is unspecified, but it is bounded to operations that the language allows the library to perform. This includes arbitrary global state changes, but only global state that is upstream of the unspecified behavior. Additionally, the privacy barrier of the crate ensures that only the global state of the crate with the unspecified behavior can change in fully unspecified ways — any crates upstream of the unspecified behavior can only be interacted with in a sound manner provided for by their public APIs. And any crate which is unrelated to the unspecified behavior will not have any of its global state change.

It is at this point that we need to assume Quality Of Implementation. Because std is distributed as part of the language, it could have special powers — we must assume that a correct implementation behaves indistinguishably from if it had no special powers. Similarly, because std is a single crate[3], we must assume that the impacts of unspecified behavior are bound to the object which misbehaves.

It is, ultimately, that last point which is under discussion. It is exactly about providing some bound on the unspecified behavior. The answer, as it is today, is that the unspecified behavior is bound only by Quality Of Implementation.

Although the std documentation purposely and explicitly leaves the behavior of recursively locking a Mutex unspecified, the standard library does (informally and implicitly) guarantee, as a matter of reasonable Quality Of Implementation, that the behavior is localized.
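To make the case under discussion concrete, here is a minimal sketch (mine, not from the std docs) of the recursive-lock situation. Writing the second lock() call is perfectly safe; what it does is the unspecified part, and in practice it panics or deadlocks rather than handing out a second guard aliasing the first:

    use std::sync::Mutex;

    fn main() {
        let m = Mutex::new(0);
        let _guard = m.lock().unwrap();
        // Safe to write, but the behavior is left unspecified: in practice this
        // panics or deadlocks; it never yields a second guard aliasing `_guard`.
        let _second = m.lock();
    }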

The Rust project has a very liberal policy that if you think there's a bug, there's almost certainly a bug. It might be considered a documentation bug that the behavior wasn't documented sufficiently, but if the behavior of the standard library surprises a reasonably informed and conscientious Rust developer, that is a bug to be addressed.

This isn't sufficient for a mathematical proof of correct behavior in the face of arbitrary use[4], but it is far more than sufficient for normal everyday development to assume reasonable QOI from the standard library.

The difference from C++ is that the C++ standard and specification is a formal document. If the standard omits a definition for some behavior, then it is mathematically Undefined Behavior by that omission. Rust does not have an official formal specification at this time. Of the official reference material, the Rustonomicon provides a semiformal description of what Undefined Behavior is in Rust. If behavior is omitted from the standard library documentation, that behavior is merely unspecified by omission — there remains a passive guarantee that no Undefined Behavior occurs if you satisfy the documented safety conditions of the unsafe APIs.

Could we do better? Yes, obviously. We can do a better job of defining Undefined Behavior[5], we can provide a usable definition of the dynamic borrowing rules, we can avoid the term unspecified as too similar to Undefined, we can provide tighter bounds on arbitrary misbehavior where we explicitly allow it, we can do many things. This is mostly a matter of project throughput and figuring out how to structure and deliver this information such that it actually addresses the pain points.

Here's a reasonably short differentiator: If your program manifests arbitrary but defined misbehavior, then you can use Rust tooling to diagnose it. If your program manifests Undefined Behavior, then the Rust tooling can become useless and you may need to diagnose the misbehavior at the next level on the tower of weakenings. (However, due to an immense amount of work, a smattering of good luck, and the practicality of discrete execution rather than pure mathematical models, the tooling built around C/C++ and which extends to Rust will often work well enough even in the face of Undefined Behavior.)
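To make that differentiator tangible, here is a tiny sketch (mine) of the UB side: the unchecked read below is out of bounds, so a normal build may appear to work, crash, or print garbage, while running it under Miri (for example with `cargo +nightly miri run`) reports the undefined operation directly.

    fn main() {
        let v = vec![1, 2, 3];
        // Out-of-bounds unchecked indexing is language UB. A regular build gives
        // no reliable diagnosis; an UB-aware interpreter like Miri stops right here.
        let x = unsafe { *v.get_unchecked(3) };
        println!("{x}");
    }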


  1. This is unfortunately an oversimplification: the actual property which the language guarantees is that all executions do not cause any Undefined Behavior. It is a property of the library that safe functions cannot be used to cause Undefined Behavior.

    The definition of the language and the standard library are unfortunately very intertwined. For C and C++, they are quite literally the same thing. For Rust, the standard library gets special permissions not available to other code by virtue of being distributed with and bound to specific distributions of the compiler. ↩︎

  2. There are, of course, some caveats. If you never write unsafe, none of them apply. The most notable caveat is the concept of safety invariants and the ever-problematic library UB. ↩︎

  3. This isn't actually true — the standard library implementation is broken down into (and exposed as) three different crates (std, alloc, and core), and internally uses other crates. However, due again to its special privilege, this is all essentially one crate for the purpose of coherence and safety boundaries. ↩︎

  4. And basically nobody does that, even software certification programs. Software certification is typically restricted to proving correctness when used correctly. Rust is unique in providing such strong guarantees in the face of misuse (where the guarantees are possible to break in the first place). ↩︎

  5. For better or for worse, Undefined Behavior is the term of art for mathematically undefined behavior which the compiler assumes does not happen; replacing that term would do more harm than good. ↩︎


There are things that are well-defined that wouldn't be considered even remotely memory-safe. Let's say I'm working with memory-mapped files on Linux. In other words, I'm working with an array of bytes that could potentially be modified or deleted by any process at any time. There is nothing even remotely safe about it. Of course, none of that matters if the Rust virtual machine can't observe that behavior. For example, you can create a shared reference into that memory just fine. You have to a) use an interior mutability type so the Rust virtual machine knows the memory could be modified by other code, and b) install a SIGBUS handler that swaps in valid pages in case somebody decides to delete the memory that your references point to.

Doing that is implementation-defined behavior (in this case, relying on Linux's signal handlers and paging implementation) and is not something even remotely memory safe; after all, you are effectively randomly rewriting an arbitrary portion of memory, and other processes could potentially do the same. However, in this case, it's possible to handle sections of memory you have references to getting randomly corrupted without UB if you plan for it. You can still reason about your code, even if chunks of your data get overwritten with random garbage.

Sure, maintaining memory safety might be the primary goal of avoiding UB, but it's not the only goal, and it isn't necessarily even a requirement in some cases.

I'm not convinced that this (the latter assertion) is the case. At least, not with Cell — the Rust VM is within its rights to assume that cell.get() == cell.get(), which may not be the case if that memory gets written to by another process.

Even with atomic accesses, I'd be wary. The OS will presumably ensure synchronization (to the point where that matters at the processor level), but you're essentially relying on the compiler to fail to prove that pointer provenance has not been leaked to the environment. Essentially, you need to tell the VM somehow that the environment could be writing to the memory region in parallel with your own program.

The proper way to handle this is of course with volatile pointer read/writes.
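For illustration, a minimal sketch (mine, with a hypothetical read_shared_byte helper) of what volatile pointer reads look like, so the compiler never assumes that two reads of the same location yield the same value:

    use std::ptr;

    /// Hypothetical helper: read one byte from a memory region that something
    /// outside the program (another process, the kernel) may rewrite at any time.
    ///
    /// Safety: `addr` must be valid for a one-byte read for the duration of the call.
    unsafe fn read_shared_byte(addr: *const u8) -> u8 {
        // A volatile read is never elided or merged with a neighboring read.
        unsafe { ptr::read_volatile(addr) }
    }

(Whether volatile alone makes cross-process mutation fully well-defined is exactly the kind of question the rest of this subthread is poking at.)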


In any case, yes, you cannot violate dynamic memory safety without causing UB. If this example does not cause UB, it is memory safe, because it's equivalent to the Rust VM semantics you've described, which includes at least one thread allowed to write to that region arbitrarily.

Memory safety refers to freedom from data races and use-after-free. It doesn't have anything to do with memory changing arbitrarily — that is just a symptom of violating memory safety, not the violation itself.


While your description of the issue is really superb and clear, it ignores the basic issue which made Rust popular:

Most laymen fear, despise and outright hate math — with passion.

When you start talking math their eyes glaze over and they find an excuse not to hear you.

This is the dirty little secret to Rust's success and also its biggest liability: Rust, an ML descendant, successfully masqueraded as a normal, haphazardly-mix-everything-you-can-dream-about C++ wannabe… but this also affected expectations for the documentation.

This masquerade is both Rust's greatest asset (since laymen can learn Rust without fear if they don't know it uses many advanced mathematical theories) and also the reason why “it's how the theory of compilers uses the term, thus we are stuck with it” is not a valid excuse.

That's why it's not a good idea to say “for better or for worse, Undefined Behavior is the term of art for mathematically undefined behavior which the compiler assumes does not happen; replacing that term would do more harm than good”.

You are invoking the forbidden word! Shame on you! And if you say that “undefined behavior is precisely defined”… the layman stops reading and asks for a definition. Of behavior, of course, because what else can be defined in “something behavior”?

What we need to think about is whether it is possible to describe what you explained without invoking that forbidden "math" word. And, preferably, without invoking the “undefined behavior” combo. The combination undefined in the mathematical sense is just too toxic to even be mentioned.

Rust's unsafe code is a very good replacement for the totally layman-impenetrable TCB… can we do something like this for “undefined behavior”? I don't know what, but “behavior” has to go. There is no behavior to “define”, thus… can we stop that nonsense?

From a mathematical POV, negative profit makes perfect sense, but laymen invented a different word for that. Same here. “Undefined behavior” just doesn't work. We need something else.

“Forbidden condition” or maybe “impossible situation” or something else. Just no “negative profits”, please. “Red ink” (or any combination of words which makes absolutely no sense whatsoever) is acceptable.

We do have established terms which match more closely how your examples suggest you want to communicate the concept, but the problem is that there's actually a semantic difference here.

An unsafe API defines its preconditions; if you satisfy the preconditions, then calling the function is safe.

Some APIs will also define invariants. Preconditions must be true when calling a function; invariants must always be true.

Sometimes you'll see invariants used as a shortcut for both; similarly just safety requirements.
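Here is a tiny sketch (my own hypothetical API, not anything from std) of how those terms show up in practice, with the preconditions spelled out in the conventional `# Safety` section:

    /// Doubles the value behind `p`.
    ///
    /// # Safety
    /// Preconditions: `p` must be non-null, properly aligned, and point to a live
    /// `u32`, and no other reference to that `u32` may be used during the call.
    unsafe fn double_in_place(p: *mut u32) {
        // Sound given the documented preconditions; violating them is the
        // caller's bug, not this function's.
        unsafe { *p *= 2 }
    }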

An API is sound if it cannot be used to cause UB, assuming all preconditions and invariants are upheld. An API is unsound if calling it can cause UB. (Yes, unsafe can also be unsound if its documented requirements are not sufficient to prevent UB.)

So you could replace UB with "violated requirements" or similar ... but this isn't communicating the correct thing. At the point where the requirements of an API have been violated, there isn't yet UB. UB is an operational concept (thus "behavior") that occurs when the Abstract Machine's requirements are violated.

And talking about the Abstract Machine is absolutely going to go over people's heads, and faster than trying to be precise about UB. You can at least partially concretize UB, but the AM is abstract by definition and doesn't "really" exist.

What we want to discuss really is the operational property. UB is caused by violated requirements, but the two are distinct. Also, it's already painful enough to try and draw the distinction between violating a library's requirements causing "library UB" and violating the AM's requirements causing "language UB". This is also an important distinction to make, and it's imho made significantly harder to discuss if we fail to separate how we discuss the condition violation from the resulting UB.

It's also just much more awkward to say "or otherwise arbitrary behavior, except that it will not violate any safety requirements" than just "or otherwise arbitrary defined behavior".

Yes, this is all horribly subtle. But there almost certainly aren't better words to describe these differences; if they existed, almost certainly someone would have discovered them in the multiple textbooks worth of discussion UB has generated, and if they were unambiguously better, would have caught on.

It's not impossible, but it's not for lack of searching that the terms exist as they do today.


The main issue with "trusted" versus "untrusted" is that in common usage, it's better to be trusted than to be untrusted, but w.r.t. TCB, it's better (all other things being equal) for code to be untrusted. unsafe gets that right, in that it's plainly better to be safe than to be unsafe, but it's still kind of backwards — the meaning of an unsafe block is actually to "handle the unsafe effect," or in human terms, you're stating that the contained code is actually safe to execute. (unsafe is absolutely the correct parity for function signatures, however.)

Undefined Behavior is at least as good a term in isolation as a meaningless word combination, because if you can treat "Red Ink" as an opaque term of art, you can treat "UB" as an opaque term of art. And it's for this reason I actually prefer leaving the term as just UB: not as an initialism, just UB, which doesn't stand for anything but UB.


I think the least problematic approach to teaching UB (certainly it's what made it click for me) is to say that "valid" Rust programs don't violate safety, and that the compiler, etc., assumes you wrote a "valid" program. The problem is just to clarify that "valid" means "part of the language", not "does the correct thing".

I don't think there's a problem there. Unsafe code is not safe to execute, because it can't be checked automatically. Hence, it must be verified by the human, who might make a mistake, hence it is unsafe. I.e., it is less safe (or more dangerous) than sticking with constructs that the compiler can automatically reason about.

I think you are mixing up "safe" with "sound"/"correct". Unsafe code can be sound (correct) if you got your analysis of invariants right. It can also be unsound (incorrect) with nobody telling you – hence, it's dangerous to write unsafe code, and it's not safe.

The thing with safe code is that it's always sound. And if it weren't, then the compiler would slap you on the wrist and wouldn't let you do the incorrect thing. Hence, you are safe.

To sum up:

  • safe = guaranteed soundness, ie. safety
  • unsafe = soundness not guaranteed, code might be sound or unsound, hence you are in danger and you need to be careful.

The unsafe keyword just about perfectly communicates the essence of this.


The unsafe keyword is pretty good, but “undefined behaviour” is awful.

Because this almost automatically pushes developers in the wrong direction. Approximately:

  • Oh, it's behaviour, just it's not [yet] defined. Got it.
  • But hey, why can't we define it?
    It's obvious that at least in certain situations it can be defined like this: “…”.
  • Stupid compilers don't define it like this: “…”? Really?
    Is that a conspiracy or are compiler developers just stupid?

When we are dealing with C++ this is exacerbated by the fact that so many UBs there are simply lazy: people were sure for decades that certain behaviours were well-defined, until compilers started breaking their old code.

We really need a new term which doesn't include “behaviour” in its name. You may write as many such blog posts as you want; people just don't change their opinion so easily.

And the automatic “understanding” that “undefined behaviour” == “behaviour which is not defined… yet” is really strong. I would say that 9 out of 10 (if not 10 out of 10) people who eventually “get it” pass through that stage.

Can we try to skip it with better name?


I think Arbitrary Behaviour would be a better name. It would also work if we just said that the program is forbidden or invalid, but at this point there is too much legacy to scare people off with such simple words. Imho Rust's "violates memory safety" is good enough. It is essentially equivalent to UB, more readable, and less likely to be mistaken for something it isn't.

Wouldn't it be confusing for unreachable_unchecked to "violate memory safety", then?

Perhaps just "violation of the soundness rules"


It wouldn't change anything. If you kept the word “behaviour”, it would continue to generate useless discussions like this one.

We need something which would trigger the proper associations: we are talking not about what code would be generated, but about limitations placed on the programmer.

Something like “forbidden code” or “safety violation in code” or something like that.

And it should start not with the notion that we cannot define all behaviours, but from cases where we don't define things on purpose.

For example, we may talk about something like a simple enum:

    enum Foo {
        BAR = 0,
        BAZ = 1
    }

And about how our inability to reason about the behaviour of assembler code (Rust wants to have inline assembler, right?) makes it hard to deal with Option<Foo>.

We may discuss different options (generate a panic, verify that the value returned from assembler is correct, etc.) and conclude that asking the developer to just, you know, return a valid value for Foo if you promised to return Foo was the chosen solution. It's not too onerous from the developer's POV, it makes the compiler's life easier and, more importantly, any other solution would be slower and would generate more code.
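A sketch of what upholding that rule can look like (mine; it reuses the Foo enum from above with idiomatic Bar/Baz names and a hypothetical foo_from_raw helper): instead of blindly claiming that a byte which came back from assembler is a Foo, check it first.

    #[repr(u8)]
    #[derive(Debug, Clone, Copy, PartialEq)]
    enum Foo {
        Bar = 0,
        Baz = 1,
    }

    /// Hypothetical wrapper for a byte produced outside the language (e.g. by
    /// inline assembly): it upholds the rule by checking before claiming `Foo`.
    fn foo_from_raw(byte: u8) -> Option<Foo> {
        match byte {
            0 => Some(Foo::Bar),
            1 => Some(Foo::Baz),
            // 2, 3, ... are simply not `Foo`s; pretending otherwise (e.g. via a
            // transmute) is the forbidden step the rule exists to prevent.
            _ => None,
        }
    }

    fn main() {
        assert_eq!(foo_from_raw(1), Some(Foo::Baz));
        assert_eq!(foo_from_raw(2), None);
    }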

Establish that we are not talking about behaviours. Not at all. We are talking about rules developers have to follow to help the compiler. Just because asking them for cooperation was considered to be the best choice. Not because we couldn't define behaviour or because our small brains couldn't fathom how that behaviour may be defined. But because asking developers for cooperation is natural if we can not prove certain things.

Then we can expand it from there to the undefined values for things like MaybeUninit<&SqlConnection> and, eventually, to MaybeUninit<[u8; N]>.
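For the MaybeUninit end of that progression, a small sketch (mine) of how the same deal is phrased there: the forbidden step is claiming initialization you never performed.

    use std::mem::MaybeUninit;

    fn main() {
        // Bytes that are not yet claimed to be a valid `&str`.
        let mut slot: MaybeUninit<&str> = MaybeUninit::uninit();
        // Calling `assume_init` *here* would be the broken promise.
        slot.write("hello");
        // Now the promise is actually true, so we may make it.
        let s = unsafe { slot.assume_init() };
        println!("{s}");
    }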

At this point we can say that such code is called “code which triggers undefined behaviour” in C/C++ and some other languages, but that's secondary: we asked developers for cooperation, we expect them to obey certain rules; thus, obviously, if these rules are violated the resulting code may behave erratically.

But the core is that request for cooperation. It doesn't sound obvious that the developer has to do anything when you say “undefined behaviour” or “arbitrary behaviour”. It doesn't sound like a request for help, but more like an intellectual challenge: oohh, you say it's “arbitrary behaviour”, but can you guess what behaviour there may be? I'm smart, I can, why can't you?

How does returning 2 instead of a valid value for Foo “violate memory safety”? Maybe if we removed “memory” from there, it would work…

Developers should ensure that their code never violates safety requirements or else the compiler may miscompile it… sounds about right IMO. There is no “behaviour” anywhere, which is good.

We don't want to scare anyone! That's the point! We want cooperation, not fear! Maybe we can mention the tower of weakenings. But we shouldn't talk about possible behaviours, because that's not the point of the discussion.

I don't think so. If your program ever tries to execute unreachable_unchecked, then that's an instant violation of safety.

Thus the compiler may use the fact that unreachable_unchecked will never be executed to remove code which handles complicated cases which can't ever happen in the real program because of its structure.

For example, if you do the following:

    if i > 4 {
        // SAFETY: we promise the compiler that `i > 4` can never actually happen here.
        unsafe { core::hint::unreachable_unchecked() }
    }
    …

The compiler may remove the core::hint::unreachable_unchecked() call, the i > 4 check, and also all of the code below that check which handles values of i larger than 4.

I'm specifically focusing on "memory". The point is, memory unsafety is not the only kind of unsafety the programmer has to prevent.


Oh, absolutely. I think it's important to establish the fact that the “safety rules” are there not to make the language memory-safe but to protect the developer from other errors.

E.g. normal, safe Rust doesn't have NULL, and that's an immense help for developers. It's one of the strongest points of Rust. And the fact that &Foo and Option<&Foo> are both pointer-sized is important, too (otherwise people would always complain about memory waste). But that only works if no one tries to forcibly inject NULL into &Foo!

But in unsafe Rust it's trivial! &Foo and Option<&Foo> have the same size (we explicitly wanted that, remember), thus it's trivial to forcibly shove an Option<&Foo> into an &Foo and then… what should happen? Should the compiler pepper the code with checks which are never supposed to be triggered, but still slow things down and take space? It's definitely possible, but is it feasible in a language tailored for efficiency? We already asked the developer to add the unsafe mark; maybe it's time to trust them a bit more?
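A concrete sketch (mine) of both halves of that point: the guaranteed size equality that makes Option<&T> free, and the forbidden move that unsafe code makes trivially easy (shown only in a comment, since actually running it is exactly the safety violation being discussed):

    use std::mem::size_of;

    fn main() {
        // The guarantee the post relies on: `None` hides in the one bit pattern a
        // reference can never legally have (NULL), so there is no size cost.
        assert_eq!(size_of::<&u32>(), size_of::<Option<&u32>>());

        // The flip side: in unsafe code the forbidden move is one line, e.g.
        //     let forged: &u32 = unsafe { std::mem::transmute(None::<&u32>) };
        // It type-checks, and nothing stops it at runtime, which is exactly why
        // the rule has to rest on the developer rather than on inserted checks.
    }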

But with great power comes great responsibility… the developer is supposed to be careful when writing unsafe code.

Reaching unreachable_unchecked() leads to Undefined Behaviour, and if it's undefined, it may also violate memory safety. It's more of a "may violate" than a "does violate", but that's not an important difference. After all, Undefined Behaviour also includes the possibility "your code works without bugs and is 100% secure because the compiler was very kind to you".

But then you get people who are annoyed that "obviously correct" things are forbidden, and think they know better than the compiler. Rules are guardrails for bad programmers, and all that.


At this point you can safely say that they can propose a change to the rules on IRLO, and then the rules may be changed. In fact, if you recall, it looks as if the rules may indeed be changed (relaxed a bit: creation of an invalid reference may be allowed, although any attempt to use it would still be considered a “safety violation”).

But as long as rules exist they have to be followed.

It doesn't matter whether they think so or not. If we start with the explanation that the rules are there not because compiler authors are stupid, but because that's the best set of rules we could invent, then the fact that “they know better” shouldn't matter. They can do whatever they want and they will be treated properly and adequately, as explained in that article.

That's fine. Some people will always feel they are above the rules. But people always have a choice whether to accept such people or not.

I was really happy when I read the aforementioned article, because the Rust community has shown what it thinks about such people.

Yes, sometimes you have to reject really competent and bright people because they cannot work together with others. That issue cannot be fixed with a change in the documentation, I'm afraid. We can only try to reduce the tension; we cannot eliminate it.

