You might be aware of this already, but for other readers, here is a quick clarification on this point.
As an aside, writing this motivated me to look into the story of data races in so-called “memory safe” languages that don’t have UB in the first place. The answer commonly tends to be, roughly paraphrased: “the behavior is defined, but so weakly defined that you should absolutely avoid data races anyways”, at least for Java, AFAICT. Without having tried to understand their memory model in any detail, I would imagine that data races make it really easy for your code to generate objects that are in a horrendously inconsistent state; but since the language manages all memory through the same garbage-collected heap, no matter how horrendously broken your data is, you still won’t have double-frees or use-after-frees. Of course, such an approach of making data races no longer UB could never work in Rust: there’s no way to correctly track ownership in a data race, and since in Rust ownership==memory-management, correctly tracking ownership is a fundamental necessity.
I think the only correct answer to this is: you are comparing terms that cannot be compared in this manner in the first place. “Undefined behavior” is a possible program run-time behavior, or the behavior of a function when executed with a specific input. On the other hand, “unsound” is a property applicable to some whole API, or some programming language feature, or maybe to the entirety of safe Rust, or any combination of these. Well… so maybe in the special case of “soundness” of a single function without any input or access to global state, this boils down to a single question about “undefined behavior”, so that in this case “has undefined behavior” and “is unsound” are the same.
Undefined Behavior
If a particular program is run with a particular set of inputs (and a particular set of choices for any nondeterminism/“randomness” that comes up during execution), then this program run can behave as “undefined behavior”.
The term “unsound” can be used as soon as we start to generalize the things we are talking about. Or… well… arguably…
…as a first step, we could consider a single program with a single input, but all possible choices of nondeterministic/random behavior during execution. If any such choice of nondeterminism leads to “undefined behavior”, arguably, the correct term to use is still “undefined behavior”. The point being that the behavior of this program on this input is undefined, since there’s no way to rule out the possibility of undefined behavior, and any behavior where UB is one possible outcome is UB itself. In my opinion, it still makes sense to distinguish these cases somewhat, if for nothing other than describing the usefulness and limitations of miri, which can, as far as I understand, generalize over certain kinds of nondeterminism/randomness, but of course not over all of them. If you write a function that generates a random u32 number and triggers UB only if the number is 42, then realistically, miri is not going to catch that case. The same applies if you trigger UB only when certain allocations happen to be randomly more aligned or less aligned in certain ways.
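To make the “random u32” case concrete, here is a minimal sketch. The names random_u32 and ub_if_42 are made up for illustration, and the randomness comes from the standard library's randomly seeded hasher only because that avoids an external crate. Almost every run of this program is fine; only the one where the number happens to be 42 has UB.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

/// Hypothetical helper: derives a pseudo-random number from the randomly
/// seeded hasher in std, so that no external crate is needed.
fn random_u32() -> u32 {
    RandomState::new().build_hasher().finish() as u32
}

/// Returns `n` unchanged, but has undefined behavior for exactly one input.
fn ub_if_42(n: u32) -> u32 {
    if n == 42 {
        // Deliberately wrong: this branch IS reachable, so executing it is UB.
        unsafe { std::hint::unreachable_unchecked() }
    }
    n
}

fn main() {
    // UB happens only if the random number is exactly 42, which a single
    // execution (or a handful of executions under a checker) will
    // realistically never hit.
    let n = random_u32();
    println!("got {}", ub_if_42(n));
}
```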
As a second step, we could also consider more factors that might be beyond the control of the program, and still speak of “undefined behavior”. E.g. the version of rustc: Code might violate safety requirements of the Rust standard library in ways that lead to actual UB only in future versions of Rust. This is sometimes called “library UB” and contrasted with the term “language UB” for the “actual UB” I was talking about in the previous sentence.
Now going away from “undefined behavior” terminology. If we consider still only a single program, but all possible inputs, it’s reasonable to ask “does the program avoid UB for all possible inputs?” I’m not actually sure what the correct terminology for this concept is. Maybe “sound”? Or “memory-safe”? Just “safe”? As in “the program is safe” / “… sound” / “… memory-safe”. Here, “safe” sounds a bit like it’s referring to “safe vs unsafe Rust code”. We can avoid this terminology problem by ignoring programs with input. Every input could possibly be hard-coded anyways, once we generalize over safe programs, so let’s only focus on programs without any input. Such programs would still simply have either “undefined behavior” or not.
So, as discussed above, “(has) undefined behavior” is a property of a program, possibly with some specific input. Similarly, in order to determine where this undefined behavior comes from, we can talk about individual functions. The principle is the same. A concrete function for a concrete set of inputs (and while observing some concrete global state) can trigger undefined behavior, so that a program [for simplicity, program with no input] that executes that function with those inputs in that global state will have undefined behavior.
Unsoundness
As mentioned above, (un)sound is mostly a property of whole APIs (however small or large); which could be a single function, but usually involves many ways of calling such a function or API, with various inputs, multiple times, in various orders, doing various other operations in-between.
Let’s quote the UCG’s definition:
we say that a library (or an individual function) is sound if it is impossible for safe code to cause Undefined Behavior using its public API
This means that for soundness, we not only need to understand what “undefined behavior” is, which was discussed above, but also what an API is, and what “safe code” is. The TL;DR of course simply is that “sound” is like a universally quantified “has no UB”: Similarly to the discussion above where we considered all possible inputs to a program, we can consider all possible usages of a function or an API, and when there’s some possibility to create UB this way, it’s called unsound. For a single function, that’s essentially “consider all possible inputs and all possible (relevant) global state(s)”.
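As a small illustration of “universally quantified”, here is a sketch with two made-up functions, first_unsound and first_sound. The first one has no UB for the input used in main, yet it is unsound, because some other input reachable from safe code (an empty slice) would cause UB.

```rust
/// Unsound: the signature is safe, but an empty slice makes the unchecked
/// index read out of bounds, which is UB.
pub fn first_unsound(values: &[u8]) -> u8 {
    unsafe { *values.get_unchecked(0) }
}

/// Sound: every input that safe code can pass is handled without UB.
pub fn first_sound(values: &[u8]) -> Option<u8> {
    if values.is_empty() {
        None
    } else {
        // The emptiness check above guarantees that index 0 is in bounds.
        Some(unsafe { *values.get_unchecked(0) })
    }
}

fn main() {
    let data = [7u8, 8, 9];
    // No UB for this particular input, even with the unsound function.
    assert_eq!(first_unsound(&data), 7);
    assert_eq!(first_sound(&data), Some(7));
    assert_eq!(first_sound(&[]), None);
    // `first_unsound(&[])` is also safe code, but running it would be UB,
    // so it is deliberately not called here.
}
```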
But the “all possible usages of a function” part involves the concept of “safe code” in Rust. Since, due to language bugs, it has never actually been true that it’s impossible to cause Undefined Behavior using safe [Rust] code in the first place, the above definition cannot be taken literally in a mathematical sense; otherwise, all APIs would be trivially unsound. Maybe an intuitive fix to this problem could be to instead require that “the API will most likely be sound once all language-level soundness issues are fixed”. We need to weaken the statement to “most likely”, because we cannot actually know how Rust’s soundness issues are going to be fixed; we can only guess, and (reasonably) hope we didn’t rely on any assumptions that those fixes will break.
Also, “safe code” typically includes usage of the standard library, which contains “safe” abstractions over unsafe code used internally. But it might include even more. Perhaps we also consider all sound Rust functions, potentially using unsafe internally, as “safe” Rust? No, that won’t work; the definition would become circular: we’d be asked to already know what sound Rust code is in order to define what “soundness” means in the first place. But there’s also a problem if we don’t consider any third-party sound Rust functions that might use unsafe internally. The problem is that if we don’t consider them, then two APIs that are both individually sound by definition could become unsound when used together.
One example of this problem is the API offered by the crate replace_with. The basic idea is to allow application of a fn(T) -> T on a &mut T reference, while correctly handling the case that the function might panic (by various strategies such as writing back a default/fallback value or aborting the program).
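A minimal sketch of that idea, using the “abort the program” strategy, might look as follows. This is not the crate’s actual source; the names replace_with_or_abort_sketch and AbortOnDrop are made up here, and the code only illustrates the general technique of temporarily moving the value out from behind the reference.

```rust
use std::{mem, process};

/// Guard that aborts the process if it gets dropped, i.e. if the closure
/// panics and unwinding starts while `*dest` is logically moved out.
struct AbortOnDrop;

impl Drop for AbortOnDrop {
    fn drop(&mut self) {
        process::abort();
    }
}

/// Sketch only: applies `f` to the value behind `dest` and writes the result back.
pub fn replace_with_or_abort_sketch<T, F: FnOnce(T) -> T>(dest: &mut T, f: F) {
    let raw: *mut T = dest;
    let guard = AbortOnDrop;
    unsafe {
        // Move the value out; until the write below, `*dest` must not be observed.
        let old = raw.read();
        let new = f(old);
        // Put a valid value back without dropping the stale one.
        raw.write(new);
    }
    // `f` returned normally, so the guard must not fire.
    mem::forget(guard);
}

fn main() {
    let mut s = String::from("sound");
    replace_with_or_abort_sketch(&mut s, |old| old + "ness");
    assert_eq!(s, "soundness");
}
```

The guard matters because, without it, a panic inside the closure would unwind past a reference whose target has already been moved out, and the caller could then observe or drop that value a second time.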
This API could be considered sound because it cannot cause UB using safe code. But then, imagine another API which has one type MyType and two functions fn provide_ref(callback: fn(&mut MyType)) and fn must_not_be_called(MyType), where must_not_be_called causes UB when called. This API, too, could be considered sound, since there is no safe Rust code that allows you to obtain an owned value MyType by calling provide_ref. (Assume MyType has only private fields, no constructors, really there’s no further API here at all.)
But together, these APIs suddenly become unsound. You can call the replace_with API inside of the callback passed to provide_ref to obtain an owned MyType value after all; then call must_not_be_called and trigger UB.
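Here is a sketch of how the two APIs combine. Everything in it is an assumption made for illustration: other_api is a made-up module, replace_with_sketch is a simplified stand-in for the replace_with crate rather than its real code, and must_not_be_called returns ! here (a small deviation from the signature above) to keep the example short. The exploit function contains no unsafe code of its own, and it is deliberately never executed, because running it would be UB.

```rust
use std::{mem, process};

mod other_api {
    /// No public fields and no public constructor.
    pub struct MyType {
        _private: (),
    }

    /// Only ever hands out a `&mut MyType`, never an owned value.
    pub fn provide_ref(callback: fn(&mut MyType)) {
        let mut value = MyType { _private: () };
        callback(&mut value);
    }

    /// Causes UB when called; relies on nobody ever owning a `MyType`.
    pub fn must_not_be_called(_value: MyType) -> ! {
        unsafe { std::hint::unreachable_unchecked() }
    }
}

/// Aborts the process if the closure passed to `replace_with_sketch` panics.
struct AbortOnDrop;

impl Drop for AbortOnDrop {
    fn drop(&mut self) {
        process::abort();
    }
}

/// Simplified stand-in for the `replace_with` idea (abort-on-panic strategy).
pub fn replace_with_sketch<T, F: FnOnce(T) -> T>(dest: &mut T, f: F) {
    let raw: *mut T = dest;
    let guard = AbortOnDrop;
    unsafe {
        let old = raw.read();
        raw.write(f(old));
    }
    mem::forget(guard);
}

/// Entirely safe code that would nonetheless reach the UB.
fn exploit() {
    other_api::provide_ref(|r: &mut other_api::MyType| {
        replace_with_sketch(r, |owned: other_api::MyType| -> other_api::MyType {
            // The closure receives an owned `MyType` after all.
            other_api::must_not_be_called(owned)
        });
    });
}

fn main() {
    // The point is that `exploit` type-checks; it is intentionally not called.
    let _compiles: fn() = exploit;

    // Each API used on its own behaves fine.
    let mut n = 1;
    replace_with_sketch(&mut n, |x| x + 1);
    assert_eq!(n, 2);
    other_api::provide_ref(|_r: &mut other_api::MyType| {});
}
```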
The right interpretation, in my opinion, is that, somehow, the abovementioned APIs should both be deemed unsound, until some official source (or some large consensus) decides (more or less arbitrarily) to define (at most) one of them to be sound. In this case, as far as I’m aware, this has happened in some form, with people generally being of the opinion that replace_with is sound, and consequently, the other API definitely isn’t.
Only now, while writing this answer, did it occur to me that there might be some difficulty in formally defining the concept of “soundness” in a way that has the intended effect that the two APIs described above would initially both be considered unsound. Maybe someone else has ideas how that could be accomplished, mathematically? Anyways… ignoring this problem, a precise definition of “soundness” should then presumably also mention a set of API patterns that are defined to be sound, which includes the standard library, but possibly also additional external things, such as replace_with.