Not unsafe operation in unsafe block

This seems demonstrably false? Here's a trivial example:

let length = 7;  // safe code
let mut v: Vec<i32> = Vec::new();
unsafe {
    // This is unsound, because of the safe code above
    v.set_len(length);
}

It feels like a lot of devil's advocacy is happening in this thread (I certainly contributed! :wink:)

But I think the message is getting through. I won't summarise it again, for I'm sure several people will disagree on some minute detail and we'll be here for weeks :smiley:

2 Likes

Well, whether it's inside an unsafe block or not is irrelevant, but the safe code can still matter.

You might be interested in this classic post: The 'Tootsie Pop' model for unsafe code · baby steps

1 Like

What I think is agreed is that crates should be sound: either it must be impossible to cause UB by calling the crate's functions, or any function that can cause UB when called must be marked unsafe, alerting the caller to the conditions that must be satisfied to avoid UB; if those conditions are satisfied, UB will not occur.
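As a minimal sketch of what that marking looks like in practice (a hypothetical function, not taken from any particular crate): the unsafe keyword on the function plus a documented safety contract is what alerts the caller, and the caller then needs their own unsafe block to accept that contract.

/// Returns the element at `index` without bounds checking.
///
/// # Safety
/// The caller must guarantee that `index < slice.len()`;
/// violating that condition is undefined behaviour.
pub unsafe fn element_at(slice: &[u32], index: usize) -> u32 {
    // SAFETY: the caller upholds `index < slice.len()`, as documented above.
    unsafe { *slice.get_unchecked(index) }
}

fn main() {
    let xs = [1_u32, 2, 3];
    // SAFETY: 1 < xs.len(), so the documented condition holds.
    let second = unsafe { element_at(&xs, 1) };
    assert_eq!(second, 2);
}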

The means by which soundness is obtained is not defined or agreed on. A crate may use unsafe code to improve performance, or because it is unavoidable (say, when calling a library written in C), but it is a risky business, so crate authors (and consumers, for that matter) need to check and test the crate carefully.

Ultimately it is a matter of judgement and policy as to how unsafe is used. It may depend on performance goals and the context in which a program is used. A game for entertainment, or a "competition" program (say a chess-playing program) can reasonably tolerate more risky unsafe code than a program that is operating heavy machinery, a nuclear power station or flying an aircraft.

4 Likes

This elides a really important detail; unsafe code should never rely on the correctness of arbitrary (e.g. user-supplied) safe code - it's perfectly fine for it to rely on the correctness of safe code, as long as the safe code in question is written by someone who knows what extra preconditions the unsafe code depends upon.

For example, std::vec::Vec has safe code that checks preconditions before entering an unsafe block; the unsafe code relies critically on that safe code being correct.
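A minimal sketch of that pattern (not the real std implementation, just its shape): ordinary safe control flow establishes the precondition, and the unsafe block is only sound because that safe check is correct.

pub fn first_copied<T: Copy>(items: &[T]) -> Option<T> {
    // Safe code establishing the precondition the unsafe block needs.
    if items.is_empty() {
        return None;
    }
    // SAFETY: the check above guarantees the slice has at least one element.
    Some(unsafe { *items.get_unchecked(0) })
}

If someone "tidied up" that safe if, say by inverting the condition, the unsafe block would instantly become unsound even though the diff touches no unsafe code.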

What the ecosystem is generally opposed to is simply documenting the constraints placed on safe code and requiring safe code to comply with those constraints or face unsoundness; but as long as the safe code in question is inside the same module boundary as the unsafe code, we generally accept unsafe code depending on the correctness of safe code.
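For a sketch of the module-boundary version (a toy type, assuming nothing beyond std): a private field carries the invariant, every safe method in the module maintains it, and the unsafe block relies on it. Code outside the module can't break the invariant because the field isn't reachable.

pub struct TinyBuf {
    data: [u8; 8],
    len: usize, // invariant: len <= 8, maintained by every method in this module
}

impl TinyBuf {
    pub fn new() -> Self {
        TinyBuf { data: [0; 8], len: 0 }
    }

    pub fn push(&mut self, byte: u8) -> bool {
        if self.len == 8 {
            return false; // safe code upholding the invariant
        }
        self.data[self.len] = byte;
        self.len += 1;
        true
    }

    pub fn last(&self) -> Option<u8> {
        if self.len == 0 {
            return None;
        }
        // SAFETY: `len <= 8` is maintained by the safe code above, and we just
        // checked `len > 0`, so `len - 1` is a valid index into `data`.
        Some(unsafe { *self.data.get_unchecked(self.len - 1) })
    }
}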

7 Likes

Note that, to some extent, this applies transitively to external safe code that is called from within the module boundary. I, for example, would have no issue with some unsafe code relying on bumpalo to manage its allocations properly or im::HashMap to properly deduplicate trusted keys.
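A sketch of what that reliance can look like, assuming bumpalo is a dependency (the raw-pointer detour is contrived, purely to make the reliance visible):

use bumpalo::Bump;

fn main() {
    let bump = Bump::new();
    let slot: &mut u64 = bump.alloc(42); // safe bumpalo API
    let raw: *mut u64 = slot;            // stash a raw pointer into the arena

    // ... other safe work that may allocate more from `bump` ...
    let _other = bump.alloc([0u8; 64]);

    // SAFETY: this is only sound because bumpalo, a purely safe dependency
    // from our point of view, really does keep earlier allocations alive and
    // unmoved until the `Bump` itself is dropped or reset.
    let read_back = unsafe { *raw };
    assert_eq!(read_back, 42);
}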

1 Like

Here is a tricky one: suppose the soundness of my code relies on the correctness of a dependency which doesn't use any unsafe code.

Now if the dependency introduces a bug in safe code, my code will suddenly be unsound. How to square that with the idea that bugs in safe code aren't supposed to lead to undefined behavior?

1 Like

There seems to be some confusion between clients and dependencies here.

Unsafe code can have dependencies which it relies on to implement a sound crate; there is no problem there (provided the dependencies are carefully tested and checked to see they work as claimed, or possibly just trusted based on reputation or whatever). At the most basic level, almost all unsafe code will rely on arithmetic properties of the pre-defined system types like usize. So unsafe code necessarily relies on its own dependencies, which can include system datatypes, the standard library, modules within the crate and potentially (but perhaps not commonly) other crates.
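For instance, even a tiny helper's unsafe block quietly relies on safe integer arithmetic and on len() behaving as specified (a made-up helper, just to illustrate):

fn read_at(bytes: &[u8], start: usize, offset: usize) -> Option<u8> {
    // Plain safe arithmetic and comparisons that the unsafe code relies on.
    let index = start.checked_add(offset)?;
    if index >= bytes.len() {
        return None;
    }
    // SAFETY: `index < bytes.len()`, established entirely by safe code above.
    Some(unsafe { *bytes.as_ptr().add(index) })
}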

However, a crate should not rely (for soundness) on what its consumers (clients) do, at least not for an interface with safe functions (for unsafe interface functions, the clients must follow the safety requirements).

1 Like

I would square that by saying that the bug is in the unsafe code assuming that a dependency is bug-free. As the author of unsafe code, it's on you to make sure that all the safe code you depend upon has no bugs that affect your unsafe code's soundness.

This then explains why unsafe code cannot assume that arbitrary safe code is correct - you cannot ensure that arbitrary safe code is correct, therefore you cannot meet your soundness requirement.

2 Likes

The problem with this, however, is that even if you verify the dependency is correct, there is no way to stop a bug in a minor version update, except maybe to use a pinned dependency.

1 Like

Or you accept "fault" (in as far as that's a reasonable concept in the ecosystem) for depending on a crate whose maintainer updated and broke your unsafe code.

Once you have a dependency graph, you're in some form of relationship with the maintainers of the code you depend upon, whether you like it or not. And if you're writing unsafe code, it's on you to make sure that the relationship you have with maintainers of crates you depend upon for soundness gives you the ability to ensure that they don't break the correctness guarantees you depend upon (e.g. by contributing test cases and helping with their release process).

2 Likes

Sure there is: when you run cargo update for a binary crate, you carefully check all the changes and run tests before publishing a new version of the binary crate.

Plus, consumers of your binary crate need to use --locked when installing, or they risk getting unchecked updates; that is really on them, and their choice.

[And OK, I know hardly anyone ever checks changes in dependencies; people mostly trust that updates to crates they use are correct, other than running tests.]

That shifts the responsibility away from the author of the unsafe code block (in a library crate) to the user of that code (in a binary crate) -- isn't the author supposed to be responsible?

2 Likes

I agree and unsafe abstractions should probably always be leaf nodes in the dependency tree, i.e., should not rely (for soundness) on dependencies other than the std lib. That may seem overly strict, but I don't see any other way that you can advertise soundness to your consumers.

1 Like

I think that's a reasonable rule of thumb, but not an "always"; if you have a suitable relationship with the maintainers of the things you depend upon, it's fine for an unsafe abstraction to depend on correctness of a safe dependency for soundness.

For example, if you maintain both crates, it's fine - you can be careful not to break correctness in ways that affect soundness of your other crate.

1 Like

I understand your point and it sounds reasonable. But this reliance on other crates would also make reviewing and verifying safety much more difficult, no matter how much the author trusts the dependencies. Unsafe is just so difficult to get right that we need restrictions to make that verification possible.

It can make it easier as well as harder to determine that a particular unsafe abstraction is sound.

If everyone implements their own bump allocator, and you use two unsafe abstractions that depend on a correct bump allocator, then you have 4 things to review - the unsafe abstractions you want to use, plus two implementations of a bump allocator. If those two abstractions both depend on bumpalo, then you only have 3 things to review - you can reuse your review of bumpalo.

And this is where tools like cargo-crev (for general reviewing) and cargo-vet (for reviews within your project scope) are useful; if you can see via cargo-crev that people have concerns about the correctness of a crate that an unsafe abstraction depends upon, you know to be cautious. Equally, if you're using cargo-vet to record local audits, then you'll be able to record "bumpalo 3.16.0 is audited for correctness" and avoid re-auditing it later.

1 Like

I think in practical terms it is somewhat unusual for unsafe code to depend for soundness on another crate.

I can just about imagine it in my own code though, if I were using an external parsing crate and also making "maximal" use of unsafe, which really equates to turning all asserts into unreachable_unchecked using a macro and a feature gate (sketched below).

This is "go as fast as possible I don't care if it is risky" mode, suitable for gaming or development non-production situations where speed is more important than safety, or I suppose where the base crate is highly trusted.

I guess that also highlights the benefit of an unsafe block. It narrows down where testing and auditing is needed.

If I have coverage of the unsafe code, and debug assertions of the specific invariants the unsafe code requires to be sound, I can test with tools like Miri and fuzzing to be reasonably confident that the invariants hold before and after updating a dependency (and after my own internal changes).
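Concretely, that might look like this module-internal helper (hypothetical, just to show the shape): the invariant the unsafe block needs is written down as a debug assertion, so Miri runs and fuzzing with debug assertions enabled catch a violation as a panic instead of silent UB.

/// Module-internal helper; the surrounding code guarantees `index < data.len()`.
fn byte_at(data: &[u8], index: usize) -> u8 {
    // The exact invariant the unsafe block relies on, checked in debug, Miri
    // and fuzz builds, and compiled out in release.
    debug_assert!(index < data.len(), "invariant violated: index out of bounds");
    // SAFETY: the module upholds `index < data.len()` (see debug_assert above).
    unsafe { *data.get_unchecked(index) }
}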

I think having that kind of testing is a good idea, especially if you have written your own unsafe code.

If I have coverage of the unsafe code

Having said that... If the safe code that triggers UB isn't covered by tests, then you might never trigger UB or any of your debug assertions and have unsound code despite all your testing. I guess fuzzing your unsafe code helps here.
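For reference, a cargo-fuzz target that exercises such code is only a few lines (fuzz_target! is the real libfuzzer-sys API; my_crate::parse is a hypothetical function standing in for whatever wraps your unsafe code):

#![no_main]
use libfuzzer_sys::fuzz_target;

fuzz_target!(|data: &[u8]| {
    // Debug assertions are typically enabled in cargo-fuzz builds, so a
    // violated invariant surfaces as a panic here rather than silent UB.
    let _ = my_crate::parse(data);
});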

1 Like

Thanks, I wasn't aware of that! It's a very worthwhile effort and is probably the best solution to this issue.

1 Like

Um what? That doesn't demonstrate anything contrary to my claims. If anything, it proves my point.

That unsafe code is relying on a value that is incorrect, and that incorrect value comes from safe code. You can cause UB merely by setting that length variable to a non-zero value. Since setting a variable to an arbitrary value is a perfectly reasonable thing to do, no unsafe code should assume that a Vec has the same length as a completely independent integer. The assumption made by the unsafe block is unjustified, and it's plainly wrong.
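For contrast, here's a sketch of the shape a sound use of set_len takes (in real code you'd just write vec![0; length]; this only spells out how the safe code around the block establishes the precondition instead of assuming it):

fn zeroed_vec(length: usize) -> Vec<u8> {
    let mut v: Vec<u8> = Vec::with_capacity(length); // capacity >= length
    for i in 0..length {
        // SAFETY: `i < length <= capacity`, so the write stays inside the allocation.
        unsafe { v.as_mut_ptr().add(i).write(0) };
    }
    // SAFETY: `length <= capacity` and elements 0..length were just initialized,
    // so the invariant `set_len` requires actually holds here.
    unsafe { v.set_len(length) };
    v
}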

1 Like