Conveniently giving up on safety

I'm wrapping up a C++ library, mostly graphics. Thousands of functions, and of course, they're all "unsafe."

I could easily bracket all that with unsafe { ... }, but that seems to imply that I have guaranteed its safety, and that would be a little beyond the reality. So right now it's all nakedly unsafe, other than some stuff inside macros where it would be real awkward for the user to annotate the items that would be unsafe.

As it's a graphics library that works through virtual function callbacks, this means often the entire program will be "unsafe." I don't think anyone will be worried about that, but in my casual reading on this, I see the term "Unsafe Rust" here and there, and I'm thinking - that's for me! Is there a way to simply assume the entire program is unsafe, which would simplify life for the programs using this library?

"Unsafe Rust" just means Rust code inside an unsafe block or an unsafe fn, sorry. There is no separate mode you can switch the whole program into.

It's pretty normal for a binding library in particular to just declare all your methods unsafe, but ideally you can at least list the safety requirements, and correctly declare (or not) marker traits like Send and Sync on the types.
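
To make the Send/Sync part concrete, here is a minimal sketch. The `Canvas` type and the threading guarantee it relies on are made up for illustration; the real decision has to come from what the C++ library actually documents.

```rust
use std::ffi::c_void;

/// Hypothetical handle to a C++ object. Because it holds a raw pointer,
/// the compiler treats it as neither Send nor Sync by default, which is
/// the conservative starting point.
pub struct Canvas {
    raw: *mut c_void,
}

impl Canvas {
    /// # Safety
    /// `raw` must be a pointer the C++ library handed out for a canvas.
    pub unsafe fn from_raw(raw: *mut c_void) -> Self {
        Canvas { raw }
    }

    pub fn as_raw(&self) -> *mut c_void {
        self.raw
    }
}

// Assumption for this sketch only: the C++ side promises a canvas may be
// moved to another thread, but not used from two threads at once.
// SAFETY: relies entirely on that assumed guarantee.
unsafe impl Send for Canvas {}
// Deliberately no `unsafe impl Sync for Canvas {}`.

/// Compiles only when `T` is Send; handy as a compile-time check.
pub fn assert_send<T: Send>() {}
```

If the library made no such promise, you would simply leave out the `unsafe impl` and the compiler would stop users from sending the handle across threads.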

Correctly wrapping a binding library for safety is often a library in itself, as there's often performance vs ergonomic tradeoffs being made.

Thanks, that's about what I expected. When you say ".. just declare all your methods unsafe" ... of course, I'm obliged to do that (pub unsafe fn), because each one calls an external function ...?

Thread safety per se is one thing I'm not very concerned about - there will be a whole lot of concurrency, but that's because the library itself uses POSIX type threads extensively. So API and standard practice is built around that and isn't casually littered with concurrency pitfalls. Could be in for a ghastly surprise, but the first program using parts of the library ran concurrent threads with no sign of trouble.

Philosophically I hope there is never a way to make unsafe convenient in Rust.

I have to ask: If you are happy for your entire program being unsafe and a large part of it is C++, why use Rust at all?

You actually can declare imports as safe now IIRC, but binding libraries do tend to do a tiny bit of wrapping to match syntax, e.g. turning a pointer/length pair into a slice. It's fine to also leave those wrappers as unsafe if they have additional safety requirements, is what I mean.
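
A rough sketch of that kind of thin wrapper. The function `gfx_sum` is hypothetical and is defined in Rust here only so the example runs on its own; in a real binding it would be an imported symbol from the C++ library.

```rust
use std::slice;

/// Stand-in for a function exported by the C++ library (hypothetical name).
/// It takes a pointer/length pair, as C-style APIs usually do.
unsafe extern "C" fn gfx_sum(data: *const f32, len: usize) -> f32 {
    // SAFETY: the caller promises `data` points to `len` readable floats.
    let values = unsafe { slice::from_raw_parts(data, len) };
    values.iter().sum()
}

/// Wrapper that matches Rust syntax: the caller passes a slice.
/// A slice always carries a valid pointer and length, so in this sketch
/// the wrapper has no remaining requirements and can be a safe fn.
pub fn sum(values: &[f32]) -> f32 {
    // SAFETY: `values.as_ptr()` is valid for `values.len()` elements.
    unsafe { gfx_sum(values.as_ptr(), values.len()) }
}
```

If the underlying call had further requirements the slice cannot express (say, some context must be current on the calling thread), the wrapper would stay `pub unsafe fn` and list them under a `# Safety` heading.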

There isn't really any third choice. Graphics will go through C++ virtual function callbacks. You can use Unsafe Rust, or you can use unsafe C++. C++ would certainly be easier in some respects, but as I've been using OCaml for a while ... Rust is appealingly similar and doesn't have OCaml's runtime locking out concurrent access.

What would you do - use C++ because Rust is for safe code only?

Maybe I agree. The coding I do of late is to automatically generate the interface wrappers, and I've started that with a couple of different languages - and the one I gave up on the fastest was Rust, by a large margin. It's so awful for text processing. I'm sort of optimistic that I can manage to get some mileage out of Rust's features and it will be better in the end than if I'd been writing C++, but maybe you're right, neither I nor anyone else will end up having any use for the language here.

If the library is responsible for thread safety, and doesn't have a lot of known bugs in this regard, then IMO the Rust API doesn't need to be marked unsafe for thread safety reasons. It would be nice to document this however, so the user is aware of the dependency on the C++ library for thread safety.

Are there other things about the C++ interface that are unsafe? For example, is it possible to get use-after-free or access past the end of a buffer? If so, the raw Rust wrappers should be marked unsafe. Alternatively, the wrappers could do the bounds checking and whatever else is needed to guarantee memory safety, and then the unsafe marker is not needed.
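
Something like the following, as a sketch. `gfx_pixel_at` and `Image` are invented names, and the "C++" accessor is written in Rust just so this compiles standalone.

```rust
/// Stand-in for a C++ accessor that does no bounds checking (hypothetical).
unsafe extern "C" fn gfx_pixel_at(buf: *const u8, index: usize) -> u8 {
    // SAFETY: the caller promises `index` is inside the buffer.
    unsafe { *buf.add(index) }
}

pub struct Image {
    pixels: Vec<u8>,
}

impl Image {
    pub fn new(pixels: Vec<u8>) -> Self {
        Image { pixels }
    }

    /// Raw wrapper: stays unsafe because nothing checks the index.
    ///
    /// # Safety
    /// `index` must be less than the number of pixels.
    pub unsafe fn pixel_unchecked(&self, index: usize) -> u8 {
        // SAFETY: forwarded from this function's own contract.
        unsafe { gfx_pixel_at(self.pixels.as_ptr(), index) }
    }

    /// Checked wrapper: does the bounds check itself, so it can be safe.
    pub fn pixel(&self, index: usize) -> Option<u8> {
        if index < self.pixels.len() {
            // SAFETY: the index was checked on the line above.
            Some(unsafe { self.pixel_unchecked(index) })
        } else {
            None
        }
    }
}
```

Offering both lets callers who care about the extra comparison opt into the unsafe version knowingly.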

Well ... when I said thousands of functions, I wasn't exaggerating. If "use-after-free" is just what it sounds like, for example I happen to know of some functions that "take ownership" of a data object passed to them by pointer, and the caller had better not free that object. That's relatively rare, but pointers are everywhere.

When the caller takes ownership, I think it's important to wrap the pointer in a Rust type that will free the pointer when it is dropped. Not only is this needed for safety; without it you will need to carefully document, for every entry point, whether the caller is responsible for freeing the pointer.

If the pointer represents an owned object, a "safe" wrapper library would wrap it in a Rust type that calls the correct free function on Drop. When you pass it to the Rust wrapper for an external function that takes ownership (doing away with the obligation of the caller to free the object and indeed prohibiting it), you pass it as an owned argument to the wrapper function. The wrapper function destructures the argument without running its destructor (ensuring the object will NOT be freed later) and passes the pointer to the C++ function that assumes the responsibility to free the object later. This makes it impossible to violate the API invariant in "safe" Rust.
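
Here is a minimal sketch of that pattern. All the names (`brush_new`, `brush_free`, `canvas_set_brush`, `Brush`) are hypothetical, and the C++ side is faked in Rust with `Box` purely so the example can run; in the real thing the object would be allocated and freed by C++. The `LIVE` counter exists only to show that nothing is leaked or freed twice.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of brushes currently allocated by the fake "C++" side.
static LIVE: AtomicUsize = AtomicUsize::new(0);

#[repr(C)]
pub struct RawBrush {
    color: u32,
}

// Stand-ins for functions the C++ library would export (hypothetical).
unsafe extern "C" fn brush_new(color: u32) -> *mut RawBrush {
    LIVE.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(RawBrush { color }))
}

unsafe extern "C" fn brush_free(brush: *mut RawBrush) {
    LIVE.fetch_sub(1, Ordering::SeqCst);
    // SAFETY: `brush` came from `brush_new` and is freed exactly once.
    drop(unsafe { Box::from_raw(brush) });
}

/// Takes ownership of the brush; the library frees it (here, right away).
unsafe extern "C" fn canvas_set_brush(brush: *mut RawBrush) -> u32 {
    // SAFETY: the caller hands over a valid, uniquely owned brush.
    let color = unsafe { (*brush).color };
    unsafe { brush_free(brush) };
    color
}

/// Owning Rust wrapper: frees the object on drop.
pub struct Brush {
    raw: *mut RawBrush,
}

impl Brush {
    pub fn new(color: u32) -> Self {
        // SAFETY: `brush_new` has no preconditions in this sketch.
        Brush { raw: unsafe { brush_new(color) } }
    }

    /// Gives up ownership without running Drop.
    fn into_raw(self) -> *mut RawBrush {
        let raw = self.raw;
        std::mem::forget(self);
        raw
    }
}

impl Drop for Brush {
    fn drop(&mut self) {
        // SAFETY: `raw` is still owned by this wrapper.
        unsafe { brush_free(self.raw) }
    }
}

/// Safe wrapper for the ownership-taking call: `brush` is moved in, so the
/// caller cannot touch or free it afterwards.
pub fn set_brush(brush: Brush) -> u32 {
    // SAFETY: `into_raw` yields a valid pointer we no longer free ourselves.
    unsafe { canvas_set_brush(brush.into_raw()) }
}

pub fn live_brushes() -> usize {
    LIVE.load(Ordering::SeqCst)
}
```

After `set_brush(b)`, any further use of `b` is a compile error, which is exactly the "caller had better not free that object" rule enforced by the type system.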

Rust is not for safe code only. Unsafe code is necessary (often under the hood) for almost any task you might want to accomplish. However, one of the main benefits of Rust over C++ is the ability to limit the extent of unsafe code by providing safe wrappers that guarantee -- either statically through the type-system or using runtime checks -- that the conditions assumed by the API are actually met, allowing the compiler to help prevent bugs.

You may be interested in the Vulkan API crates for a comparison; they have an unsafe, fully general but ergonomic binding crate, ash, and a rich, safe but (slightly) less general and performant wrapper crate, vulkano (among others).

Cleanly wrapping a C API and safely wrapping an unsafe API are different problems with different solutions.

In the present case, all such objects will have been allocated in C++. In principle, maybe I could allocate in Rust and pass the memory to the C++ constructor, but that doesn't seem like a very good idea at all.

if the majority of the code base has to be on the C++ side, maybe an alternative solution is to integrate rust code as components or subsystems into the C++ application.

as far as I know, many popular (mainly C/C++ focused) build systems such as cmake, meson, bazel, all support the rust language pretty well. it doesn't make sense to replace cargo by them for "pure"-rust (or "mostly"-rust) projects, but they do work well for mixed language projects.

Well, sure, I'm already not using cargo.

It isn't so much that the majority of the code is C++, it's that this is the supervisory side if you like, because it's the user interface. So all the real Rust parts are called through unsafe wrappers, and the "unsafe block" concept seems kind of inside out and not very helpful.

To be honest, I'm doing this partly because "they said it couldn't be done." The sane way to write applications in this environment, is in C++, for sure. But I don't care much for C++, and if you want full OOP in Rust with user defined callback virtual functions ... OK, we can do that. I'm retired, I don't care if it turns out no one likes the looks of it.

The following answer is firmly in the realm of "opinion" and will invite substantial bikeshedding. It's not really an answer but more of a rant, really. Nevertheless it's something that's been bugging me for a long time and I kind of want to get it off my chest. Sorry.

I think it's really unfortunate that Rust has mandatory unsafe blocks.

It's incredible that the language designers constrained all possible UB to five operations: dereferencing raw pointers, accessing raw unions, mutating static (global) variables, touching FFI symbols, and of course calling unsafe-qualified functions. If you write code that does not use any of those capabilities (Safe Rust), you've provably eliminated an entire class of errors in your code.

But if you do have to use those capabilities, you essentially have to say please to the compiler before it'll let you do it, every time. Literally! unsafe blocks do not change the semantics of operations in them; unsafe blocks do not, as some people think, "delete run-time bounds checking" or "turn off the borrow checker for this chunk of code".

You could wrap the body of every function in every file of your project in unsafe and the compiler would emit identical machine code, because unsafe blocks add no meaning to your code. They're just a magic word that you have to say before doing stuff that is otherwise built into the language. And that's kind of sad.
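
A small sketch of that point, with made-up function names. The ordinary operations inside the `unsafe` blocks behave exactly as they would outside; only the explicitly unchecked API at the end skips the check, and that is a different function, not a different mode.

```rust
/// `get` inside an unsafe block is the same checked `get` as anywhere else.
#[allow(unused_unsafe)]
pub fn get_inside_unsafe(v: &[i32], i: usize) -> Option<i32> {
    unsafe { v.get(i).copied() }
}

/// Plain indexing inside an unsafe block still panics when out of range.
/// Returns true if the out-of-bounds access panicked.
#[allow(unused_unsafe)]
pub fn index_panics_inside_unsafe() -> bool {
    let v = vec![1, 2, 3];
    let i = v.len() + 7; // deliberately out of range
    std::panic::catch_unwind(move || unsafe { v[i] }).is_err()
}

/// Skipping the check takes a different operation (`get_unchecked`),
/// which is an unsafe fn and therefore needs the block.
pub fn first_unchecked(v: &[i32]) -> i32 {
    assert!(!v.is_empty());
    // SAFETY: the assert above guarantees index 0 exists.
    unsafe { *v.get_unchecked(0) }
}
```

The borrow checker is likewise unaffected; code that fails to borrow-check outside an `unsafe` block fails inside one too.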

The people who use Rust for systems-level tasks, where those low-level features make up a significant percentage of LoC, have felt this pain. Some have complained. The pain is real, and the language's ergonomics do suffer.

And for what? As I said in the beginning, the Unsafe subset of Rust is very clearly delimited, which makes it very easy for programmer tools to prominently highlight Unsafe operations in your code, as every IDE I've tried handsomely does. They don't need the unsafe block for context: it is possible to know statically, from the types of the variables involved or the signatures of the definitions being called, which ops are unsafe.

Even with plain-text no-highlight editors, it is already a widespread practice to annotate relevant sites with // SAFETY: <contract description> comments. This is a thousand times better: those comments include an actual human-readable reasoning and they do not bloody shift your code an indent level to the right and add 2-3 lines of meaningless vertical space.

I wish it were a matter of code style, configurable according to every shop's coding guidelines. We're sadly getting even further in the opposite direction: unsafe_op_in_unsafe_fn was recently introduced, with a plan to make it an error-by-default in future editions. Even the proponents of the feature acknowledge that some people just throw their hands up and write fn foobar() { unsafe { ... }} to avoid having to deal with this feature of the language.
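
For anyone who hasn't run into the lint, a minimal sketch of what it asks for. `read_u32` is an invented example; the lint is switched on per function here only to keep the snippet self-contained.

```rust
/// # Safety
/// `ptr` must be non-null, aligned, and point to an initialized `u32`.
#[deny(unsafe_op_in_unsafe_fn)]
pub unsafe fn read_u32(ptr: *const u32) -> u32 {
    // With the lint enabled, being inside an `unsafe fn` is not enough:
    // the dereference needs its own block.
    // SAFETY: guaranteed by this function's contract.
    unsafe { *ptr }
}

/// A safe caller that upholds the contract itself.
pub fn read_local(value: u32) -> u32 {
    let local = value;
    // SAFETY: `&local` is a valid, aligned, initialized u32.
    unsafe { read_u32(&local) }
}
```

The `unsafe` on the signature is a promise the caller must keep, while the inner block marks where the function itself relies on it; whether that split is worth the extra indentation is the disagreement in this post.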

As a counterpart, I'd love to see an #![allow(implicit_unsafe_use)], instead.

And as a closing note, a paragraph from the Wikipedia article on INTERCAL which I'm always reminded of when writing unsafe blocks...

INTERCAL has many other features designed to make it even more aesthetically unpleasing to the programmer: it uses statements such as "READ OUT", "IGNORE", "FORGET", and modifiers such as "PLEASE". This last keyword provides two reasons for the program's rejection by the compiler: if "PLEASE" does not appear often enough, the program is considered insufficiently polite, and the error message says this; if it appears too often, the program could be rejected as excessively polite. Although this feature existed in the original INTERCAL compiler, it was undocumented.

  1. One does not have to use an IDE. Even in system-level programming. Especially in system-level programming, where, even in a project like the Linux kernel, there might well be patches sent via email.
  2. Could you share an example? I didn't have a very wide experience with Rust IDEs (I essentially only used VS Code + Rust-analyzer), but I've never seen them highlighting these specific operations.

One does not have to use an IDE. Even in system-level programming. Especially in system-level programming, where, even in a project like the Linux kernel, there might well be patches sent via email.

Agreed. IDEs aren't and shouldn't be required (or even assumed) for programming, but in practice, they often are. They're just nice, most of the time. It's fine to not use one, though.

Even if you're not using an environment that visually highlights code, alternatives such as the // SAFETY: <...> comment convention are still better than unsafe-blocks, as per the rationale in my OP. That's just my view on it though.

As an aside, whether patches are managed via e-mail or a forge or whatever other method is orthogonal to using an IDE. At $dayjob we used to manage patches via e-mail but I've used an IDE since day one.

Could you share an example? I didn't have a very wide experience with Rust IDEs (I essentially only used VS Code + Rust-analyzer), but I've never seen them highlighting these specific operations.

RustRover (from JetBrains). Their Rust plugin for CLion (which is awesome and I used for years) did the same. Image for reference.

This proposal would solve this particular issue.

Is this really true? To check I looked at the Embassy OS for the STM32 MCU, about as "systems level" as one can get:

$ cargo loc
...
Breakdown of the total lines by language:
Rust: 7836593
Alex: 106519
Markdown: 17902
TOML: 9654
CSS: 737
F*: 445
Shell: 418
Autoconf: 279
ReStructuredText: 66
YAML: 44
BASH: 42
Plain Text: 33
Batch: 23
C: 10
SVG: 8

Total lines: 7972773
(7876529 code, 35896 comments, 60348 blank lines)
$ grep -ir unsafe * | wc -l
     529

That seems like a very small proportion of "unsafe" code. Is it really so much pain?

Part of the power of the unsafe block/keyword is that it's compiler-enforced syntax. This alone makes unsafe code much more obvious and easy to search for without semantic understanding. Relying on having to remember to annotate the code with optional comments, or on a special tool, sacrifices this clarity. You end up having to look up function definitions instead. The benefits are similar to those of explicit references, in the sense that you can immediately see more of what's going on without a lot of context.
