Unsafe Rust: Intro and Open Questions

Hey all! I've written up some... man I don't even know what to call it at this point... stuff! I've written some stuff on Unsafe Rust. Where we're at, where we're going.

http://cglab.ca/~abeinges/blah/rust-unsafe-intro/

Very cool. Thanks for writing this up.

Here are a few assorted opinions and observations on unsafe in Rust. Hope they're helpful.

  • We should pay attention to the converse message of unsafe. What is the implication of code not marked as unsafe? If we vaguely define unsafe to mean **mumble mumble** contracts **mumble mumble**, then we might invite developer complacency in safe code. It's like if I ate 10 lbs of Skittles, got violently ill, and then sued the Food and Drug Administration for approving them as safe for consumption.

  • Bad guys hunt for undefined behavior and try to exploit it. Reducing the amount of code that can create undefined behavior is great for quality assurance, but it's absolutely critical for security. I know this is a little more of an adversarial model than most developers care to consider, but it's a really big deal and it's only going to get bigger. As an example, I wonder how many Heartbleed-like vulnerabilities are found in here.

  • As you say, "It should be impossible to invoke Undefined Behaviour using only safe code." That means that the compiler enforces that safety. When you step into the world of unsafe code, it's the programmer who ensures it's safe. In that respect there's no difference between C and unsafe Rust, except that unsafe Rust is only a few characters of source code away from safe Rust. That's a sobering realization.

  • Haskell marks all its "unsafe" code inside the IO monad, because IO is where functional languages need to serialize computation. More generally, it's saying that unsafe code is the interface on the boundaries of the runtime and therefore must be treated carefully. Maybe there's some insight there to absorb?

  • Also, Haskell marks entire packages as "safe" and "unsafe". I'm only aware of this marginally, but that doesn't keep me from extrapolating on the idea! Maybe we could mark crates as safe or unsafe depending on whether there's unsafe code in there. Linkage optimization might be better between two packages that are marked safe than between two packages that are not. That would give a disincentive for using unsafe code. Just spit-balling here.

Rust is important to me because I work in security and information assurance. I see strong type systems and safe (i.e. no undefined behavior) languages as one potential solution to some difficult problems in security. The type system can be used, for example, to "tag" unsafe input into a system, much like Option<T> does for potential errors, forcing developers to at least explicitly ignore the problem. I see Rust as The Future (tm) and I want to see it work out. I'd hate for it to fall into some (unfortunately well trod) tar pits.
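
To make the "tagging" idea a bit more concrete, here is a minimal sketch. The names Tainted, validate and parse_username are hypothetical and exist only for illustration; nothing like them is in the standard library. The point is only that the raw value cannot be used until the caller has passed it through some check.

```rust
/// Hypothetical wrapper marking data that came from outside the program.
/// In a real crate this type would live in its own module, so that the
/// inner field is private and `validate` is the only way to get at it.
pub struct Tainted<T>(T);

impl<T> Tainted<T> {
    pub fn new(raw: T) -> Self {
        Tainted(raw)
    }

    /// Consume the tainted value; the caller has to supply a check.
    pub fn validate<U, E>(self, check: impl FnOnce(T) -> Result<U, E>) -> Result<U, E> {
        check(self.0)
    }
}

/// Hypothetical validator: accept only short, ASCII-alphanumeric user names.
fn parse_username(raw: String) -> Result<String, String> {
    if !raw.is_empty() && raw.len() <= 16 && raw.chars().all(|c| c.is_ascii_alphanumeric()) {
        Ok(raw)
    } else {
        Err(format!("rejected input: {:?}", raw))
    }
}

fn main() {
    let good = Tainted::new(String::from("alice42"));
    let bad = Tainted::new(String::from("alice; DROP TABLE users"));
    println!("{:?}", good.validate(parse_username));
    println!("{:?}", bad.validate(parse_username));
}
```

As with Option<T>, a developer can still write a check that accepts everything, but they have to do so explicitly.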

Seconding the thanks for the write-up.

It's probably not worth discussing, but a lot of us Rust newbies were lured from the land of C/C++. Among many other things, the glossy brochure promised "safe" programming directly at bare metal speeds. I really had the impression that there were only a few reasons to use unsafe anything:

  1. interfacing to a C library
  2. unsafe_get() for optimisation
  3. fundamentals in the std/core library

In other words, don't use unsafe stuff unless there is no other option, and you're still a dirty programmer if you do.

Apparently the truth is a lot more casual - feel free to use unsafe when it's helpful. In retrospect, that's just fine, but it wasn't the impression I got from the reddit zealots and marketing people...

reddit zealot here speaking (just kidding): No matter what you do, you should write safe code first. Only once you a) run into a problem that cannot be solved with safe code (because you need references that the borrow checker would not allow) or b) have profiled your code and determined a hotspot that needs unsafe code to optimize (e.g. to omit bounds checks), reach into your bag of unsafes and grab one. Use sparingly, like strong peppers.

You may still call yourself a dirty programmer if that floats your boat, but it has nothing to do with your code. :smile:

I'm well aware of the adage about premature optimization... People have been trotting that cliche out for years to make themselves sound wise.

What I was trying to get at was how extreme using "unsafe" should be considered. Rust is a new language, so I don't have a rule of thumb just yet. On the one hand, it could be as generally unnecessary as writing asm blocks in C++ --- you really really don't need assembly for 99.999% of applications, but of course there is some in the standard system libraries. On the other hand, "unsafe" could be as common as implementing a C extension module for Python - it's extra work, and you have to be careful, but it isn't really frowned upon.

Anyways.

I used to do Comp. Sci. tutoring, where if you used a global or goto in your code (at least in the first two years), you'd lose marks, no ifs, ands, or buts. The simple reason was that 99% of the time, people used them in ways that dramatically and negatively affected their code. They weren't necessarily stupid, they just severely over-estimated their understanding of the drawbacks.

No matter how carefully we explained the pros and cons, no matter how many times we put warnings up in big, red, bold letters, people would just immediately use them the moment doing so would marginally simplify their immediate problem, consequences be damned.

The fact is, people will, overwhelmingly, gladly strip down any nuanced, careful explanation of a potentially dangerous tool to a simple binary answer to the question "will it cause my head to fall off?" If it is not instantly and always life-threatening, then it must be good!

As such, I strongly feel that the correct default thing to do is to massively inflate how dangerous something is. That way, they either avoid it, or approach further investigation from the correct direction (i.e. I shouldn't do this, but maybe there are cases where it's OK...).

I also used to teach CS to first-semester students (as tutor and research/teaching associate). And yes, the guidance those people needed was certainly more strict than the average reddit zealot would imagine (does that make me an above or below-average reddit zealot?).

That said, Rust is not exactly what I would teach to new students, because it's quite far from the mainstream (and since about half of those folks quit after half a year, it's better to fill their heads with stuff they can immediately use in their future occupations, like Python or, shudder, Java).

Still, I think it is sufficiently clear that unsafe should not be used needlessly – which is in line with other languages providing unsafe features (e.g. Java's sun.misc.Unsafe). The docs currently say:

I’ll repeat again: even though you can do arbitrary things in unsafe blocks and functions doesn’t mean you should. The compiler will act as though you’re upholding its invariants, so be careful!

Maybe there should be a more prominent section like (@steveklabnik: feel free to use for TRPL):

Why would I use unsafe?

There are two situations that call for unsafe code:

  • You need to override the safety rules (and guarantee that your code is sane even in the face of the relaxed safety rules). In those cases, you really cannot do what you need to because a safety check got in the way. In this case use unsafe and program as if the next person who uses the code is a violent psychopath who knows where you live.
  • You cannot achieve the required performance within safe code. There are times when you want to eke out that last bit of performance, but cannot within the cozy confines of safe Rust. In those cases, you should first write the slow & safe version, and measure its performance. The profile will guide you to the parts of code that need unsafe features. Also you can use the slow version as a baseline against which to test.

In both cases, there may be libraries that supply safe interfaces over unsafe features that you may be able to use without writing unsafe code yourself. So when you find yourself in one of the above situations, it may be useful to look around if someone else has already come up with a solution you can reuse.
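
A minimal sketch of the second situation, under the assumption that profiling really did point at the bounds checks. The function names sum_safe and sum_unchecked are made up for this example. The slow, safe version stays around as the baseline that the unsafe version is tested against. (For a loop this simple the iterator version most likely carries no bounds checks anyway, so treat this purely as an illustration of the shape.)

```rust
/// Baseline: plain safe code.
fn sum_safe(data: &[u64]) -> u64 {
    data.iter().sum()
}

/// Same result via `get_unchecked`, which skips the bounds check.
/// The function is safe to call because the loop itself guarantees
/// that every index is in range.
fn sum_unchecked(data: &[u64]) -> u64 {
    let mut total = 0u64;
    for i in 0..data.len() {
        // SAFETY: `i` is always in `0..data.len()`, so the index is in bounds.
        total += unsafe { *data.get_unchecked(i) };
    }
    total
}

fn main() {
    let data: Vec<u64> = (1..=1000).collect();
    // The safe version is the reference the unsafe one has to match.
    assert_eq!(sum_safe(&data), sum_unchecked(&data));
    println!("both versions agree: {}", sum_unchecked(&data));
}
```

Note that the unsafe block is wrapped in a function with a safe signature, so callers never have to write unsafe themselves.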

I'm learning about Rust so pardon my limited knowledge in this topic. Commenting on the two situations that may justify using unsafe code, I'm curious whether there really are cases where there is no way on earth to proceed without an unsafe block somewhere. Wouldn't this violate the basic goal of designing a language in which undefined behavior should not occur?

On the contrary: The possibility to design safe interfaces around unsafe code is what makes Rust possible at all. Note that unsafe code is not inherently unsafe – it just gives the compiler a break, with the programmer upholding the guarantees manually within the block.

E.g. it is impossible to implement Vec without unsafe code (unless perhaps you allow for abysmal performance), but safe code can use Vec as a safe abstraction over the unsafety of having some uninitialized memory (that is never read from anyhow).
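
Here is a toy illustration of that idea. It is emphatically not how Vec is actually implemented; SmallStack is a hypothetical fixed-capacity stack written for this post. It shows the same pattern on a small scale: some slots are uninitialized, and the len field is the invariant that keeps safe callers from ever reading them.

```rust
use std::mem::MaybeUninit;

/// Hypothetical fixed-capacity stack; a toy stand-in for `Vec`.
pub struct SmallStack<T, const N: usize> {
    slots: [MaybeUninit<T>; N], // slots at index >= len are uninitialized
    len: usize,
}

impl<T, const N: usize> SmallStack<T, N> {
    pub fn new() -> Self {
        SmallStack {
            // Creating uninitialized slots is safe; only reading them is not.
            slots: std::array::from_fn(|_| MaybeUninit::uninit()),
            len: 0,
        }
    }

    pub fn push(&mut self, value: T) -> Result<(), T> {
        if self.len == N {
            return Err(value); // full: hand the value back
        }
        self.slots[self.len].write(value);
        self.len += 1;
        Ok(())
    }

    pub fn pop(&mut self) -> Option<T> {
        if self.len == 0 {
            return None;
        }
        self.len -= 1;
        // SAFETY: every slot below the old `len` was initialized by `push`,
        // and `len` was decremented first, so this slot is not read twice.
        Some(unsafe { self.slots[self.len].assume_init_read() })
    }
}

impl<T, const N: usize> Drop for SmallStack<T, N> {
    fn drop(&mut self) {
        // Drop only the initialized prefix.
        while self.pop().is_some() {}
    }
}

fn main() {
    let mut stack: SmallStack<String, 2> = SmallStack::new();
    stack.push("a".to_string()).unwrap();
    stack.push("b".to_string()).unwrap();
    assert!(stack.push("c".to_string()).is_err());
    assert_eq!(stack.pop().as_deref(), Some("b"));
    println!("remaining element: {:?}", stack.pop());
}
```

Everything in main is safe code, and there is no sequence of push and pop calls that reaches an uninitialized slot.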

Writing safe code around an abstraction that has unsafe operations is probably only possible if that abstraction itself prevents all the unsafe operations from harming other parts of the program. I mean there has to be a layer of sandboxing or some similar technique to make safe use of underlying unsafe code.

Exactly. For example, as stated above, Vec uses uninitialized memory, which itself is unsafe, but by keeping track of len(), it can keep the unsafety in check. Likewise, Rc and Arc work around the static guarantees, and uphold them by moving the static checks to runtime.
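
A small sketch of what "moving the checks to runtime" looks like from the outside. The helper names owner_counts and second_writer_refused are made up for this example, and RefCell is my own addition (it is not mentioned above); it is included because it is the usual companion to Rc and does for borrowing what Rc does for ownership.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Rc counts its owners at run time instead of relying on a single
/// statically known owner. Returns the count while shared and after
/// one handle is dropped.
fn owner_counts() -> (usize, usize) {
    let first = Rc::new(String::from("shared"));
    let second = Rc::clone(&first);
    let while_shared = Rc::strong_count(&first);
    drop(second);
    (while_shared, Rc::strong_count(&first))
}

/// RefCell enforces "many readers or one writer" at run time.
/// Returns true if a writer is refused while a reader is alive.
fn second_writer_refused() -> bool {
    let cell = Rc::new(RefCell::new(vec![1, 2, 3]));
    let alias = Rc::clone(&cell);
    let _reader = cell.borrow();
    let refused = alias.try_borrow_mut().is_err();
    refused
}

fn main() {
    println!("owner counts: {:?}", owner_counts());
    println!("writer refused while reading: {}", second_writer_refused());
}
```

Both types use unsafe code internally, but the interface they expose cannot be misused from safe code: the worst a caller gets is an error or a panic, not undefined behavior.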
