How (not) to use unsafe code

An alternative could be to mandate a doc-comment above unsafe blocks, though that would require defining what putting a doc-comment on a code block means.

1 Like

Some canonical way of explaining unsafety internal to a crate wouldn't hurt, but that's somewhat orthogonal. It won't stop anyone, of course.

I'm personally most interested in why actix-web chose to use so much unsafety in the first place, seemingly recklessly in places, including disabling existing lints along the way.

It would be nice for crates.io (and maybe cargo integration) to somehow show an "unsafety barometer" for a crate, so someone considering using it has an inkling of what's in there. That unsafety may be documented (or not), but that's a secondary issue. If, for example, I see an HTTP lib rife with unsafe use, it's going to raise questions. If, on the other hand, I see something that sounds like it should have unsafe code, then it'll make more sense (no guarantees it's not reckless, but that's a separate issue).
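Something like this naive sketch could be a starting point for such a barometer (it just counts `unsafe` tokens, so it over-counts matches inside comments and strings; a real tool would parse the AST and weigh each use):

```rust
use std::{fs, io, path::Path};

/// Naive "unsafety barometer": counts `unsafe` tokens in the `.rs` files
/// under `dir`. Over-counts occurrences in comments and string literals.
fn count_unsafe(dir: &Path) -> io::Result<usize> {
    let mut count = 0;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            count += count_unsafe(&path)?;
        } else if path.extension().map_or(false, |ext| ext == "rs") {
            count += fs::read_to_string(&path)?.matches("unsafe").count();
        }
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    println!("unsafe occurrences: {}", count_unsafe(Path::new("src"))?);
    Ok(())
}
```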

5 Likes

Keep in mind that the unsafety of unsafe blocks is not equal across blocks, nor even proportional to their length.

I use unsafe mainly to call FFI functions. It would be painful if every FFI call needed even more ceremony.
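For context, even the simplest FFI call already carries this ceremony. A self-contained example using libc's `strlen`:

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

extern "C" {
    // `strlen` from the C standard library.
    fn strlen(s: *const c_char) -> usize;
}

fn c_string_len(s: &CStr) -> usize {
    // SAFETY: `CStr` guarantees a valid, NUL-terminated pointer.
    unsafe { strlen(s.as_ptr()) }
}

fn main() {
    let s = CStr::from_bytes_with_nul(b"hello\0").unwrap();
    assert_eq!(c_string_len(s), 5);
}
```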

Actix seems to have intentionally used unsafe to shut the compiler up, and even disabled clippy warnings for some code. So I think more compiler warnings alone wouldn't help; they'd just work around or silence more roadblocks.

7 Likes

There has been previous discussion about marking crates as "unsafe". I think this is a delicate issue that can't be solved with something simple like counting lines or occurrences. It would need at least a static analyzer to distinguish between good and reckless uses of unsafe.

It could backfire if people felt they had to write pure safe Rust code or have their crate marked as bad.

2 Likes

I don't think we can ever really know, but one of the things that makes UB so insidious is that even if the behavior of your program is undefined, it can still appear to work exactly as you expect. From my own experience, I get the feeling that for some people this is, in and of itself, enough, regardless of how many roadblocks we put up.
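A minimal illustration (hypothetical, not from actix): this read is undefined behavior, yet in practice it will usually appear to work.

```rust
fn main() {
    let v = vec![1u8, 2, 3];
    // UB: index 3 is out of bounds. In practice the allocation is often
    // larger than `len`, so this tends to return some garbage byte
    // "successfully" rather than crash -- the program appears to work.
    let byte = unsafe { *v.get_unchecked(3) };
    println!("read past the end: {}", byte);
}
```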

In some cases (maybe for actix-web, but I don't know), I suspect the only effective roadblock would be adoption itself. If abuse of unsafe is noticed immediately, then it's possible that adoption/attention might never have materialized because of it. While abuse of unsafe is unfortunate, what makes it truly bad is when the code actually starts getting heavily used by others.

With respect to surfacing the visibility of unsafe, I don't know what the right answer there is. It could be anything from automated tools that report metrics to just developing a stronger culture around auditing code manually, or perhaps something in-between.

10 Likes

We can probably get fairly close because, AFAIK, all the actix projects are essentially one author, and he frequents this forum as well :slight_smile:. I'm hesitant to cc him because I'd rather this be an educational conversation, not an unsafety witch hunt.

This instance is particularly interesting because actix is a non-trivial project and has had success in gaining users (with overwhelmingly positive reviews from its users, from what I've seen). It's done well in performance benchmarks. The person(s) behind it clearly know what they're doing (leaving the unsafe stuff aside). And yet there's rampant unsafety.

Why was it difficult (or impossible? or what?) to write the same functionality in safe Rust? Is it a performance issue? Is it too difficult to design a complex app while constantly keeping borrowing in mind? Were these just initial shortcuts that were never removed later? Something else?

I think (a) figuring out why that is and (b) possibly addressing it is just as important for moving Rust forward as catering to Rust newcomers and other initiatives that, to one degree or another, have been questioned recently.

I don't know either, although I think we can all agree with some elements of what it would look like (implementation difficulties aside).

8 Likes

This seems like it could naturally pair with #[safe(...)] (or some approximation) and audit tools. That could go some way toward addressing @kornel's concerns.

2 Likes

What if an author created an unsafe!() macro that auto-inserts #[safe(shut up)]? Or just puts a message like "because I need a mutable pointer here"?
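A sketch of such a macro (named `unsafe_!` here, since `unsafe` itself is a keyword and can't name a macro); it shows how trivially a mandatory annotation could be stamped out mechanically:

```rust
macro_rules! unsafe_ {
    ($($body:tt)*) => {{
        // The hypothetical #[safe(reason = "shut up")] would be stamped here,
        // one-size-fits-all, with zero thought required per call site.
        unsafe { $($body)* }
    }};
}

fn main() {
    let bytes = unsafe_!(std::mem::transmute::<u32, [u8; 4]>(1));
    println!("{:?}", bytes);
}
```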

In the actix case the author intentionally silenced clippy warnings, so presumably the author strongly believes they're not correct, not relevant, or that there's no other way. So I'd expect them to do the same for an "unhelpful" #[safe()] annotation.

4 Likes

Could a proposal similar to the following (previously posted to another thread in internals) help in this regard: [Pre-RFC] Another take at clarifying `unsafe` semantics - #31 by gbutler - Rust Internals

1 Like

I think that if I were to read such a reason in the #[safe(...)], I would be reasonable in assuming that the author didn't care about the safety or correctness of their code, and that I could categorically rule out using the crate on the grounds that the author has demonstrated a lack of care regarding safety.

2 Likes

The purpose of #[safe(...)], in such a regime, would be to key into a formal audit trail. Alone, it wouldn't do anything beyond document the author's intent. Its power is in combination with a formal review.

Extending @IsaacWoods' example:

```rust
#[safe(reason = "Bit layout is the same, so safe to transmute", review = "5550")]
unsafe {
    let my_number = mem::transmute::<[u8; 4], u32>(my_array);
}
```

review = "5550" would key the unsafe block to a review system. The specific semantics are debatable (is this a GitHub issue, a code-reviewed pull request, a signed commit, a crates.io report, etc.?), but the idea is to pair unsafe statistics with an audit trail.

5 Likes

All of you make great points. While a safe marker certainly wouldn't prevent reckless unsafe usage, it would help auditors figure out whether it was warranted.

I also think that #[safe(..)], while succinct, would encourage end users to write incredibly short reasons (which could backfire and make unsafe usage rather confusing). Perhaps a special marker or keyword could be introduced into the markdown documentation? Or perhaps marking something #[safe] would require some form of documentation above it?

BatmanAoD also makes some great points against this below.

1 Like

I don't really see the value of this, and I think that if the community ever comes up with a "good" solution, legacy #[safe(...)] lines would just be noise.

This is important, I think, because I do believe that it is possible to find a good solution. In particular, I think that any robust solution would have some kind of precondition/postcondition logic. This is quite complicated; at a minimum, it would require:

  • Some kind of formal language for preconditions and postconditions
  • Runtime and/or compile-time checking of conditions
  • Primitives for interfacing with code and/or compiler internals (e.g. @IsaacWoods' "Bit layout is the same" condition can't be fully checked within the language itself; see the sketch below)
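To make the last point concrete, here's a rough sketch (plain Rust, nothing from RustBelt) of how only part of the "bit layout is the same" condition is expressible in-language:

```rust
use std::mem;

/// The size half of "bit layout is the same" is checkable; the validity
/// half (every 4-byte pattern is a valid u32) is not, and must remain a
/// human-verified comment.
fn bytes_to_u32(my_array: [u8; 4]) -> u32 {
    // Compile-time checkable part of the precondition:
    const _: () = assert!(mem::size_of::<[u8; 4]>() == mem::size_of::<u32>());
    // Uncheckable part: any 4 bytes form a valid u32 (true here, but no
    // in-language primitive can verify it in general).
    unsafe { mem::transmute::<[u8; 4], u32>(my_array) }
}

fn main() {
    println!("{}", bytes_to_u32([1, 0, 0, 0]));
}
```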

I expect that the RustBelt project will necessarily develop something like this as a compiler extension, at which point the Rust team can start to consider adopting parts of their language into the core language.

In short, I believe that there is a path forward on unsafe annotations that we could eventually use for linting purposes, but it is a long and arduous path, and if we take a shortcut like #[safe(reason = "some arbitrary block of text")], I'm not sure what we would be gaining, and I think we would actually be adding noise.

2 Likes

The review= syntax might actually be okay. I would eliminate the reason= bit, though; I don't think there's any reason to make it appear that the compiler is going to do something interesting with the annotation when in fact there's no realistic way for it to do so.

Are you suggesting that calls to FFI should not have justifying comments explaining why the call is in fact safe? I would think that would be good to have. For example, if you are calling an FFI function that takes 2 pointers to some memory, and you are turning an immutable reference to a boxed array into those pointers before passing them to the foreign function, is that safe? Does it result in UB? Shouldn't you justify why it is OK? For example, shouldn't you have a comment along the lines of: "OK to just pass a pointer to an immutable array because this foreign function does not attempt to deallocate the pointers or to write/modify the memory they point to. This foreign function's contract is that it will only read the contents of the memory the pointers point to." (perhaps simplified, but something like that).

Wouldn't having some kind of documentation like this on every unsafe block be useful? NOTE: documentation like that would go on the unsafe block inside the safe wrapper around the call to the unsafe FFI function.
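A hedged sketch of that scenario, with a hypothetical foreign function `compare_buffers` standing in for a real API:

```rust
use std::os::raw::c_uchar;

extern "C" {
    // Hypothetical foreign function: reads two equal-length buffers and
    // returns 0 if they match. Its contract is read-only access.
    fn compare_buffers(a: *const c_uchar, b: *const c_uchar, len: usize) -> i32;
}

/// Safe wrapper; the justifying comment lives on the unsafe block inside.
fn buffers_equal(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    // SAFETY: passing pointers derived from immutable slices is OK because
    // this foreign function's documented contract is to only read `len`
    // bytes from each pointer, never to write to or deallocate the memory.
    unsafe { compare_buffers(a.as_ptr(), b.as_ptr(), a.len()) == 0 }
}
```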

1 Like

Yes, I agree, as mentioned above: How (not) to use unsafe code - #16 by gbutler69

1 Like

To play devil's advocate, should we also add #[tricky_but_safe_code(...)]? I think we'd all agree the answer is no - that code should have normal code comments explaining it. In a lot of ways, unsafe is just a particular class of tricky code. Yes, it also has an effect on Rust's core tenets but it's otherwise no different, in principle at least.

This is why, personally, I'm more interested in understanding why unsafe code was used in the first place, as mentioned. I believe there's more meat on that bone than in the different ways to write code comments.

8 Likes

As always, it depends, and there are exceptions. But in lots of cases it'd feel like an `x++ // add 1 to x` comment.

Often functions don't like null pointers, want strings 0-terminated, and so on. These requirements feel somewhat obvious, and commenting on them just repeats the documentation.

And OTOH there's always a risk that I misunderstood the requirements, so my comment may state insufficient assumptions, misleading the code reviewer into thinking the code satisfies the required criteria.

In other cases the unsafety isn't even anywhere near the unsafe block. For example, `init(&struct)`-style functions may have very complex safety requirements depending on the contents of the struct (e.g. error-handler callbacks for libjpeg), so it'd be better to comment at the place the struct is built.
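A hedged sketch of that libjpeg-style situation (all names hypothetical): the `unsafe` block looks trivial, while the real invariant is established where the struct is built:

```rust
use std::os::raw::c_int;

#[repr(C)]
struct DecoderConfig {
    // SAFETY-RELEVANT: the C library invokes this on error and must not
    // unwind through it; panicking across the FFI boundary would be UB.
    error_handler: Option<extern "C" fn(code: c_int)>,
}

extern "C" fn abort_on_error(_code: c_int) {
    // Aborts instead of unwinding across the FFI boundary.
    std::process::abort();
}

extern "C" {
    fn decoder_init(cfg: *const DecoderConfig) -> c_int; // hypothetical
}

fn init_decoder() -> bool {
    // The safety-critical decision happens HERE, where the struct is
    // built: the handler we install never unwinds.
    let cfg = DecoderConfig { error_handler: Some(abort_on_error) };
    // The unsafe block itself looks harmless in isolation.
    unsafe { decoder_init(&cfg) == 0 }
}
```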

5 Likes

You're correct here (IMHO). It is much more important to understand why the author felt so much use of "unsafe" was justified/necessary and to try to address those concerns.

Yes, that makes sense and I definitely see your point more clearly now.

That said, the way I would use your comments is to check whether you thought about the issues reasonably at all. I could compare your comments with the underlying FFI definition, and if they didn't line up very well, I'd know I need to examine things much more deeply to understand whether you messed up or I'm just misunderstanding. But if your comments line up with what the FFI function's contract actually seems to be (from documentation, or from actually looking at the foreign function's code), then I could have a certain degree of confidence that you were thoughtful about the usage.

That being said, I definitely agree with @vitalyd that the important thing in this discussion is not how best to document/annotate unsafe usage, but to understand why certain authors of Rust code tend to over-use unsafe and how that can be prevented/mitigated. Solutions could include any of the following:

* Educational Materials
* Language Changes (Missing Language Features?)
* Standard Library Additions (Missing stdlib Features?) 

One thing that could usefully come out of this "discovery" that Actix uses too much (?) unsafe inappropriately (?) would be to create a Nomicon chapter or appendix that explains why each of these uses is inappropriate and what the appropriate solution would be (along with patches/PRs for Actix correcting the issue).

3 Likes