Size of unsafe blocks

Continuing this discussion from quote of the week; how does everyone prefer to handle the scope of unsafe blocks?

Personally, I enable the unsafe_code lint (at the warn level), and add #![expect(unsafe_code, reason = "..")] at the top of modules with any unsafe (and I keep unsafe blocks small). I’m also a fan of relatively verbose safety comments, and using small unsafe blocks is an easy way to require that each unsafe action gets its own explanation of why its safety invariants are upheld. I suppose it’s still possible for someone to modify other safe code in the module without bothering to read all the comments, but if that happened, my solution would probably be to add more comments to the safe code as well.
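A minimal sketch of that setup (the module and function names here are made up for illustration): the lint is enabled at the crate root, each module containing unsafe opts in with an `expect` that carries a reason, and each unsafe action gets its own SAFETY comment.

```rust
#![warn(unsafe_code)]

mod buffer {
    #![expect(unsafe_code, reason = "skips a redundant bounds check")]

    /// Returns the first element, or 0 for an empty slice.
    pub fn first_or_zero(data: &[u32]) -> u32 {
        if data.is_empty() {
            return 0;
        }
        // SAFETY: `data` is non-empty, so index 0 is in bounds.
        unsafe { *data.get_unchecked(0) }
    }
}

fn main() {
    assert_eq!(buffer::first_or_zero(&[7, 8]), 7);
    assert_eq!(buffer::first_or_zero(&[]), 0);
}
```

Note that `#[expect]` with `reason` requires Rust 1.81 or later.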

My principle: an unsafe block should be the minimal collective region of code such that modifying any code outside an unsafe block cannot result in UB or unsound code.

Reasoning: You should be able to mess with "the safe part" of the code however you want. This is how I interpret "fearless".

3 Likes

I assume you meant code that's outside the region and not in an unsafe block?

1 Like

When thinking about the size of unsafe blocks, I find the first two examples in Working with Unsafe - The Rustonomicon quite useful.

My personal unsafe scoping policy is that I try to minimize:

  1. the number of safe operations within unsafe blocks
  2. the number of unsafe operations in any given function
  3. the size of unsafe traits
  4. the number of safety-critical invariants that callers of unsafe functions must uphold
  5. the number of safety-critical invariants that implementers of unsafe traits must uphold
  6. the size of unsafe modules
  7. the number of (especially non-std) use declarations within unsafe modules
  8. the scope of (especially non-std) use declarations within unsafe modules

where an unsafe module is a module that either contains unsafe code or is imported by an unsafe module. As a result of rule 1, my unsafe blocks are usually pretty small.
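As a sketch of rule 4 with a hypothetical type: an unsafe function whose `# Safety` section lists exactly one caller invariant, with everything else guaranteed by the type itself.

```rust
// Hypothetical raw buffer; `ptr` always points to `len` initialized
// elements (a type invariant maintained by this module).
pub struct RawBuf {
    ptr: *mut u32,
    len: usize,
}

impl RawBuf {
    /// Reads the element at `idx`.
    ///
    /// # Safety
    /// The single invariant callers must uphold: `idx < self.len`.
    pub unsafe fn read(&self, idx: usize) -> u32 {
        // SAFETY: the caller guarantees `idx < self.len`, and the type
        // invariant guarantees `ptr..ptr+len` is valid and initialized.
        unsafe { *self.ptr.add(idx) }
    }
}

fn main() {
    let mut v = vec![10u32, 20, 30];
    let buf = RawBuf { ptr: v.as_mut_ptr(), len: v.len() };
    // SAFETY: 1 < buf.len.
    assert_eq!(unsafe { buf.read(1) }, 20);
}
```

With only one documented invariant, finding which invariant was broken when UB shows up is a one-item checklist.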

Admittedly, this can be a little extreme, but it really helps with debugging. Each of my policies reduces the number of places I have to check when/if I find UB:

  • Rules 1-2 make it easy to figure out which unsafe function/operation is causing UB (assuming I have good tests).
  • Once I've figured out which unsafe function/operation is causing the UB, rules 3-5 make it easy to figure out which invariant is being broken.
  • Once I've figured out which invariant is being broken, rules 6-8 minimize the amount of code I have to read to figure out how the invariant is being broken.

Of my rules, rule 1 is, in my opinion, the least important. Figuring out which unsafe operation was the final cause of the UB is usually the easy part of debugging, at least for me. That said, rule 1 does provide some advantages, and I find most arguments for expanding unsafe blocks to be pretty weak (since, as mentioned in the 'Nomicon page that @rikyborg linked, unsafe code pollutes the whole module).

I only mark the actual, individual unsafe calls.

That's unworkable. Say you have this code:

fn half(n: usize) -> usize {
    n / 2
}

pub fn weird_sum(a: &[u32]) -> u32 {
    let mut s = 0;
    for i in 0..a.len() {
        let j = half(i);
        // SAFETY: j <= i < a.len()
        s += unsafe { a.get_unchecked(j) };
    }
    s
}

According to your policy, you'd have to write:

fn half(n: usize) -> usize {
    unsafe { n / 2 }
}

pub fn weird_sum(a: &[u32]) -> u32 {
    let mut s = 0;
    unsafe {
        for i in 0..a.len() {
            let j = half(i);
            s += a.get_unchecked(j);
        }
    }
    s
}
1 Like

Valid criticism. I'm sure I don't want to do this. But I'm sure I don't want to shrink unsafe blocks to the minimum that makes the compiler happy either.

I need some time to think through this.

Of course, this is just an example. But as an example, it also shows that by inlining (when practicable) the unsafe block can appear “fearless” from the outside:

unsafe {
    for i in 0..a.len() {
        s += a.get_unchecked(i / 2);
    }
}

I’d tend to say that a lot of unsafe usage in practice can be treated this way, which makes the rule, or maybe better, the advice stated by @zirconium-n perfectly usable and contributes to robustness.

Maybe it’s like “good engineering practice” in safe code: write things so they are comprehensible and robust when modified. A comparable rule for unsafe code could be to formulate unsafe blocks so that they are self-contained and insulated from changes around them.
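One way to sketch that "insulated from changes around it" idea (a made-up example): re-establish the precondition immediately before the unsafe call, so edits elsewhere in the function cannot silently invalidate the SAFETY comment.

```rust
fn last_elem(a: &[u32]) -> Option<u32> {
    let n = a.len();
    if n == 0 {
        return None;
    }
    // SAFETY: `n` was read from `a.len()` just above and `n > 0`, so
    // `n - 1` is in bounds; the shared borrow guarantees `a` cannot
    // change length between the check and this call.
    Some(unsafe { *a.get_unchecked(n - 1) })
}

fn main() {
    assert_eq!(last_elem(&[1, 2, 3]), Some(3));
    assert_eq!(last_elem(&[]), None);
}
```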

I think unsafe is the wrong abstraction for this kind of encapsulation. Modules are the appropriate encapsulation tool, not unsafe markers.

Say you have a module that defines a type that maintains certain invariants. Moreover, it uses unsafe that relies on the invariants for correctness. This philosophy would imply that all the code in the module must be marked unsafe because it all has access that allows it to mess up the invariants.

A better philosophy is: make a module to encapsulate the minimal functionality required to maintain invariants.

unsafe markers are for when you use your invariants to call dangerous functions in other modules, rather than to mark all the code that has access to your invariants.
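A minimal sketch of the module-as-encapsulation idea (the type is hypothetical): `inner` is the only code that can touch the invariant `pos <= data.len()`, so only this small module needs auditing, and callers outside it stay entirely safe.

```rust
mod inner {
    pub struct Cursor {
        data: Vec<u32>,
        pos: usize, // invariant: pos <= data.len()
    }

    impl Cursor {
        pub fn new(data: Vec<u32>) -> Self {
            Cursor { data, pos: 0 }
        }

        /// Advances, saturating at the end of the data.
        pub fn advance(&mut self) {
            if self.pos < self.data.len() {
                self.pos += 1;
            }
        }

        pub fn rest(&self) -> &[u32] {
            // SAFETY: `pos <= data.len()` is upheld by every method in
            // this (small) module, so the range is always in bounds.
            unsafe { self.data.get_unchecked(self.pos..) }
        }
    }
}

fn main() {
    let mut c = inner::Cursor::new(vec![1, 2, 3]);
    c.advance();
    assert_eq!(c.rest(), &[2, 3]);
}
```

Because the fields are private to `inner`, no code outside the module can break the invariant, unsafe markers or not.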

4 Likes

I think that some of the differences in unsafe block size preferences might come down to what we're using unsafe for. Most (if not all) unsafe exists on a sliding scale. On one end is code like in your example, where the user is effectively telling the compiler "Hey, in this one instance, you don't need to do a runtime check." On the other end is stuff like very low level data structures and algorithms, where the user is effectively telling the compiler "Trust me, my abstraction around this pile of raw pointer shenanigans is totally sound."

For use cases on the former end of the scale, it may be workable to have "fearless" unsafe blocks like this. However, when I write unsafe code, I'm almost always writing code that's on the latter end of the scale. (Moreover, I suspect that unsafe-to-avoid-runtime-checks will become rarer and rarer as the type system improves in expressiveness.) The more complex and state-dependent the unsafe becomes, the more this "fearless" unsafe block principle becomes unworkable.

To illustrate, take your example, but pretend that a is not a standard slice or Vec, but an instance of some type crate::MyCollection. Suddenly, crate::MyCollection::len is safety-critical, and the unsafe block is no longer fearless.
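Concretely, a sketch of that situation (MyCollection is hypothetical): the `len` method is safe code, yet the unsafe block's soundness depends on it.

```rust
struct MyCollection {
    items: Vec<u32>,
}

impl MyCollection {
    // Safe code, yet safety-critical: if a refactor made this return
    // `self.items.len() + 1`, the unsafe block below would read out of
    // bounds even though no unsafe code changed.
    fn len(&self) -> usize {
        self.items.len()
    }

    fn sum_halved(&self) -> u32 {
        let mut s = 0;
        for i in 0..self.len() {
            // SAFETY: relies on `len()` never exceeding `items.len()`,
            // so `i / 2 <= i < items.len()` holds.
            s += unsafe { *self.items.get_unchecked(i / 2) };
        }
        s
    }
}

fn main() {
    let c = MyCollection { items: vec![2, 4, 6] };
    assert_eq!(c.sum_halved(), 8); // items at indices 0, 0, 1
}
```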

That said, I don't think that these large unsafe code blocks are necessarily bad practice for everyone. I avoid them, but as I said earlier, that's mostly personal preference, and I think the disadvantages of them are relatively minor.

Today I learned of the expect attribute.

Not sure I like the warning being suppressed. The expect would typically not be seen when looking at (or editing) the top of the crate. (allow, if used, also suppresses it.)

#![warn(unsafe_code)]
mod bar {
    #![expect(unsafe_code, reason = "it is wrong not having some unsafety")]
    pub fn foo() {
        unsafe { std::arch::asm!(""); }
    }
}
fn main() { bar::foo(); }

That's a much more workable approach. Just like with a real robot: you normally don't place the operator's booth inside the caged area where a lethal robot is operating, yet you still mark that area specially and control who can enter it.

One example from a crate I’ve written, at the top of a module:

#![cfg_attr(
    not(feature = "polonius"),
    expect(unsafe_code, reason = "needed to satisfy NLL borrowck"),
)]

I have the unsafe_code lint globally enabled in my Cargo.toml. Using explicit carve-outs seems like a nice way to lint that I'm using unsafe exactly where and when I intend to. And in this next case, the reason more-or-less explains that there's a specific ManuallyDrop necessitating unsafe somewhere in the module:

#![expect(unsafe_code, reason = "manually drop database lockfile inside Drop impl")]

I guess this relates more to linting preferences rather than unsafe preferences, since I do something similar for indexing_slicing; I don't want to use index_unchecked (even if I could), and... I guess I'm too lazy to use a more functional style. But I want to add more friction to panics to make sure I've thought through my actions, and thus I enable clippy's indexing_slicing lint with expect on each slice index callsite (serving a purpose similar to a SAFETY comment).
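A sketch of that per-callsite pattern (the function is made up; it assumes `clippy::indexing_slicing` is enabled crate-wide, e.g. via Cargo.toml): the `expect` with a reason plays the role of a SAFETY comment for the panic-free argument.

```rust
fn checksum(bytes: &[u8]) -> u8 {
    if bytes.is_empty() {
        return 0;
    }
    #[expect(clippy::indexing_slicing, reason = "non-empty checked above")]
    let last = bytes[bytes.len() - 1];
    last.wrapping_add(bytes.len() as u8)
}

fn main() {
    assert_eq!(checksum(&[1, 2, 3]), 6);
    assert_eq!(checksum(&[]), 0);
}
```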

That's what allow is for. expect, on the other hand, additionally creates a warning (or an error, if paired with deny/forbid) when the code is changed and the lint no longer fires.

Say you have #![warn(clippy::indexing_slicing)] and #[expect(clippy::indexing_slicing)], and nothing is currently reported, since the compiler is dumb and thinks there could be a panic. If a future compiler figures out there is no such panic, the expectation becomes unfulfilled and a warning is printed.

edit: (two stones at once; I don't want to add another excessive post.)
Swapped deny for warn, which is what I intended (it was a copy-paste from the playground that I missed thinking about).

p.s. Just to add: I do think using the indexing_slicing lint this way is a good technique. (The allow/expect difference is more of a triviality.)

Huh, I'd assumed that indexing_slicing applied when indexing or slicing is used at all, but you're right, the description is "Checks for usage of indexing or slicing that may panic at runtime."

Either way, I normally don't deny Clippy lints (just warn), so I guess I'm fine if a future compiler version can prove it doesn't panic and warns about the expect; it wouldn't be a breaking change. But I would probably then need to change it to allow, assuming it would take time for that improvement to percolate back from nightly to my MSRV, since writing a full cfg_attr with rustversion seems too tedious.