Use unsafe tag in functions that are not memory unsafe

I was recently working on speed up the serialization of some heavy data structures.

And to do so, we basically removed some checks that were verifying that the structure is indeed constructed correctly and it's valid (not talking about check the byte-slice len or UTF-8 correctness. But about that the structure makes sense in it's context (Elliptic Curve elements)).

Then, once we did so, this was more or less the result:

    /// Create a `G1Affine` from a set of bytes created by `to_bytes_unchecked`.
    ///
    /// No check is performed and no constant time is granted. The `infinity` attribute is also
    /// lost. The expected usage of this function is for trusted bytes where performance is
    /// critical.
    ///
    /// For secure serialization, check `to_bytes`
    /// ## Panics
    /// If the slice passed has `len < 96`.
    pub unsafe fn from_slice_unchecked(bytes: &[u8]) -> Self {
        let mut x = [0u64; 6];
        let mut y = [0u64; 6];
        let mut z = [0u8; 8];

        bytes
            .chunks_exact(8)
            .zip(x.iter_mut().chain(y.iter_mut()))
            .for_each(|(c, n)| {
                z.copy_from_slice(c);
                *n = u64::from_le_bytes(z);
            });

        let x = Fp::from_raw_unchecked(x);
        let y = Fp::from_raw_unchecked(y);
        let infinity = 0u8.into();

        Self { x, y, infinity }
    }

So the question comes by itself:
Is this usage of unsafe the right choice? Since after checking rust unsafe book I saw the following:

In addition, unsafe does not mean the code inside the block is necessarily dangerous or that it will definitely have memory safety problems: the intent is that as the programmer, you’ll ensure the code inside an unsafe block will access memory in a valid way.

But this makes me thing about the unsafe keyword. And the fact that if unsafe is used to mark things that are not only related to memory safety strictly, it kinda looses it's point and makes code review and read more confusing.

Which are your opinions on that? Is it clearly stated somewhere that unsafe can/can't/should/shouldn't be used there?

It depends on the safety invariants of Self. If you declare that it is undefined behavior for your type to contain an invalid curve, then it is undefined behavior. If you don't declare so, then it isn't.

The two choices have the following consequences:

  1. If you do not declare it to be UB, then from_slice_unchecked should not be marked unsafe, but you may not write unsafe code anywhere else that makes the assumption that the curve is valid. I.e. all unsafe code must be correct even if the curve is invalid.
  2. If you declare it to be UB, then from_slice_unchecked must be marked unsafe. In this case, you may write unsafe code that would be incorrect if the curve is invalid.

Note how in either case, the user is not able to cause undefined behavior when using your library without an unsafe block of their own.

  1. In the first case, your library has no unsafe code that can cause trouble if the curve is invalid, so it's all good.
  2. In the second case, the user had to use an unsafe block to call from_slice_unchecked, so it is not a problem that your library has unsafe code that can later cause UB using this curve — it's the user's fault as they triggered UB the instant they called from_slice_unchecked with an invalid curve.
3 Likes

I'll cite myself from an IRLO thread last week:

I like the idea that @alice brought about UB. It makes sense to mark unsafe on this way. But it's also true that what can be UB for my lib (in the context of the lib) may not be (as it happens in this case) UB within Rust itself.
So I'm still kinda lost.

Thanks for the link. Very interesting ideas are shown there. I added my case and some ideas. Sincerely I think this is not something that has to be addressed by the compiler itself, but via a better documentation using RustDoc.

I think of unsafe as primarily a guard against heisenbugs: If changing optimization settings, OS versions, or similar far-away things might change the program's behavior, it's unsafe. Otherwise, you just need to document the expected behavior in prose.

The general consensus seems to be that you should only ever use unsafe for memory and thread-safety issues. Using it elsewhere dilutes its usefulness for reasoning about UB.

For other invariants where breaking those invariants may lead to bugs or panics you'll normally do something like this:

struct Foo { ... }

impl Foo {
  pub fn from_slice(bytes: &[u8]) -> Self {
    assert!(bytes.len() >= 96);
    assert!(bytes.len() % 8 == 0);

    Foo::from_slice_unchecked(bytes)
  }

  pub fn from_slice_unchecked(bytes: &[u8]) -> Self {
    // You may choose to throw in a debug_assert!() here to help
    // detect incorrect uses of the unchecked constructor while 
    // developing, depending on what information your constructor's
    // args have access to.
    debug_assert!(bytes.len() >= 96);
    debug_assert!(bytes.len() % 8 == 0);

    ...
  }
}