Unsafe code review request for the inout crate

newpavlov · November 23, 2021, 6:23pm

In RustCrypto we would like to generalize over in-place and buffer-to-buffer modes of operation in our cipher crates. For this purpose I've designed the inout crate which contains several custom reference types built on top of raw pointers. As expected, the crate contains a significant amount of unsafe code. I've done my best to provide safe and robust API without sacrificing potential performance and to my best knowledge the crate does not contain any unsound code, but it would be nice to get a double check from a pair of fresh eyes.

You can find the crate code in this PR: Add inout crate and prepare block-padding v0.3 by newpavlov · Pull Request #675 · RustCrypto/utils · GitHub

Thank you in advance!

steffahn · November 23, 2021, 6:53pm

For everyone who wants to read the documentation of that crate:

AFAICT the best way is something like

git clone -b add_inout https://github.com/RustCrypto/utils.git && cd utils/inout && cargo doc --open

steffahn · November 23, 2021, 7:10pm

impl<'a, T> InOutBuf<'a, T> {
    /// Create `InOutBuf` from a single mutable reference.
    pub fn from_mut(val: &mut T) -> Self {
        Self {
            in_ptr: val as *mut T as *const T,
            out_ptr: val as *mut T,
            len: 1,
            _pd: PhantomData,
        }
    }

seems like a lifetime might be missing?

steffahn · November 23, 2021, 7:16pm

Seems like a type like InOut<'a, T> could also be implemented with an enum, with variants Same(&'a mut T) and Separate(&'a T, &'a mut T). I suppose performance reasons are why this crate exists (and uses unsafe code)? If so, at least an API-soundness could be achieved by providing an alternative unsafe-free implementations with enums. Perhaps with feature-gates. This way an oversight like a missed lifetime annotation becomes impossible.

It’d also be a good instrument to demonstrate the usefulness of the unsafe code in the first place (since performance comparison becomes trivial).

Edit:

While we’re already at it, looking at something like

            in_ptr: val as *mut T as *const T,
            out_ptr: val as *mut T,

I’m not sure, but it might be necessary to create the pointer first, once, and then copy the pointer.

newpavlov · November 23, 2021, 7:26pm

Thank you! I've applied your patch with the lifetime fix.

It would introduce branching into the very heart of our hot loop code and additional register usage to store the enum tag. Granted, it's a very easy case for branch predictor, but I still would like to get the best possible performance at this level even at the cost of a more complex code.

Also it would make harder to wrap API into C API.

I don't think so. val has type &mut T, so its address should be stable. Though to be extra-safe I will create a temporary *mut T as you suggest.

steffahn · November 23, 2021, 7:28pm

Just my intuition on how stacked-borrows work. Creating the second pointer from the mutable reference might invalidate the first one. Certainly it can’t hurt to create the pointer first and copy it.

steffahn · November 23, 2021, 7:32pm

Testing something like

fn main() {
    let mut x = 42;
    let y = &mut x;
    let z1: *const i32 = y;
    let z2: *mut i32 = y;

    let _x = unsafe { *z1 };
    unsafe { *z2 = 1 };
    let _x = unsafe { *z1 };
}

miri seems to agree with

MIRIFLAGS="-Zmiri-track-raw-pointers" cargo +nightly miri run

Compared to


fn main() {
    let mut x = 42;
    let y = &mut x;
    let z: *mut i32 = y;
    let z1: *const i32 = z;
    let z2: *mut i32 = z;

    let _x = unsafe { *z1 };
    unsafe { *z2 = 1 };
    let _x = unsafe { *z1 };
}

newpavlov · November 23, 2021, 7:43pm

Hm, the following code passes MIRI without issues:

fn main() {
    let mut x = 42;
    let y = &mut x;
    let z1 = y as *mut i32 as *const i32;
    let z2 = y as *mut i32;

    let _x = unsafe { *z1 };
    unsafe { *z2 = 1 };
    let _x = unsafe { *z1 };
}

steffahn · November 23, 2021, 7:46pm

Interesting.

CAD97 · November 23, 2021, 7:47pm

Just to check, are you running with MIRIFLAGS="-Zmiri-track-raw-pointers"?

steffahn · November 23, 2021, 7:47pm

Yes, I tested it.

steffahn · November 23, 2021, 7:49pm

In any case, the difference is subtle enough that it’s still worth taking the clearly uncontroversial approach of creating the pointer first and copying that, IMO

CAD97 · November 23, 2021, 7:51pm

I think the operational difference is that you've written let z: *mut _ = y;, as opposed to let z = y as *mut _;. The former "uses" y as a mutable reference, the latter only as a pointer.

I don't think these should behave differently, though. Perhaps this is a side effect of what allows &raw to work? (Or rather, what allows &place as *const _ to not manifest a reference.)

cc @RalfJung

steffahn · November 23, 2021, 7:53pm

From my tests, the relevant difference seems to be whether a *mut T is created first, or &mut T “directly” turns into *const T.

CAD97 · November 23, 2021, 8:16pm

That holds with my hypothesis; (x: &mut _) as *const _ coerces the &mut to *mut before casting to *const. Or maybe it coerces to & and then casts to *const, I'm not sure.

Either way, it seems that let y = x as *mut _ is not doing an SB retag operation, whereas let y: *mut _ = x is.

steffahn · November 23, 2021, 8:35pm

Maybe I wasn’t clear. I can observe no difference in behavior of miri between let y: *mut _ = x and let y = x as *mut _ at all. The only difference I can observe is between whether *const _ is created from &mut _ directly or whether *mut _ is created first. Any choice between explicit casting with as or coercing with a type signature seems to be insignificant. I haven’t tested / compared with creating &_ from &mut _ first to get a *const _. I won’t be doing any further tests on this today.

RalfJung · November 24, 2021, 5:23pm

Those seem to generate the same MIR, so that should not make a difference.

That definitely makes a difference: the generated MIR for the latter case will turn &mut T into &T first and then turn that into *const T.

*const vs *raw does matter when converting from an &mut; with *const the result is a read-only pointer that may not be used for writing. Basically, when you cross the boundary between references and pointers, you have to use the right type (*const vs *mut, and alkso the right T to determine how much memory is accessible by this raw ptr, and interior mutability in case of *const). Afterwards, the raw ptr type no longer matters. Also see:

github.com/rust-lang/rust

Coercing &mut to *const should not create a shared reference

opened 04:59PM - 07 Dec 18 UTC

RalfJung

T-lang A-raw-pointers

It has long been a rule in Rust that you must not mutate through a shared refere…nce, or a raw pointer obtained from a shared reference. Unfortunately, that rule currently forbids the following code: ```rust fn direct_mut_to_const_raw() { let x = &mut 0; let y: *const i32 = x; unsafe { *(y as *mut i32) = 1; } assert_eq!(*x, 1); } ``` The reason for this is that coercing `&mut T` to `*const T` implicitly first creates a shared reference and then coerces that to `*const T`, meaning `y` in the example above is technically a raw pointer obtained from a shared reference. We should fix our coercion logic to no longer create this intermediate shared reference. See https://github.com/rust-lang/rust/pull/56161 for how we uncovered this problem. Cc @eddyb @nikomatsakis

steffahn · November 24, 2021, 5:31pm

Okay, so the “turn into &T first” part makes sense. What I’m not fully understanding is how

fn main() {
    let mut x = 42;
    let y = &mut x;
    let z1 = y as *mut i32;
    let z2 = y as *mut i32;
    unsafe { *z1 = 1 };
    unsafe { *z2 = 1 };
    unsafe { *z1 = 1 };
    unsafe { *z2 = 1 };
}

is not a problem according to miri. What is the value being coerced into *mut i32 here when writing y as *mut i32? It doesn’t seem to be a re-borrow, because

fn main() {
    let mut x = 42;
    let y = &mut x;
    let z1 = &mut x;
    let z1 = z1 as *mut i32;
    let z2 = &mut x;
    let z2 = z2 as *mut i32;
    unsafe { *z1 = 1 };
    unsafe { *z2 = 1 };
    unsafe { *z1 = 1 };
    unsafe { *z2 = 1 };
}

is not accepted by MIRIFLAGS="-Zmiri-track-raw-pointers" cargo +nightly miri run. Is this difference genuine or is the former code falsely accepted by miri? Or is y as *mut i32 somehow special in not consuming the passed reference and also reproducibly creating copies of the same pointer that are allowed to alias? This seems a bit dangerous to me because re-borrowing mutable references happens implicitly all the time, so for example

fn main() {
    let mut x = 42;
    let y = &mut x;
    let z1 = y as &mut i32 as *mut i32;
    let z2 = y as *mut i32;
    unsafe { *z1 = 1 };
    unsafe { *z2 = 1 };
    unsafe { *z1 = 1 };
    unsafe { *z2 = 1 };
}

is rejected by miri, too.

Yandros · November 24, 2021, 5:47pm

FWIW, I find this surprising as well. The only explanation would be that the MIR lowering (it looks like y as *mut i32 acts as a ptr::addr_of_mut!(*y)) wouldn't count as a "full usage" of y, whereby not activating its assertion of unicity, and directly yielding a pointer with "Shared ReadWrite" provenance.

It mainly means that (y) as i32 features different semantics than those of {y} as *mut i32 or { &mut *y } as *mut i32 which, I think, can be legitimately surprising, but it also kind of makes sense when we think about it.

steffahn · November 24, 2021, 5:54pm

If foo as *mut i32 is really supposed to be equivalent to ptr::addr_of_mut!(*y), and not equivalent to { &mut *y } as *mut i32, then perhaps, we have to add as-casting (to a pointer type) to the list of place expression contexts?

Topic		Replies	Views
Code Review - First Rust Crate code review	16	1308	June 23, 2021
How (not) to use unsafe code	45	4986	January 12, 2023
How unsafe is unsafe? help	17	1257	January 12, 2023
Pure Rust and Safe badges	32	6275	January 12, 2023
New crate: include_data. Review for soundness and API code review	7	425	October 9, 2023

Unsafe code review request for the inout crate

Related Topics