[ANN] `cmov` v0.1: constant-time conditional move intrinsics for x86/x86_64/aarch64

The RustCrypto project has just released a new crate that uses inline assembly, stabilized in Rust 1.59, to provide guaranteed constant-time conditional move intrinsics on x86, x86_64, and aarch64 target architectures.

Conditional moves provide an alternative to branching which is guaranteed to operate in constant time and is not subject to microarchitectural side channels caused by branch predictors or other speculative execution features.

Prior to the stabilization of inline assembly it was not possible to reliably emit these instructions, as LLVM's x86-cmov-conversion pass could potentially rewrite CMOV instructions as branches, which could occur even if LLVM's optimizer was completely disabled using optnone.

Inline assembly provides a way to sidestep these aspects of LLVM and reliably emit these instructions in a way LLVM's optimization passes will not interfere with.

When used on architectures other than x86, x86_64, or aarch64, the crate provides a portable fallback implementation based on bitwise arithmetic. However, due to the aforementioned problems with LLVM, we cannot guarantee that this implementation will operate in constant time. Taken together, this means the crate provides guaranteed constant-time behavior on select targets, and "best effort" constant-time-ish operation on others.

We are also interested in potentially expanding the set of platforms that provide constant-time guarantees if there is interest.



Very nice!

One little question: why is usize the type of choice here? Is it to target the native word size on whatever architecture you're using? (so u32 or u64 depending on the arch, most likely)

Yes. The underlying instructions operate on word-sized values.

Good work!

I'm curious, do you think it makes sense to implement a high-level cmov for other architectures as well that do not support it natively (using the standard constant-time construction in inline assembly)?

I'm not sure of a portable solution to that problem, as the instructions being used are all architecture specific (x86 is using the CMOVcc family, and AArch64 is using CSEL).

Oh, I was thinking about doing it using the standard b ^ (mask & (a ^ b)) construction for architectures that don’t have cmov or csel, but using inline assembly to ensure that the optimiser did not alter the implementation. You’d have to implement each architecture separately, but you would get the same security guarantees, right?
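For readers following along, the b ^ (mask & (a ^ b)) construction mentioned above can be sketched in plain Rust roughly like this. The function names (nonzero_mask, cmovnz_bitwise, cmovz_bitwise) are made up for illustration and are not the crate's API, and since this is ordinary Rust, the optimizer is still free to rewrite it, so it carries no constant-time guarantee.

```rust
/// Returns all ones if `condition` is nonzero and zero otherwise,
/// without a comparison or branch in the source.
fn nonzero_mask(condition: usize) -> usize {
    // For any nonzero c, either c or its two's-complement negation
    // has the top bit set, so this yields exactly 0 or 1.
    let bit = (condition | condition.wrapping_neg()) >> (usize::BITS - 1);
    // 0 -> 0x00..00, 1 -> 0xFF..FF
    bit.wrapping_neg()
}

/// Hypothetical fallback: copy `src` into `dst` when `condition` is nonzero.
pub fn cmovnz_bitwise(condition: usize, src: usize, dst: &mut usize) {
    let mask = nonzero_mask(condition);
    // dst = dst ^ (mask & (dst ^ src)):
    // mask == 0 leaves dst unchanged, mask == !0 replaces it with src.
    *dst ^= mask & (*dst ^ src);
}

/// Hypothetical fallback: copy `src` into `dst` when `condition` is zero.
pub fn cmovz_bitwise(condition: usize, src: usize, dst: &mut usize) {
    let mask = !nonzero_mask(condition);
    *dst ^= mask & (*dst ^ src);
}
```

Doing the same thing per architecture in asm! would mean writing these few xor/and/neg steps by hand for each target.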

Would it make sense to include a function which receives the condition as a bool?

Both cmovz and cmovnz expect a usize, so something like

cmovz(a > b, a, &mut b)

must be written as

cmovz(if a > b { 0 } else { 1 }, a, &mut b)

Maybe something like cmov(bool, ...):

fn cmov(condition: bool, src: usize, dst: &mut usize) {
    cmovz(if condition { 0 } else { 1 }, src, dst)
}

That's actually what it already does! That's the "portable fallback implementation based on bitwise arithmetic" described above. However also note the caveat:

due to the aforementioned problems with LLVM we cannot guarantee that this implementation will operate in constant time

Note that they are annotated #[inline(never)] to try to avoid the optimizer inlining the function and being clever enough to rewrite it with branches, but again, this is not a guarantee.

The underlying instructions operate on word-size integers, which is why the API is shaped the way it is.

However, you can use bool_value.into() to convert a boolean into a usize:

let my_bool = true;
cmovnz(my_bool.into(), a, &mut b);
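To make the conversion concrete, here is a small sketch of a bool-taking wrapper. Both cmovnz_stand_in and cmov_if are hypothetical names: the stand-in only mimics the (condition, src, dst) argument shape used in this thread so the snippet compiles on its own, and in real code you would call the crate's cmovnz instead.

```rust
/// Stand-in with the same argument shape as `cmovnz` in this thread.
/// Assumption: it moves `src` into `dst` when `condition` is nonzero.
fn cmovnz_stand_in(condition: usize, src: usize, dst: &mut usize) {
    // Normalize any nonzero condition to 1, then stretch it to a full mask.
    let bit = (condition | condition.wrapping_neg()) >> (usize::BITS - 1);
    let mask = bit.wrapping_neg();
    *dst ^= mask & (*dst ^ src);
}

/// Hypothetical convenience wrapper taking a `bool`.
fn cmov_if(condition: bool, src: usize, dst: &mut usize) {
    // `usize::from(bool)` yields 0 for false and 1 for true,
    // which is the same conversion `condition.into()` performs.
    cmovnz_stand_in(usize::from(condition), src, dst);
}
```

With that, the earlier example would read cmov_if(a > b, a, &mut b).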

@fegge suggested wrapping it in asm!() black boxes instead (basically asm!("", in(reg) &mut val);, which makes LLVM assume that val may be mutated by the asm!() block). Unlike #[inline(never)], that should be guaranteed not to be optimized away.

Yeah, using asm! as a sort of dataflow optimization barrier would be interesting
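A rough sketch of what such a barrier could look like, purely as an illustration: value_barrier and cmovnz_guarded are hypothetical names, not part of the crate. One detail I'm assuming matters here is that rustc rejects operands that never appear in the template string, so the sketch references the operand inside an assembly comment. On targets not listed in the cfg it falls back to core::hint::black_box, which is documented as best-effort only.

```rust
#[cfg(any(target_arch = "x86_64", target_arch = "aarch64"))]
use core::arch::asm;

/// Hypothetical barrier: passes `val` through an empty asm block so the
/// optimizer has to treat the result as an unknown value.
#[cfg(any(target_arch = "x86_64", target_arch = "aarch64"))]
#[inline(always)]
fn value_barrier(mut val: usize) -> usize {
    // The template contains only a comment; no instructions are emitted.
    unsafe {
        asm!("/* {0} */", inout(reg) val, options(nomem, nostack, preserves_flags));
    }
    val
}

/// Best-effort stand-in for targets not covered above.
#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
#[inline(always)]
fn value_barrier(val: usize) -> usize {
    core::hint::black_box(val)
}

/// Hypothetical guarded fallback: move `src` into `dst` if `condition` is nonzero.
pub fn cmovnz_guarded(condition: usize, src: usize, dst: &mut usize) {
    let bit = (condition | condition.wrapping_neg()) >> (usize::BITS - 1);
    // Hide the mask from the optimizer so it cannot tell it is only 0 or !0.
    let mask = value_barrier(bit.wrapping_neg());
    *dst ^= mask & (*dst ^ src);
}
```

This only hides the mask's value from the optimizer; the xor/and sequence around it is still ordinary Rust that the compiler selects instructions for.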

I was actually thinking about implementing the entire fallback logic as an asm! block without optimization barriers. Since the compiler does not provide any guarantees about instruction selection (even for low-level operations like xor, and, and neg that correspond one-to-one with assembly instructions), it is free to rewrite a high-level fallback implementation of cmov. Using optimization guards does provide some guarantees, but implementing the fallback in assembly would provide stronger guarantees that the implementation is preserved.

The (nightly) documentation for inline assembly contains the following which I think suggests that inline assembly would not be touched by the compiler.

The compiler cannot assume that the instructions in the asm are the ones that will actually end up executed. This effectively means that the compiler must treat the asm! as a black box and only take the interface specification into account, not the instructions themselves.

This would make sense for architectures like RISC-V and ARM, which lack a cmov (or csel) instruction but have stable support for inline assembly. An example for riscv64 would look something like this.


The problem with using asm! for that purpose is that ASM support is stabilized only for x86, ARM and RISC-V.