Zeroing a slice of integers?

Both will likely compile down to the same asm, so it doesn't matter too much. You could also use safemem, which has safemem::write_bytes. This does exactly what you want, and safemem is the standard crate for doing these operations on slices and vecs.
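As a sketch, the two safe variants being compared here might look something like this (hypothetical helper names, not the OP's exact code); both forms are recognized by LLVM and lowered to a single memset:

```rust
fn zero_loop(buf: &mut [u32]) {
    // plain for loop: LLVM recognizes this pattern and emits a memset
    for x in buf.iter_mut() {
        *x = 0;
    }
}

fn zero_iter(buf: &mut [u32]) {
    // iterator form: compiles down to the same memset
    buf.iter_mut().for_each(|x| *x = 0);
}

fn main() {
    let mut a = vec![1u32, 2, 3];
    let mut b = a.clone();
    zero_loop(&mut a);
    zero_iter(&mut b);
    assert_eq!(a, b);
    assert!(a.iter().all(|&x| x == 0));
}
```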

1 Like

There is also zeroize for the specific case of zeroing.

2 Likes

You will have to use write_volatile; other write methods may end up getting optimized away.

Usually write operations aren't optimized away unless there's UB or some larger reason to optimize them away.

For example:

fn foo(x: &mut [u32]) {
    for i in x {
        *i = 0;
    }
}

Playground.
Resulting ASM:

playground::foo:
	test	rsi, rsi
	je	.LBB0_2
	push	rax
	mov	rdx, rsi
	shl	rdx, 2
	xor	esi, esi
	call	qword ptr [rip + memset@GOTPCREL]
	add	rsp, 8
3 Likes

They can get optimized away when the value is freed after being zeroed, which is a common pattern for buffers containing secrets.

https://en.cppreference.com/w/c/string/byte/memset#Notes

2 Likes

Just using a for loop, as @OptimisticPeach showed, is the right way to do this. There's no reason to jump to unsafe to do it -- there's even a test ensuring that it keeps working: https://github.com/rust-lang/rust/blob/master/src/test/codegen/issue-45466.rs

2 Likes

It certainly can be optimized out, see this example. In your example you operate on a mutable reference, so the compiler does not know whether the buffer contents will be read after the function call. But if, after inlining, it decides that no one will read those zeroes, then it will happily remove those "unnecessary" writes.
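A minimal sketch of the pattern being described (hypothetical names): once the compiler can prove that nothing reads the zeroes, the wiping loop is dead code and the optimizer is free to delete it.

```rust
fn checksum_and_wipe() -> u32 {
    let mut secret = [7u8; 16];
    let sum: u32 = secret.iter().map(|&b| b as u32).sum();
    // Nothing reads `secret` after this point, so after inlining
    // the optimizer may delete this zeroing loop entirely.
    for b in secret.iter_mut() {
        *b = 0;
    }
    sum
}

fn main() {
    assert_eq!(checksum_and_wipe(), 112); // 7 * 16
}
```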

An RFC has been created for a fill method on slices.

You can find a discussion about setting a slice to a scalar value.
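(For later readers: that RFC landed, and `slice::fill` has been stable since Rust 1.50, so the safe one-liner is now:)

```rust
fn main() {
    let mut buf = [1u32, 2, 3, 4];
    // slice::fill, stabilized in Rust 1.50
    buf.fill(0);
    assert_eq!(buf, [0, 0, 0, 0]);
}
```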

1 Like

may end up getting optimized away.

Zeroing because you need the values to be read as zero (hyper-optimized if possible), and zeroing for "privacy purposes" on cleanup, are separate problems that need different solutions.

In the first case you don't have to worry about the zeroing being optimized away; that would result in noticeably incorrect behavior.

The second is very hard to get right. It's basically a fight against the compiler (because it considers the result unobservable); in C there are compiler-specific kludges like SecureZeroMemory, but I don't know about Rust.
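In Rust, the usual equivalent is `ptr::write_volatile` combined with a compiler fence, which is roughly what the zeroize crate does internally. A minimal sketch (not a vetted security primitive):

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

fn secure_zero(buf: &mut [u8]) {
    for b in buf.iter_mut() {
        // Volatile writes may not be elided by the optimizer,
        // even if the buffer is freed immediately afterwards.
        unsafe { ptr::write_volatile(b, 0) };
    }
    // Prevent subsequent code from being reordered before the writes.
    compiler_fence(Ordering::SeqCst);
}

fn main() {
    let mut secret = vec![0xAAu8; 32];
    secure_zero(&mut secret);
    assert!(secret.iter().all(|&b| b == 0));
}
```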

4 Likes

Why? If you read the struct afterwards, you don't need volatile. If you don't read it, then don't use volatile and let the compiler do its job; if you just write for nothing, let the compiler remove the write. This is not the case in a multithreaded context, of course, but even there it depends on the code, and it's very rare to need volatile.
The last time I needed it in C++ was when developing lock-free data structures... and that is not straightforward.

1 Like

Their prepend function is theoretically unsound, though :grimacing:

2 Likes

For my zero_me function given in the first post, it just jumps to memset. I'm not sure what causes the difference; it might just be a consequence of my artificial test code. Either way, they're not so different that it's worth me trying to pessimistically optimise :slight_smile:.

playground::zero_me:
	.cfi_startproc
	lea	rdx, [4*rsi]
	xor	esi, esi
	jmp	qword ptr [rip + memset@GOTPCREL]

EDIT: More refined testing shows that, in this case, it all gets shrunk down to the following, no matter which method you used:

mov	dword ptr [rsp + 24], 0
mov	qword ptr [rsp + 16], 0

In the case of zeroing larger slices the results are similar and using unsafe code doesn't change anything.

xorps	xmm0, xmm0
movups	xmmword ptr [rsp + 96], xmm0
movups	xmmword ptr [rsp + 84], xmm0
movups	xmmword ptr [rsp + 68], xmm0
movups	xmmword ptr [rsp + 52], xmm0
movups	xmmword ptr [rsp + 36], xmm0

So there really is no difference at all.

My first experience with Rust was that it is very good at optimizing a range iteration:

You might want to rely on this auto vectorization if speed is what you're after.

ptr::write_bytes would have been a convenient one-liner if expressivity is your goal, but it requires an unsafe block, so that reduces the appeal for me.
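For reference, that unsafe one-liner would look roughly like this (fine for integer slices, where an all-zero byte pattern is a valid value):

```rust
fn zero_bytes(buf: &mut [u32]) {
    // SAFETY: the pointer and length come from a valid slice, and
    // all-zero bytes are a valid bit pattern for u32.
    unsafe { std::ptr::write_bytes(buf.as_mut_ptr(), 0, buf.len()) };
}

fn main() {
    let mut v = vec![9u32; 8];
    zero_bytes(&mut v);
    assert!(v.iter().all(|&x| x == 0));
}
```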

1 Like

I've marked OptimisticPeach's post as the answer because in this case I don't need to securely zero the slice, and unsafe code is indeed unneeded. However, if security is required, then a crate like zeroize might be the better option (as mentioned by CAD97).

use num_traits::Zero;
use rayon::prelude::*;

fn zero<T: Zero + Send>(a: &mut [T]) {
    a.par_iter_mut().for_each(|x| *x = T::zero());
}

fn main() {
    let mut v: Vec<i32> = (0..20).collect();
    zero(&mut v[0..10]);
    println!("{:?}",v);
}
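A dependency-free version of the same idea, using scoped threads from the standard library (Rust 1.63+). This is just a sketch; for plain zeroing it's unlikely to beat a single memset, for the cache-line contention reasons discussed below.

```rust
fn parallel_zero(a: &mut [i32], nthreads: usize) {
    let nthreads = nthreads.max(1);
    // One contiguous chunk per thread, so each thread writes to
    // its own region of memory (rounding the chunk size up).
    let chunk = ((a.len() + nthreads - 1) / nthreads).max(1);
    std::thread::scope(|s| {
        for part in a.chunks_mut(chunk) {
            s.spawn(move || {
                for x in part.iter_mut() {
                    *x = 0;
                }
            });
        }
    });
}

fn main() {
    let mut v: Vec<i32> = (1..=20).collect();
    parallel_zero(&mut v, 4);
    assert!(v.iter().all(|&x| x == 0));
}
```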
2 Likes

With up to 64 processors writing into the same cache line, there can be a lot of contention.

When working with secret data you want it wiped from memory after it has been used so that an attacker can't get it via out-of-bounds reads, reads from uninitialized memory, Spectre and similar attacks. It narrows the window of opportunity for attacks.

It is a common pattern to zero immediately before dropping the memory that holds the secrets. If the compiler sees a memset (or code patterns that optimize to that) followed by a free then it can optimize the memset away, thus nullifying your defense.

In this case (which is the only relevant one, I agree) I hope the programmer isn't left to remember that he has to zero the memory. Rather than allowing raw malloc/free, there should be a safe Drop implementation in Rust, or a safe unique/shared pointer in C++, built on a free that always zeroes first. But in the current thread there is no need for that.

Ok, that was a quickly merged PR! Fixed as of the 0.3.3 release :slight_smile:

4 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.