Passing a mutable and immutable reference to a function

Hello everyone!

I am currently writing a naive implementation of a block cipher. It is not supposed to be fast, I just want to implement the specification for now.
Given that I created an XOR function on cipher blocks that looks like this:

fn xor_block(dest: &mut [u8], a: &[u8], b: &[u8]) {
    for i in 0..16 {
        dest[i] = a[i] ^ b[i];
    }
}

Most of you will at this point realize that calling this function is fine if dest, a and b are distinct entities. However, every now and then I need to call xor_block(&mut x, &x, &y), which obviously results in the borrow checker yelling at me, because we can't have a mutable and immutable reference to the same object in the same scope.
Currently I am using a work around by making a copy of x beforehand and then calling xor_block(&mut x, &x2, &y), though this is not an ideal solution, which is why I wanted to ask you all for some insight/help into how I can resolve this situation properly.

Also I am not willing to turn the mutable reference into a return value, because this function is called a lot; I don't want to allocate a new buffer every time it is called.

1 Like

How about defining a separate function fn xor_block2(dest: &mut [u8], b: &[u8])? playground

2 Likes

Well sure that works too but its also not very nice for readability if I use two functions that essentially do the same but are called differently.

Well, you can define this:

use std::cell::Cell;

fn xor_block(dest: &[Cell<u8>], a: &[Cell<u8>], b: &[Cell<u8>]) {
    for i in 0..16 {
        dest[i].set(a[i].get() ^ b[i].get());
    }
}

It's possible to convert from &mut [u8] -> &[Cell<u8>], so you would be able to call it in your case.

However, annoyingly you cannot convert &[u8] -> &[Cell<u8>], so you still cant use this with the original method. A slice thats like &[u8] but allows other references to the same memory to modify the data is somewhat missing in the language.

You could define your own like this. This is unsafe, but the unsafe code is correct and using this Slice type cannot lead to UB. (Changing the T: Copy to T: Clone would not be correct.)

For this particular use-case, I would probably just define the function twice. Having xor_block and xor_block_inplace isn't that bad.

8 Likes

It also might be fastest — the compiler can generate machine code that uses the no-aliasing guarantee for one, and is explicitly in-place for the other, whereas the Cell version has less info to work with and probably has to read one element at a time instead of getting auto-vectorized.

(Just speculation; haven't tested.)

6 Likes

Option is a choice too, but two separate functions are clearer for me.

fn xor_block3(dest: &mut [u8], a: Option<&[u8]>, b: &[u8]) {
    for i in 0..16 {
        dest[i] = a.unwrap_or(dest)[i] ^ b[i];
    }
}

playground

2 Likes

I guess I will go with the xor_block and xor_block_inplace option then.
Though I took a look at how RustCrypto solved this problem in their aes implementation and they seem to be using a custom struct called InOut, which essentially holds a mutable pointer out and an immutable pointer in, that points to the buffer we want to read and mutate. However this is clearly unsafe code and I don't know if I want to go this deep just for a naive implementation.

Their InOut struct is equivalent to a &[Cell<u8>] and Slice together using my Slice utility, except that they share a length field.

4 Likes
#[inline(never)]
#[no_mangle]
fn xor_block2(a: &[u8], b: &[u8]) -> [u8; 16] {
    let mut dest = [0u8; 16];
    for i in 0..16 {
        dest[i] = a[i] ^ b[i];
    }
    dest
}

#[inline(never)]
#[no_mangle]
fn test2() {
    let mut b = [0u8; 16];
    let c = [0u8; 16];
    
    for _ in 0..10 {
        b = xor_block2(&b, &c);
    }
}

This on godbolt looks to be pretty optimal to me (smaller assembly than the 3 arg version you listed). Granted, it's with a fixed array size (for you the 16 kernel implies as much). I think this is your best bet.. bench it and prove otherwise.. Note, in rust, returning a tuple or array is actually passing IN a mutable return value that is purposefully de-aliased from the input parameters.. So if you pass in something and return it again, the unwound outer loop is just going to towers-of-hanoi the shit out if. e.g. zero copies. better-so if inlined.

2 Likes

Oh well the compiler and its optimizations never fail to surprise me. Thank you very much for the insight!

One trick is to add a bounds check before your loops (e.g. use the assert macro). This can result in better optimization because checking it inside the loop has issues - you're still partially modifying the array even if you panic.

2 Likes

I think returning an array is probably a better solution here, but it's interesting to note that you don't actually need both immutable and mutable access to any element at the same time. Since you're going in order and reading a Copy value before you write back to the memory location, you can actually set this up to work with a trait

code
fn xor_block<T: ValueAndMut>(mut a: T, b: &[u8]) {
    for i in 0..16 {
        let (out, a) = a.next();
        *out = a ^ b[i];
    }
}

pub trait ValueAndMut {
    fn next(&mut self) -> (&mut u8, u8);
}

pub struct Unique<'d, 's> {
    dest: &'d mut [u8],
    src: &'s [u8],
    index: usize,
}

impl<'d, 's> ValueAndMut for Unique<'d, 's> {
    fn next(&mut self) -> (&mut u8, u8) {
        let idx = self.index;
        self.index += 1;

        (&mut self.dest[idx], self.src[idx])
    }
}

pub struct Aliased<'a> {
    data: &'a mut [u8],
    index: usize,
}

impl<'a> ValueAndMut for Aliased<'a> {
    fn next(&mut self) -> (&mut u8, u8) {
        let idx = self.index;
        self.index += 1;

        let value = self.data[idx];
        (&mut self.data[idx], value)
    }
}

fn main() {
    let mut out = [0; 16];
    let mut x = [0; 16];
    let y = [0; 16];

    xor_block(
        Unique {
            dest: &mut out,
            src: &x,
            index: 0,
        },
        &y,
    );

    xor_block(
        Aliased {
            data: &mut x,
            index: 0,
        },
        &y,
    );
}

Playground

Note: I gave ValueAndMut an Iterator-like interface to avoid being able to accidentally go backwards and get a previously mutated value. You could probably actually implement Iterator by using std's mutable slice iteration.

I probably wouldn't recommend actually using this, but it's a good exercise in getting the compiler to write those two different implementations for you!

(I have no idea how the codegen compares here, I wouldn't be surprised if this version ended up being slower though)

EDIT: Here's a mildly cursed iterator version

Playground

code
use rand::{thread_rng, Fill};

fn xor_block<'a, T: Iterator<Item = (u8, u8, &'a mut u8)>>(iter: T) {
    for (a, b, out) in iter {
        *out = a ^ b;
    }
}

fn main() {
    let mut rng = thread_rng();
    let mut out = [0u8; 16];
    let mut x = [0u8; 16];
    let mut y = [0u8; 16];

    x.try_fill(&mut rng).unwrap();
    y.try_fill(&mut rng).unwrap();

    xor_block(
        out.iter_mut()
            .zip(&x)
            .zip(&y)
            .map(|((out, x), y)| (*x, *y, out)),
    );

    xor_block(x.iter_mut().zip(&y).map(|(x, y)| (*x, *y, x)));

    assert_eq!(out, x);
}
2 Likes

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.