Efficiently replacing value with successor in mutable borrowed struct


#1

I am replacing a value in a mutable borrowed struct like this:

extern crate num;

use std::ops::Add;
use std::mem::replace;
use std::fmt::{Display, Formatter, Result};
use self::num::One;

struct Test(T);

impl Display for Test {
    fn fmt(&self, f: &mut Formatter) -> Result {
        write!(f, "{}", self.0)
    }
}

fn inc<T : Add + One + Default>(t : &mut Test) {
    let y = replace(&mut t.0, T::default());
    replace(&mut t.0, y + T::one());
}

fn main() {
    let mut t = Test(0i32);
    inc(&mut t);
    println!("{}", t);
}

However replacing with the default constructed value and then replacing again with the sum seems inefficient, especially for types that are expensive to construct. These are the key lines:

let y = replace(&mut t.0, T::default());
replace(&mut t.0, y + T::one());

What is the efficient way to do this?


#2

Stabilized in the next release is AddAssign, which lets you add in-place like this: http://is.gd/vKIuzn

It’s usable with beta and nightly right now. Granted, this means third-party types will need to implement it if they don’t already, but they should anyways.


#3

Add was just an example. The type may not be integer, and the operation may not be increment. This as just the easiest way to show the example.


#4

You could change the type to use Option<T> internally, since it’s usually pretty cheap to construct a None (pathological cases aside).

Fundamentally, you have to swap something in, just in case your program panics between pulling the value out and putting the new one back.


#5

Is there any way to use an unsafe block to wrap the operation?


#6

unsafe cannot magic away the need to have something valid in t.0. It is possible to unsafely work around this, but doing so would be a really bad idea, because you cannot possibly know whether or not T::add can panic or not.

If you really, desperately need to do this (as in, it’s actually a problem, not just a hypothetical one), you can define an unsafe trait whose contract specifically states that it has to implement the add operation without the possibility of panicking. If you’re going to go down that road, you need to read and internalise The Rustonomicon first so you understand what you’re doing, and what invariants the compiler requires you to uphold.


#7

How is AddAssign implemented?


#8

That’s up to the individual types implementing it.


#9

But how does it avoid the move problem? Edit (answering my own question, because it takes the LHS as a mutable reference).

What I need is a ‘MutateAssign’ that takes a closure that does the mutation?


#10

AddAssign is not a single thing; it’s just an interface description. Every type has to implement it differently. Any type implementing it efficiently will just be doing direct mutation on the values. If AddAssign was stable, you’d just use that instead and it would work fine.

Your fundamental problem is that you’re trying to perform an operation that requires you to move the value out of its old storage. There is no safe workaround for this that doesn’t involve copying or constructing a temporary value, and the unsafe solution is even more complicated.

If we relax the question, the simplest solution (aside from ignoring the problem since you’re only working with i32, which is trivial to copy) is to require an implementation of some Increment trait for every type.


#11

Perhaps if I introduce some of the real problem, it might help come to a solution.

I have a stream iterator, that is for a single-pass stream, that is iterators cannot be copied so you cannot hold a reference to anything but the head of the stream. To model this I have a a non-copyable, non-cloneable iterator, with a successor function that looks like this:

pub trait Iterator : PartialEq {
    type distance_type : Integer;
    fn successor(self) -> Self where Self : Sized;
}

The important property of the iterator is that there can only be one stream head in the buffer, so any copy of an iterator would become invalid when it is advanced. Because you cannot copy or clone the iterator, and because successor consumes the iterator, you are guaranteed to only have one stream head.

I now have a mutable struct that contains various information about the stream state and is uncopyable and unclonable (because of associated resources), which is passed by mutable reference, and I want to increment the iterator contained in the struct (to advance the stream).

Advancing the iterator is performance critical, hence although it can be done using replace, a way to do it as fast as C/C++ is the target.

Edit: I think I agree that the simplest solution is to have some kind of Increment trait.


#12

I guess I am still puzzled by this question: Why can you do this:

let mut x = MyIterator::new();
x = x.successor;

But not this (bearing in mind that x is on the stack above):

struct Test(T);
let mut t = Test(MyIterator::new());
t.0 = t.0.successor

In both cases we are replacing the iterator in storage, in the first case it happens that ‘x’ is on the stack, and in t.0 is in a structure. However the stack is also clearly a memory structure, why should the two cases behave differently.


#13

Wouldn’t something like this work?

pub trait Iterator : PartialEq {
    type distance_type : Integer;
    fn advance(&mut self);
}

That requires every reference to the stream/iterator to be released before advance can be called, so it should have the same effect (unless I’m forgetting something).


#14

I think that will work, but I need the other definition as well, so I end up with two functions in the trait instead of one. Thats okay, and I can use that.

My question is more semantic now, why is it okay to replace the value of ‘x’ (when ‘x’ lives in a slice of stack) but not replace the value of t.0 (when 0, lives in t, which is also a slice of stack). It would appear to me any argument about replacing ‘t.0’ being unsafe would also apply to ‘x’, and yet we allow ‘x’ to be replaced. Why?


#15

Hmm… I see. Well, you can at least default implement successor, based on advance:

pub trait Iterator : PartialEq {
    type distance_type : Integer;
    fn advance(&mut self);
    fn successor(mut self) -> Self where Self: Sized {
        self.advance();
        self
    }
}

I think it’s because when replacing x, you are replacing the whole value, while when replacing t.0, you are just replacing a part of the value. This could become a problem if the replacement fails, because you may end up dropping the replaced value twice. Once in the failing replacement procedure, and once again when dropping t if the value is still left in memory.

This doesn’t apply to every situation, but I guess it’s better to be safe than sorry. I may also have gotten this wrong, so please correct me if that’s the case.


#16

Yes, thats a good idea I hadn’t thought of, successor defined in terms of increment works fine, and is probably the way I am going to do it. However, I have come up with some other options not mentioned yet:

unsafe {
    let y = mem::replace(&mut x.0, mem::uninitialized());
    mem::replace(&mut x.0, y.successor());
}

Or

unsafe {
    let mut y : I = mem::uninitialized();
    mem::swap(&mut x.0, &mut y);
    mem::swap(&mut x.0, &mut y.successor());
}

But I think defining successor in terms of increment/advance is the best option as it doesn’t include any unsafe operations.


#17

Both of those are sketchy. They will end up freeing garbage data from uninitialized. The first one returns it from replace, so it will be immediately cleaned up in that statement. The second one ends up putting into a temporary value in the last statement, which is then dropped. Rust can’t know that it’s an uninitialized value, since it’s masked as a valid value, and will treat it as such.


#18

Good catch with the freeing garbage, here’s a solution, and I have also cleaned up the function into a generic one. I think I prefer the replace version, so I have only updated that one, but forget could also be used with swap.

pub fn replace_using(x : &mut V, mut p : P) where P : FnMut(V) -> V {
    unsafe {
        let y = p(mem::replace(x, mem::uninitialized()));
        mem::forget(mem::replace(x, y));
    }
}

I can now implement the function I wanted like this:

pub fn increment(x : &mut It) {
    replace_using(&mut x.0, |y| y.successor());
}

I wonder whether the machine code generated is optimal?


#19

Note that this is exactly what I warned you about: you cannot ensure p does not panic when invoked, which can lead to absolutely anything happening if something goes wrong. This is not safe.


#20

Can we not ask for a guarantee from ‘p’ that it is exception free. Normally ‘swap’ is considered a safe operation that cannot panic?

In any case if it does panic, what can we do about it? Its game over whether the value in the struct is undefined or not?