Please help me let me keep my allocation

A small nonsense function (minified from my actual problem): keep the even numbers in the first vector and the odds in the second.

For whatever reason, we don't want to do this inplace. The function uses the tmp Vec for this and for some reason, it stores references to the elements instead copying the values.
However, one optimization is desired: the allocation for tmp should be reused for the second input.

The code below does not pass the borrowchecker and I know why (to the checker, tmp is not guaranteed to be empty and would contain empty references). I could do tmp = Vec::new() but this would give me an empty vector.

Is there some way that I can use the same allocation for the second operation?

Code:

fn foo(input1: &mut Vec<i32>, input2: &mut Vec<i32>) {

    let mut tmp = Vec::<&i32>::new();

    tmp.extend(input1.iter().filter(|i| *i % 2 == 0));
    let output1 = tmp.drain(..).copied().collect();
    *input1 = output1;

    tmp.extend(input2.iter().filter(|i| *i % 2 == 1));
    let output2 = tmp.drain(..).copied().collect();
    *input2 = output2;   
}

playground

fn foo(input1: &mut Vec<i32>, input2: &mut Vec<i32>) {
    input1.retain(|i| *i % 2 == 0);
    input2.retain(|i| *i % 2 == 1);
}

The problem is minified from the original problem. Point is: I have a tmp-vec that contains borrowed items and I would like to reuse its allocation to store references to another location after the first computation is finished

What about

std::mem::replace(input2,tmp);

Edit:
Never mind, just saw that the types are different. Didn't notice the reference in tmp

I don't know if this would work in your real code, but your minimized version will compile if you re-order it to mutate the inputs after you are done iterating through both of them:

let mut tmp = Vec::<&i32>::new();

tmp.extend(input1.iter().filter(|i| *i % 2 == 0));
let output1 = tmp.drain(..).copied().collect();

tmp.extend(input2.iter().filter(|i| *i % 2 == 1));
let output2 = tmp.drain(..).copied().collect();

*input1 = output1;
*input2 = output2;
2 Likes

Wouldn't just tmp.clear() do what you want?

This has no effect, since the issue is at the type-level and the borrow checker has no idea what the clear function does

How about a bit of unsafe code?

fn reuse_vec<'a, 'b, T>(mut v: Vec<&'a T>) -> Vec<&'b T> {
    v.clear();
    unsafe {
        std::mem::transmute(v)
    }
}

As T remains the same, the internal layout of the vec does too.
Clearing it ensures that nothing inside changes lifetimes.

4 Likes

I was thinking about something very similar. A lifetime-extension on an empty-vec-of-references would do the trick.
Is there something in the stdlib or any other major lib that I could use instead of writing unsafe code directly on my own?

1 Like

This is also unsafe, but instead of messing with lifetimes, you just copy through pointers. I put it into a function which I'm not sure if thats desirable or not. I tried to make it generic but its sloppy because I don't know well enough how to make generics work with function passing.


fn replace_filter<F>(tmp: &mut Vec<*const i32>, input: &mut Vec<i32>, mut f: F) where F: 'static + FnMut(&i32) -> bool {
    tmp.extend(input.iter().filter(|i| f(i)).map(|i| i as *const i32));
    let output = tmp.drain(..).map(|i| unsafe{ i.read() }).collect();
    *input = output;
}

fn foo(input1: &mut Vec<i32>, input2: &mut Vec<i32>) {

    let mut tmp = Vec::<*const i32>::new();

    replace_filter(&mut tmp, input1, |i| *i % 2 == 0);
    replace_filter(&mut tmp, input2, |i| *i % 2 == 1);
}

Playground

@drewkett
I don't think getting rid of lifetimes is the answer to that…

I don't know.
But I don't think adding reuse_vec to a (unsafe_) utils module should be a problem.

Thanks, @s3bk. Didn't really appreciate what your function was doing until I read it again.

Is this transmute really safe? the nomicon says transmutes between non repr(c) types is UB.

1 Like

It is a transmute between the same type [edit: with different lifetimes]. (Different lifetimes don't change [edit: the memory layout of a] a type).
You could also call into_raw_parts() and from_raw_parts() if you want to make 200% sure.

I’m not sure this is an accurate statement. Different lifetimes don’t change the memory layout, but changing a lifetime with transmute is (in general) an unsafe operation. This compiles just fine and won’t immediately crash your program when called, but is still a really bad idea:

// DON’T DO THIS; WILDLY UNSAFE
fn make_static<'a, T>(x: &'a T)->&'static T {
    unsafe { std::mem::transmute(x) }
}
Aside

@s3bk: Looking back at your original solution, you obviously understad this; I wrote this more to defend against others misinterpreting your statement.

Sorry for being a bit sloppy. Edited for better precision.

1 Like

(A slight aside) I have gained a preference for making unsafe blocks "self-contained". In this case, the transmute is (asserted as) safe because of the clear() but, since the clear() isn't in the unsafe block, it might get deleted/moved/copied elsewhere which would still compile but would now be wildly unsound. Instead I would write this as:

fn reuse_vec<'a, 'b, T>(mut v: Vec<&'a T>) -> Vec<&'b T> {
  unsafe {
    v.clear();
    std::mem::transmute(v)
  }
}

Since then (hopefully) no-one would then consider those two lines as independent. I've noticed quite a few libraries falling foul of this, and I've recently been bitten at work by someone copying one unsafe block from a function when its safety depended on a second unsafe block earlier that they hadn't copied... took a while to track that down :slight_smile:

Whether this function is actually UB or not, I'll leave to someone else's better judgement...

To follow that strictly, the entire function would need to be unsafe, as it depents on the return type and v being a Vec

Yes, as far as I can see, the reuse_vec function is perfectly safe to use, and does not need to be marked unsafe. The return type being a vector is part of the signature, so callers cannot make it be something else.

Exactly what @alice said, the function signature being written in safe rust means that the function takes a valid/safe Vec of references and returns a valid/safe Vec of references. The inner unsafely doesn't leak past the assumption that the input was valid.

Yes, you could create a Vec with some other unsafe code (e.g. let v = from_raw_parts(std::ptr::null(), 10, 10)) but that's unsafe and you can't (read, shouldn't) leave unsafe blocks without ensuring all live objects are safe. It's UB to call this function with an invalid Vec, in the same way that it's UB to call any other function on an invalid anything.