Convert Vec<T> to Vec<(T, T)>

Assuming that the last T is to be discarded if the length of the original Vec is odd, can this conversion be done (internally) by simply dividing the length and capacity of the original Vec by 2 (integer division)?

No, because:

  1. The Vec::from_raw_parts function requires that "T needs to have the same size and alignment as what ptr was allocated with." Your element would be twice the size.
  2. The layout of tuples is unspecified.

One thing that is possible is to convert between &[T] and &[[T; 2]].

1 Like

You can do:

fn into_pairs<T> (v: Vec<T>)
  -> Vec<[T; 2]>
{
    let b: Box<[T]> = v.into_boxed_slice();
    let len = b.len();
    assert!(len % 2 == 0);
    let ptr = Box::into_raw(b) as *mut T;
    let b: Box<[[T; 2]]> = unsafe {
        Box::from_raw(::core::ptr::slice_from_raw_parts_mut(
            ptr.cast::<[T; 2]>(),
            len / 2,
        ))
    };
    b.into()
}

But unless you are dealing with huge Vecs, you could also remain safe and use Itertools:

fn into_pairs<T> (v: Vec<T>)
  -> Vec<[T; 2]>
{
    use ::itertools::Itertools;

    v   .into_iter()
        .tuples()
        .map(|(a, b)| [a, b]) // comment this to get `Vec<(T, T)>`
        .collect::<Vec<_>>()
}
2 Likes

Huh, it's an interesting point that going through boxes have weaker safety requirements. I didn't realize that you could do that.

Yeah, and it's a bit silly, imho, since it may lead to reallocs just to avoid an UB that can't be exploited in practice. The only advantage of the current rules is that they are simpler than the following ones:

Hypothetical weaker rules that could end up being used

"Transmuting" (through .{into,from}_raw_parts()) a Vec<T> to a Vec<U> is well-defined iff:

  • [T; old_cap] and [U; new_cap] have the same Layout. That is,

    • size_of::<T>() * old_cap == size_of::<U>() * new_cap
    • align_of::<T>() == align_of::<U>()
  • Transmuting (with possible truncation) a

    SequentiallyContiguous<
        [T; old_len], [MaybeUninit<T>; old_cap - old_len],
    >
    

    to a [U; new_len] is well-defined.

    • where SequentiallyContiguous is defined as:

      #[repr(C)]
      struct SequentiallyContiguous<_0, _1>(_0, _1);
      
    • When size_of::<T> * old_len == size_of::<U>() * new_len
      (which would be the case here with U = [T; 2], and new_len = old_len / 2),
      it simply means that transmuting a [T; old_len] to a [U; new_len] is well-defined (which it is in the example).

2 Likes

I think you can avoid the potential reallocation in into_boxed_slice if you first transmute (with from_raw_parts) the Vec<T> into a Vec<MaybeUninit<T>> with len == capacity. And if into_boxed_slice guarantees no reallocations if len == capacity then you get that the actual rules for from_raw_parts can't be more restricting than the weaker rules you proposed.

pub fn into_pairs<T>(mut v: Vec<T>) -> Vec<[T; 2]> {
    use core::mem::MaybeUninit;
    
    assert!(v.len() % 2 == 0);
    assert!(v.capacity() % 2 == 0);
    
    let ptr = v.as_mut_ptr() as *mut MaybeUninit<T>;
    let len = v.len();
    let cap = v.capacity();
    core::mem::forget(v);
    
    let v: Vec<MaybeUninit<T>> = unsafe { Vec::from_raw_parts(ptr, cap, cap) };
    let b: Box<[MaybeUninit<T>]> = v.into_boxed_slice();
    
    let ptr = Box::into_raw(b) as *mut [MaybeUninit<T>; 2];
    let ptr = core::ptr::slice_from_raw_parts_mut(ptr, cap / 2);
    let b: Box<[[MaybeUninit<T>; 2]]> = unsafe { Box::from_raw(ptr) };
    
    let mut v: Vec<[MaybeUninit<T>; 2]> = b.into();
    let ptr = v.as_mut_ptr() as *mut [T; 2];
    core::mem::forget(v);
    
    let v: Vec<[T; 2]> = unsafe { Vec::from_raw_parts(ptr, len / 2, cap / 2) };
    v
}
2 Likes

Thank you!

Does that mean that the first solution runs in constant and the second in linear time? If so, could the second solution someday require only constant time?

The first also takes linear time if the Vec is not full because otherwise into_boxed_slice has to reallocate

It would be hard, but somehow with in-place collection (which is already a thing, but it's a bit limited to types that have the same size and alignment) and LLVM optimizing the resulting loop to a noop