Convert Vec<T> to Vec<(T, T)>

lambdakappatheta · June 3, 2021, 2:40pm

Assuming that the last T is to be discarded if the length of the original Vec is odd, can this conversion be done (internally) by simply dividing the length and capacity of the original Vec by 2 (integer division)?

alice · June 3, 2021, 2:43pm

No, because:

The Vec::from_raw_parts function requires that "T needs to have the same size and alignment as what ptr was allocated with." Your element would be twice the size.
The layout of tuples is unspecified.

One thing that is possible is to convert between &[T] and &[[T; 2]].

Yandros · June 3, 2021, 3:14pm

You can do:

fn into_pairs<T> (v: Vec<T>)
  -> Vec<[T; 2]>
{
    let b: Box<[T]> = v.into_boxed_slice();
    let len = b.len();
    assert!(len % 2 == 0);
    let ptr = Box::into_raw(b) as *mut T;
    let b: Box<[[T; 2]]> = unsafe {
        Box::from_raw(::core::ptr::slice_from_raw_parts_mut(
            ptr.cast::<[T; 2]>(),
            len / 2,
        ))
    };
    b.into()
}

But unless you are dealing with huge Vecs, you could also remain safe and use Itertools:

fn into_pairs<T> (v: Vec<T>)
  -> Vec<[T; 2]>
{
    use ::itertools::Itertools;

    v   .into_iter()
        .tuples()
        .map(|(a, b)| [a, b]) // comment this to get `Vec<(T, T)>`
        .collect::<Vec<_>>()
}

alice · June 3, 2021, 3:18pm

Huh, it's an interesting point that going through boxes have weaker safety requirements. I didn't realize that you could do that.

Yandros · June 3, 2021, 3:30pm

Yeah, and it's a bit silly, imho, since it may lead to reallocs just to avoid an UB that can't be exploited in practice. The only advantage of the current rules is that they are simpler than the following ones:

Hypothetical weaker rules that could end up being used

"Transmuting" (through .{into,from}_raw_parts()) a Vec<T> to a Vec<U> is well-defined iff:

[T; old_cap] and [U; new_cap] have the same Layout. That is,
- size_of::<T>() * old_cap == size_of::<U>() * new_cap
- align_of::<T>() == align_of::<U>()
Transmuting (with possible truncation) a
```
SequentiallyContiguous<
    [T; old_len], [MaybeUninit<T>; old_cap - old_len],
>
```
to a [U; new_len] is well-defined.
- where SequentiallyContiguous is defined as:
```
#[repr(C)]
struct SequentiallyContiguous<_0, _1>(_0, _1);
```
- When size_of::<T> * old_len == size_of::<U>() * new_len
  (which would be the case here with U = [T; 2], and new_len = old_len / 2),
  it simply means that transmuting a [T; old_len] to a [U; new_len] is well-defined (which it is in the example).

SkiFire13 · June 3, 2021, 5:20pm

I think you can avoid the potential reallocation in into_boxed_slice if you first transmute (with from_raw_parts) the Vec<T> into a Vec<MaybeUninit<T>> with len == capacity. And if into_boxed_slice guarantees no reallocations if len == capacity then you get that the actual rules for from_raw_parts can't be more restricting than the weaker rules you proposed.

pub fn into_pairs<T>(mut v: Vec<T>) -> Vec<[T; 2]> {
    use core::mem::MaybeUninit;
    
    assert!(v.len() % 2 == 0);
    assert!(v.capacity() % 2 == 0);
    
    let ptr = v.as_mut_ptr() as *mut MaybeUninit<T>;
    let len = v.len();
    let cap = v.capacity();
    core::mem::forget(v);
    
    let v: Vec<MaybeUninit<T>> = unsafe { Vec::from_raw_parts(ptr, cap, cap) };
    let b: Box<[MaybeUninit<T>]> = v.into_boxed_slice();
    
    let ptr = Box::into_raw(b) as *mut [MaybeUninit<T>; 2];
    let ptr = core::ptr::slice_from_raw_parts_mut(ptr, cap / 2);
    let b: Box<[[MaybeUninit<T>; 2]]> = unsafe { Box::from_raw(ptr) };
    
    let mut v: Vec<[MaybeUninit<T>; 2]> = b.into();
    let ptr = v.as_mut_ptr() as *mut [T; 2];
    core::mem::forget(v);
    
    let v: Vec<[T; 2]> = unsafe { Vec::from_raw_parts(ptr, len / 2, cap / 2) };
    v
}

lambdakappatheta · June 3, 2021, 8:07pm

Thank you!

Does that mean that the first solution runs in constant and the second in linear time? If so, could the second solution someday require only constant time?

SkiFire13 · June 3, 2021, 8:18pm

The first also takes linear time if the Vec is not full because otherwise into_boxed_slice has to reallocate

It would be hard, but somehow with in-place collection (which is already a thing, but it's a bit limited to types that have the same size and alignment) and LLVM optimizing the resulting loop to a noop

system · September 1, 2021, 8:18pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
Vec layout / transmute help	5	543	June 15, 2022
Least unsafe way to transmute Vec<T> to Vec<[T; N]> help	6	394	April 7, 2024
Conversion between Vec<[T; N]> and Vec<T>? help	25	971	July 29, 2021
Converting Vec<usize> into Vec<T> help	8	3097	February 5, 2020
Current META converting Vec<U> -> Vec<T> where	7	1259	March 28, 2023

Convert Vec<T> to Vec<(T, T)>

Hypothetical weaker rules that could end up being used

Related Topics