I've studied the resulting asm of a few alternative solutions:
#![allow(private_no_mangle_fns)]

// fold + extend from a two-element array
#[no_mangle]
#[inline(never)]
fn flatten1(data: &[(i32, i32)]) -> Vec<i32> {
    data
        .iter()
        .fold(Vec::with_capacity(data.len() * 2),
              |mut acc, p| { acc.extend(&[p.0, p.1]); acc })
}

// fold + two pushes
#[no_mangle]
#[inline(never)]
fn flatten2(data: &[(i32, i32)]) -> Vec<i32> {
    data
        .iter()
        .fold(Vec::with_capacity(data.len() * 2),
              |mut acc, p| { acc.push(p.0); acc.push(p.1); acc })
}

// plain loop + two pushes
#[no_mangle]
#[inline(never)]
fn flatten3(data: &[(i32, i32)]) -> Vec<i32> {
    let mut result = Vec::with_capacity(data.len() * 2);
    for &(a, b) in data {
        result.push(a);
        result.push(b);
    }
    result
}

// zero-filled Vec + indexed writes
#[no_mangle]
#[inline(never)]
fn flatten4(data: &[(i32, i32)]) -> Vec<i32> {
    let mut result = vec![0; data.len() * 2];
    for (i, &(a, b)) in data.iter().enumerate() {
        result[i * 2 + 0] = a;
        result[i * 2 + 1] = b;
    }
    result
}

// uninitialized capacity + unchecked writes
#[no_mangle]
#[inline(never)]
fn flatten5(data: &[(i32, i32)]) -> Vec<i32> {
    let len = data.len() * 2;
    let mut result = Vec::with_capacity(len);
    unsafe {
        // Note: these writes happen while result.len() is still 0;
        // they rely on the capacity reserved above.
        for (i, &(a, b)) in data.iter().enumerate() {
            *result.get_unchecked_mut(i * 2) = a;
            *result.get_unchecked_mut(i * 2 + 1) = b;
        }
        result.set_len(len);
    }
    result
}

// copy the pairs, then reinterpret the Vec
#[no_mangle]
#[inline(never)]
fn flatten6(data: &[(i32, i32)]) -> Vec<i32> {
    let mut result = data.to_vec();
    unsafe {
        // Note: this relies on the memory layout of (i32, i32) and of Vec,
        // neither of which Rust guarantees.
        result.set_len(data.len() * 2);
        std::mem::transmute(result)
    }
}

fn main() {
    for flatten in &[flatten1, flatten2, flatten3,
                     flatten4, flatten5, flatten6] {
        println!("{:?}", flatten(&[(1,2), (3,4), (5,6)]));
        println!("{:?}", flatten(&[(1,2), (3,4)]));
    }
}
For a system programmer, flatten6 is probably the most reasonable solution: it's short and fast (it's equivalent to a malloc + memcpy). It is unsafe, but the types are inferred, which avoids some bugs.
For its signature, though, I'd like to write something like this (it's allowed in the D language), which is less bug-prone:
fn flatten6(data: &[(i32, i32)]) -> Vec<typeof(data[0].0)> {
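Rust has no typeof in signatures, but a generic parameter gets part of the same benefit: the element type is written once and shared by the input and the output. A minimal sketch, assuming a hypothetical name flatten_generic and a safe push-based body (not the transmute of flatten6):

```rust
// Hypothetical generic variant: T appears once in the signature, so the
// input element type and the output element type cannot drift apart.
fn flatten_generic<T: Copy>(data: &[(T, T)]) -> Vec<T> {
    let mut result = Vec::with_capacity(data.len() * 2);
    for &(a, b) in data {
        result.push(a);
        result.push(b);
    }
    result
}

fn main() {
    // The same function works for i32 and f64 pairs.
    let ints = flatten_generic(&[(1i32, 2), (3, 4)]);
    let floats = flatten_generic(&[(1.5f64, 2.5)]);
    println!("{:?} {:?}", ints, floats);
}
```

This only ties the types together; it says nothing about whether the generated asm matches any of the versions above.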
flatten4 looks efficient, but the compiler fails to see that a slice of N pairs fits inside an array of length 2*N, so both array accesses have bound tests.
flatten5 avoids that problem, but it's a bad idea to write unsafe code like that just for i32 data.
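A possible safe middle ground between flatten4 and flatten5 is to walk the output in two-element chunks, so the relation between the N pairs and the 2*N slots is carried by the iterators rather than by index arithmetic. This is only a sketch: flatten_chunks is a hypothetical name, chunks_exact_mut needs Rust 1.31 or later, and I have not checked the resulting asm, so whether the bound tests actually disappear is an assumption.

```rust
// Hypothetical safe variant of flatten4 using fixed-size chunks.
fn flatten_chunks(data: &[(i32, i32)]) -> Vec<i32> {
    let mut result = vec![0; data.len() * 2];
    // Each chunk is a mutable slice of exactly two elements,
    // paired with one input tuple by zip.
    for (chunk, &(a, b)) in result.chunks_exact_mut(2).zip(data) {
        chunk[0] = a;
        chunk[1] = b;
    }
    result
}

fn main() {
    println!("{:?}", flatten_chunks(&[(1, 2), (3, 4), (5, 6)]));
}
```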
The asm of flatten1 seems a bit worse than that of flatten2, which has two simpler pushes.
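For comparison with flatten6, the same "one allocation plus one copy" idea can be sketched without transmuting the Vec, if the input type is allowed to change. The Pair struct and flatten_pairs below are hypothetical and not part of the code above; #[repr(C)] is used because it fixes the field order and layout, which a plain (i32, i32) tuple does not guarantee.

```rust
use std::slice;

// Hypothetical pair type with a defined layout: two i32 fields, no padding.
#[repr(C)]
#[derive(Clone, Copy)]
struct Pair {
    a: i32,
    b: i32,
}

fn flatten_pairs(data: &[Pair]) -> Vec<i32> {
    // SAFETY: Pair is repr(C) with exactly two i32 fields, so data.len()
    // pairs occupy data.len() * 2 contiguous, properly aligned i32 values.
    let flat: &[i32] = unsafe {
        slice::from_raw_parts(data.as_ptr() as *const i32, data.len() * 2)
    };
    // Copy the borrowed view into a freshly allocated Vec.
    flat.to_vec()
}

fn main() {
    let data = [Pair { a: 1, b: 2 }, Pair { a: 3, b: 4 }];
    println!("{:?}", flatten_pairs(&data));
}
```

The unsafe part is limited to reinterpreting a borrowed slice; the Vec itself is created by to_vec with its own correct length and capacity.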