Why doesn't clone get optimised out here?

use rand::Rng;
use std::time::Instant;

fn main() {
    let mut rng = rand::rng();

    // One million random values in 0..2^32, so the i64 sum cannot overflow.
    let a: Vec<i64> = (0..1_000_000)
        .map(|_| rng.random_range(0..2i64.pow(32)))
        .collect();

    // Section 1: sum through references into the original buffer.
    let now = Instant::now();
    let s1: i64 = a.iter().sum();
    let t1 = now.elapsed();
    println!("s1: {s1:?}, t1: {t1:?}");

    // Section 2: clone the whole Vec, then sum the copy.
    let now = Instant::now();
    // Why does this clone not get optimised out?
    let s2: i64 = a.clone().iter().sum();
    let t2 = now.elapsed();
    println!("s2: {s2:?}, t2: {t2:?}");

    // Section 3: clone each element while iterating over the original buffer.
    let now = Instant::now();
    let s3: i64 = a.iter().cloned().sum();
    let t3 = now.elapsed();
    println!("s3: {s3:?}, t3: {t3:?}");
}


Shouldn't the compiler be smart enough to optimize out the a.clone() in the second timing section above?
The timing output suggests that it clones the entire Vec and then sums the copy, rather than skipping the full clone the way section 3 does.

With no mutable refs taken, shouldn't this be a trivial optimization?

You know that a.clone() produces a value that is functionally identical to a for the way it is used here. The optimizer in LLVM does not do this kind of high-level reasoning, and it does not attribute any special meaning to .clone(). It sees a series of operations: a memory allocation, a copy of the elements into it, the construction of another Vec whose data pointer differs from a's, and an iterator that runs over that different pointer.

It would be possible for the optimizer to figure out that the allocation can be discarded, but not easy.
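To make that concrete, here is a rough sketch of the work a.clone().iter().sum() asks for, written out by hand next to the version that reads a directly. The function names sum_via_clone and sum_in_place are made up for illustration, and the first function only approximates what Vec::clone does rather than reproducing the standard library's actual implementation.

```rust
/// Approximately the work requested by `a.clone().iter().sum()`.
/// Hypothetical helper: a hand-written stand-in for `Vec::clone`, not std's real code.
fn sum_via_clone(a: &[i64]) -> i64 {
    // Heap allocation for a second buffer of the same length.
    let mut copy: Vec<i64> = Vec::with_capacity(a.len());
    // Copy every element into the new buffer.
    copy.extend_from_slice(a);
    // The iterator reads from the new buffer; `copy` is freed when it goes out of scope.
    copy.iter().sum()
}

/// What sections 1 and 3 of the question do: read the original buffer only.
/// Hypothetical helper name.
fn sum_in_place(a: &[i64]) -> i64 {
    a.iter().copied().sum()
}

fn main() {
    let a: Vec<i64> = (0..1_000_000).collect();

    // A clone owns a separate allocation, so the two data pointers differ.
    let cloned = a.clone();
    assert_ne!(a.as_ptr(), cloned.as_ptr());

    // Both paths compute the same sum; only the amount of work differs.
    assert_eq!(sum_via_clone(&a), sum_in_place(&a));
    println!("sums match: {}", sum_in_place(&a));
}
```

For the optimizer to turn the first function into the second, it would have to prove that the copied buffer always holds the same values as a at every read and that nothing else observes the allocation. If you don't need an owned copy, writing a.iter().sum() or a.iter().copied().sum() avoids relying on that.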

