To my knowledge, Box<dyn Fn()> is a fat pointer, and each time it is called the function has to be looked up through its vtable. So calling a Box<dyn Fn()> should be slower than calling a plain fn().
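(To show what I mean by a fat pointer, here is a small sketch; the concrete sizes in the comments are what I would expect on a 64-bit target, so treat them as an assumption:)

use std::mem::size_of;

fn main() {
    // A Box<dyn Fn()> carries a data pointer plus a vtable pointer: two words.
    println!("{}", size_of::<Box<dyn Fn()>>()); // expected 16 on 64-bit
    // A plain fn() is just a single code pointer: one word.
    println!("{}", size_of::<fn()>()); // expected 8 on 64-bit
}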
But in this test, I found that the two approaches have the same performance. Here is the code:
pub struct ForTest(pub Vec<f32>);

pub trait DynMethod {
    fn iter_func_dyn<'a>(&'a self) -> Box<dyn Fn(usize) -> bool + 'a>;
    fn iter_all_dyn(&self) -> Vec<bool>;
}

pub trait FuncMethod {
    fn iter_func_func(&self, i: usize) -> bool;
    fn iter_all_func(&self) -> Vec<bool>;
}

impl DynMethod for ForTest {
    // Returns a boxed closure that borrows the data.
    fn iter_func_dyn<'a>(&'a self) -> Box<dyn Fn(usize) -> bool + 'a> {
        let data = &self.0;
        Box::new(move |i| data[i] > 10.)
    }

    // Calls the boxed closure once per element.
    fn iter_all_dyn(&self) -> Vec<bool> {
        let f = self.iter_func_dyn();
        let mut res = vec![];
        for i in 0..self.0.len() {
            res.push(f(i));
        }
        res
    }
}

impl FuncMethod for ForTest {
    // Plain method call, no trait object involved.
    fn iter_func_func(&self, i: usize) -> bool {
        self.0[i] > 10.
    }

    fn iter_all_func(&self) -> Vec<bool> {
        let mut res = vec![];
        for i in 0..self.0.len() {
            res.push(self.iter_func_func(i));
        }
        res
    }
}

// Times the enclosed expression and prints the elapsed time.
macro_rules! t {
    ($($tt:tt)+) => {
        let timer = std::time::Instant::now();
        $($tt)+;
        println!("{:?}", timer.elapsed());
    }
}

fn main() {
    let data = vec![11f32; 1_000_000];
    let for_test = ForTest(data);
    t!(for_test.iter_all_dyn());
    t!(for_test.iter_all_func());
}
The output:

34.638314ms
30.731192ms
I assume the compiler has done something to optimize the code here. If that is true, in what situations can the compiler do this, and in what situations can it not?
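For example, is something like the following the kind of case where the optimization cannot happen? (pick and its closures are just a made-up illustration, not code from my benchmark.)

// A made-up example of the situation I have in mind: the concrete closure
// behind the Box is only chosen at runtime, so I assume the compiler cannot
// inline the call and has to go through the vtable.
fn pick(flag: bool) -> Box<dyn Fn(usize) -> bool> {
    if flag {
        Box::new(|i| i > 10)
    } else {
        Box::new(|i| i % 2 == 0)
    }
}

fn main() {
    let flag = std::env::args().count() > 1; // decided at runtime
    let f = pick(flag);
    println!("{}", f(42));
}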