How to “zip” two slices efficiently

bluss · August 17, 2015, 1:40pm

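For some reason, the float version just doesn't optimize like the integer version does: every f32 variant lands at roughly 1,340 ns/iter, while the i32 variants (apart from the default zip) drop to 300-380 ns/iter.

Is this a "-ffast-math" type of problem? Perhaps LLVM doesn't want to vectorize the loop because reordering the additions may change the floating-point result slightly(?)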

Edit: It's a -ffast-math type of problem, documented here
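
To make the reasoning concrete, here is a minimal sketch (not code from the benchmark). It shows that f32 addition is not associative, which is why the compiler may not regroup the sums on its own, and one way to write the regrouping by hand with independent accumulators. The names `sums_by_grouping`, `dot_naive` and `dot_unrolled` and the choice of 8 lanes are assumptions made for illustration. Whether the second version actually vectorizes depends on the compiler and target, and it can return a slightly different result than the naive loop.

```rust
/// Returns the same three numbers summed with two different groupings.
/// In f32 the results differ, so reordering a sum is not a "free" rewrite.
fn sums_by_grouping() -> (f32, f32) {
    let (a, b, c) = (1.0e8f32, -1.0e8f32, 1.0f32);
    ((a + b) + c, a + (b + c))
}

/// Straightforward dot product: a single accumulator, summed strictly
/// left to right. Each addition depends on the previous one.
fn dot_naive(xs: &[f32], ys: &[f32]) -> f32 {
    xs.iter().zip(ys).map(|(x, y)| x * y).sum()
}

/// Hypothetical hand-reassociated version: eight independent accumulators,
/// combined at the end. This changes the order of the additions explicitly,
/// which the optimizer is otherwise not allowed to do for floats.
fn dot_unrolled(xs: &[f32], ys: &[f32]) -> f32 {
    // Trim both slices to the common length, as zip would.
    let n = xs.len().min(ys.len());
    let (xs, ys) = (&xs[..n], &ys[..n]);

    let mut acc = [0.0f32; 8];
    let xc = xs.chunks_exact(8);
    let yc = ys.chunks_exact(8);
    // Elements left over after the last full chunk of 8.
    let (xr, yr) = (xc.remainder(), yc.remainder());

    for (cx, cy) in xc.zip(yc) {
        for i in 0..8 {
            acc[i] += cx[i] * cy[i];
        }
    }

    // Combine the partial sums, then handle the tail.
    let mut sum: f32 = acc.iter().sum();
    for (x, y) in xr.iter().zip(yr) {
        sum += x * y;
    }
    sum
}
```

With inputs whose products and partial sums are exactly representable, both functions agree; in general they may differ in the last bits, which is exactly the trade-off "-ffast-math" style options make implicitly.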

test zipdot_f32_checked_counted_loop   ... bench:       1,347 ns/iter (+/- 664)
test zipdot_f32_default_zip            ... bench:       1,392 ns/iter (+/- 13)
test zipdot_f32_unchecked_counted_loop ... bench:       1,343 ns/iter (+/- 371)
test zipdot_f32_zipslices              ... bench:       1,342 ns/iter (+/- 466)
test zipdot_f32_ziptrusted             ... bench:       1,342 ns/iter (+/- 387)
test zipdot_i32_checked_counted_loop   ... bench:         380 ns/iter (+/- 113)
test zipdot_i32_default_zip            ... bench:       1,401 ns/iter (+/- 27)
test zipdot_i32_unchecked_counted_loop ... bench:         308 ns/iter (+/- 154)
test zipdot_i32_zipslices              ... bench:         380 ns/iter (+/- 134)
test zipdot_i32_ziptrusted             ... bench:         301 ns/iter (+/- 148)
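
The benchmark sources are not included in this post, so the following is only a guess at what three of the i32 variants could look like, based on their names. The function names and bodies are assumptions, not the original code; the `zipslices` and `ziptrusted` variants are left out because their implementations are not shown here.

```rust
/// Assumed shape of the "default_zip" variant: the standard iterator zip.
fn dot_default_zip(xs: &[i32], ys: &[i32]) -> i32 {
    let mut sum = 0;
    for (x, y) in xs.iter().zip(ys) {
        sum += x * y;
    }
    sum
}

/// Assumed shape of the "checked_counted_loop" variant: index-based loop
/// with ordinary bounds-checked indexing. Re-slicing both inputs to the
/// common length up front gives the optimizer a chance to drop the checks.
fn dot_checked_counted_loop(xs: &[i32], ys: &[i32]) -> i32 {
    let len = xs.len().min(ys.len());
    let (xs, ys) = (&xs[..len], &ys[..len]);
    let mut sum = 0;
    for i in 0..len {
        sum += xs[i] * ys[i];
    }
    sum
}

/// Assumed shape of the "unchecked_counted_loop" variant: the same loop
/// with bounds checks removed by hand.
fn dot_unchecked_counted_loop(xs: &[i32], ys: &[i32]) -> i32 {
    let len = xs.len().min(ys.len());
    let mut sum = 0;
    for i in 0..len {
        // SAFETY: i < len, and len is at most the length of either slice.
        unsafe {
            sum += *xs.get_unchecked(i) * *ys.get_unchecked(i);
        }
    }
    sum
}
```

All three compute the same value and differ only in how the element access is expressed, which is what the timings above compare.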
