You can use iterators with `for` loops and avoid indexes entirely. Since you are working with chunks of `BLOCK_SIZE` elements, take a look at the `chunks` method of slices and its stricter variant `chunks_exact` (the `windows` method is also useful when you need overlapping groups of elements).

I have swapped the `for k in` and `for i in` loops so that the outermost loop keeps working on the same chunk of `ij` elements for all of its inner iterations. The function can then be rewritten like this:
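
To make the difference between the two slice methods concrete, here is a minimal sketch. The helper names `non_overlapping` and `overlapping` are made up for this illustration; only `chunks` and `windows` come from the standard library.

```rust
/// Splits `data` into non-overlapping chunks of `size` elements.
/// A shorter trailing chunk is kept; `size` must not be 0 (`chunks` panics on 0).
fn non_overlapping(data: &[i32], size: usize) -> Vec<Vec<i32>> {
    data.chunks(size).map(|c| c.to_vec()).collect()
}

/// Returns every overlapping window of `size` elements, each shifted by one.
/// `size` must not be 0 (`windows` panics on 0).
fn overlapping(data: &[i32], size: usize) -> Vec<Vec<i32>> {
    data.windows(size).map(|w| w.to_vec()).collect()
}

fn main() {
    let data = [1, 2, 3, 4, 5];
    println!("{:?}", non_overlapping(&data, 2)); // [[1, 2], [3, 4], [5]]
    println!("{:?}", overlapping(&data, 2)); // [[1, 2], [2, 3], [3, 4], [4, 5]]
}
```

`chunks_exact` behaves like `chunks` but never yields the shorter trailing chunk, which is why it suits data whose length is an exact multiple of the chunk size.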
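    use std::ops::Add;

    // `Default` and `Ord` are in the prelude, so they need no `use` line.
    pub fn array_sums_<T: Default + Copy + Add<Output = T> + Ord>(
        block_size: usize,
        ik: &[T],
        kj: &[T],
        ij: &mut [T],
    ) {
        // Nothing to do for an empty block.
        if block_size == 0 {
            return;
        }
        // All three slices must hold exactly block_size * block_size elements.
        if ik.len() != block_size * block_size || ik.len() != kj.len() || ik.len() != ij.len() {
            return;
        }
        // One row of `ij` paired with the matching row of `ik`.
        for (ij, ik) in ij.chunks_exact_mut(block_size).zip(ik.chunks_exact(block_size)) {
            // Each value of the `ik` row paired with one row of `kj`.
            for (&ik, kj) in ik.iter().zip(kj.chunks_exact(block_size)) {
                // Each cell of the `ij` row paired with one value of the `kj` row.
                for (ij, &kj) in ij.iter_mut().zip(kj.iter()) {
                    *ij = std::cmp::min(*ij, ik + kj);
                }
            }
        }
    }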
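
As a sanity check, the iterator version can be compared against an index-based one. The sketch below assumes the indexed loops look like `array_sums_indexed`, which is a hypothetical name and a guess at the shape of the original code, not a copy of it. The small 2x2 input is also made up.

```rust
use std::ops::Add;

// Assumed index-based baseline (hypothetical); loop order is i, k, j.
fn array_sums_indexed(block_size: usize, ik: &[i32], kj: &[i32], ij: &mut [i32]) {
    for i in 0..block_size {
        for k in 0..block_size {
            for j in 0..block_size {
                let sum = ik[i * block_size + k] + kj[k * block_size + j];
                let cell = &mut ij[i * block_size + j];
                *cell = (*cell).min(sum);
            }
        }
    }
}

// Iterator-based version, same logic as the function shown above.
pub fn array_sums_<T: Default + Copy + Add<Output = T> + Ord>(
    block_size: usize,
    ik: &[T],
    kj: &[T],
    ij: &mut [T],
) {
    if block_size == 0 {
        return;
    }
    if ik.len() != block_size * block_size || ik.len() != kj.len() || ik.len() != ij.len() {
        return;
    }
    for (ij, ik) in ij.chunks_exact_mut(block_size).zip(ik.chunks_exact(block_size)) {
        for (&ik, kj) in ik.iter().zip(kj.chunks_exact(block_size)) {
            for (ij, &kj) in ij.iter_mut().zip(kj.iter()) {
                *ij = std::cmp::min(*ij, ik + kj);
            }
        }
    }
}

fn main() {
    // 2x2 blocks stored row by row.
    let ik = [1, 2, 3, 4];
    let kj = [5, 6, 7, 8];
    let mut with_iterators = [10; 4];
    let mut with_indexes = [10; 4];

    array_sums_(2, &ik, &kj, &mut with_iterators);
    array_sums_indexed(2, &ik, &kj, &mut with_indexes);

    // Both versions should agree cell by cell.
    assert_eq!(with_iterators, with_indexes);
    println!("{:?}", with_iterators); // [6, 7, 8, 9]
}
```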
The first loop, `for (ij, ik) in ij.chunks_exact_mut(block_size).zip(ik.chunks_exact(block_size))`, divides `ij` and `ik` into chunks of the same size and zips those chunks together (the first chunk of `ij` is paired with the first chunk of `ik`, the second chunk of `ij` with the second chunk of `ik`, etc.). It produces an iterator of `block_size` pairs of chunks, because the length of each slice is `block_size` squared.
Example

For instance, if BLOCK_SIZE is equal to 4 and we have the following arrays, the first loop will generate the following iterator.

Input:

| | chunk 1 | chunk 2 | chunk 3 | chunk 4 |
| --- | --- | --- | --- | --- |
| ij | 1 2 3 4 | 5 6 7 8 | 9 10 11 12 | 13 14 15 16 |
| ik | 1 2 3 4 | 5 6 7 8 | 9 10 11 12 | 13 14 15 16 |
| kj | 1 2 3 4 | 5 6 7 8 | 9 10 11 12 | 13 14 15 16 |

Iterator:

The first loop iterates over 4 pairs of chunks (see below). The elements of each pair are the chunks themselves, not the values inside them: one element is a whole ik row and the other is the matching ij row.

| pair | ik | ij |
| --- | --- | --- |
| 1 | 1 2 3 4 | 1 2 3 4 |
| 2 | 5 6 7 8 | 5 6 7 8 |
| 3 | 9 10 11 12 | 9 10 11 12 |
| 4 | 13 14 15 16 | 13 14 15 16 |
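
The pairing done by the first loop can be printed with a small sketch. `row_pairs` is a hypothetical helper written for this illustration, and it uses `chunks_exact` instead of `chunks_exact_mut` so that the rows can be copied out and inspected.

```rust
/// Collects the (ij row, ik row) pairs that the first loop iterates over.
/// `block_size` must not be 0 (`chunks_exact` panics on 0).
fn row_pairs(block_size: usize, ij: &[i32], ik: &[i32]) -> Vec<(Vec<i32>, Vec<i32>)> {
    ij.chunks_exact(block_size)
        .zip(ik.chunks_exact(block_size))
        .map(|(ij_row, ik_row)| (ij_row.to_vec(), ik_row.to_vec()))
        .collect()
}

fn main() {
    // Same sample data as in the example: the values 1 to 16.
    let ij: Vec<i32> = (1..=16).collect();
    let ik: Vec<i32> = (1..=16).collect();

    // Prints one line per pair of rows, four lines in total.
    for (ij_row, ik_row) in row_pairs(4, &ij, &ik) {
        println!("ij {:?} <-> ik {:?}", ij_row, ik_row);
    }
}
```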
The second loop, `for (&ik,kj) in ik.iter().zip(kj.chunks_exact(block_size))`, iterates over all the elements of the current chunk of `ik` and zips each of them with a chunk of `kj`.
Example

For this loop we will only consider the first element of the iterator generated by the outermost loop.

Input:

| | chunk 1 | chunk 2 | chunk 3 | chunk 4 |
| --- | --- | --- | --- | --- |
| ij | 1 2 3 4 | 5 6 7 8 | 9 10 11 12 | 13 14 15 16 |
| ik | 1 2 3 4 | 5 6 7 8 | 9 10 11 12 | 13 14 15 16 |
| kj | 1 2 3 4 | 5 6 7 8 | 9 10 11 12 | 13 14 15 16 |

First element of the outer iterator:

| | values |
| --- | --- |
| ik | 1 2 3 4 |
| ij | 1 2 3 4 |

For this one we associate each value of ik with a chunk of kj (see below). In this case the elements of the iterator are pairs of one value (from ik) and one chunk (of kj).

| ik value | kj chunk |
| --- | --- |
| 1 | 1 2 3 4 |
| 2 | 5 6 7 8 |
| 3 | 9 10 11 12 |
| 4 | 13 14 15 16 |

For the next elements of the outer iterator, the ik values would be 5,6,7,8, then 9,10,11,12, then 13,14,15,16. The kj chunks are always the same.
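
The second loop can be sketched in isolation as well. `value_chunk_pairs` is a hypothetical helper that only collects what the loop would iterate over, for the first row of ik.

```rust
/// Pairs each value of one ik row with the corresponding row (chunk) of kj.
/// `block_size` must not be 0 (`chunks_exact` panics on 0).
fn value_chunk_pairs(block_size: usize, ik_row: &[i32], kj: &[i32]) -> Vec<(i32, Vec<i32>)> {
    ik_row
        .iter()
        .zip(kj.chunks_exact(block_size))
        .map(|(&ik, kj_row)| (ik, kj_row.to_vec()))
        .collect()
}

fn main() {
    let kj: Vec<i32> = (1..=16).collect();
    let ik_row = [1, 2, 3, 4]; // first row of ik in the example

    // Prints one line per (ik value, kj chunk) pair.
    for (ik, kj_row) in value_chunk_pairs(4, &ik_row, &kj) {
        println!("ik = {} <-> kj {:?}", ik, kj_row);
    }
}
```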
The last loop zips all the elements of `ij` and `kj` together and updates each `ij` element with the minimum of its current value and the sum of the current `ik` value and the `kj` element.
Example

For this loop we will only consider the first element of the iterator generated by the outermost loop.

Input:

| | chunk 1 | chunk 2 | chunk 3 | chunk 4 |
| --- | --- | --- | --- | --- |
| ij | 1 2 3 4 | 5 6 7 8 | 9 10 11 12 | 13 14 15 16 |
| ik | 1 2 3 4 | 5 6 7 8 | 9 10 11 12 | 13 14 15 16 |
| kj | 1 2 3 4 | 5 6 7 8 | 9 10 11 12 | 13 14 15 16 |

First element of the outer iterator:

| | values |
| --- | --- |
| ik | 1 2 3 4 |
| ij | 1 2 3 4 |

Second iterator (each ik value paired with a chunk of kj):
[(1,[1,2,3,4]),(2,[5,6,7,8]),(3,[9,10,11,12]),(4,[13,14,15,16])]

Last iterators:
For ik=1: zip(ij,kj)=[(1,1),(2,2),(3,3),(4,4)]
For ik=2: zip(ij,kj)=[(1,5),(2,6),(3,7),(4,8)]
For ik=3: zip(ij,kj)=[(1,9),(2,10),(3,11),(4,12)]
For ik=4: zip(ij,kj)=[(1,13),(2,14),(3,15),(4,16)]

So, naming each element by its value in the input arrays:
ij(1) = min(ij(1), ik(1)+kj(1), ik(2)+kj(5), ik(3)+kj(9), ik(4)+kj(13))
ij(2) = min(ij(2), ik(1)+kj(2), ik(2)+kj(6), ik(3)+kj(10), ik(4)+kj(14))
ij(3) = min(ij(3), ik(1)+kj(3), ik(2)+kj(7), ik(3)+kj(11), ik(4)+kj(15))
ij(4) = min(ij(4), ik(1)+kj(4), ik(2)+kj(8), ik(3)+kj(12), ik(4)+kj(16))
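
The update of a single ij row can be checked with the two inner loops alone. This is a sketch: `update_row` is a hypothetical helper, and the second input row filled with 100 is made up to show a case where the values actually change.

```rust
/// Applies the two inner loops to one row of ij, given the matching row of ik.
/// `block_size` must not be 0 (`chunks_exact` panics on 0).
fn update_row(block_size: usize, ij_row: &mut [i32], ik_row: &[i32], kj: &[i32]) {
    for (&ik, kj_row) in ik_row.iter().zip(kj.chunks_exact(block_size)) {
        for (ij, &kj) in ij_row.iter_mut().zip(kj_row.iter()) {
            // Keep the smaller of the current value and ik + kj.
            *ij = (*ij).min(ik + kj);
        }
    }
}

fn main() {
    let kj: Vec<i32> = (1..=16).collect();
    let ik_row = [1, 2, 3, 4];

    // With the sample data every sum is larger than the current ij value,
    // so the row is left unchanged.
    let mut ij_row = [1, 2, 3, 4];
    update_row(4, &mut ij_row, &ik_row, &kj);
    println!("{:?}", ij_row); // [1, 2, 3, 4]

    // With large starting values the sums for ik = 1 win: 1+1, 1+2, 1+3, 1+4.
    let mut large_row = [100; 4];
    update_row(4, &mut large_row, &ik_row, &kj);
    println!("{:?}", large_row); // [2, 3, 4, 5]
}
```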