What would be the difference between this trait and Iterator<Item=&T> (or the same with &[T])?
The answer is, as you point out:
Taking an Iterator is more general, but gives you, the implementor, less power. Taking a slice is more specific, but gives you more power, e.g. the ability to perform efficient random access.
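To make that tradeoff concrete for myself, here is a toy sketch of the two styles (common_prefix_len_iter and common_prefix_len_slice are just names I've made up for illustration, not part of my real code):

// Iterator version: works for chars, graphemes, lines - anything iterable -
// but only allows walking forward one item at a time.
fn common_prefix_len_iter<T, A, B>(a: A, b: B) -> usize
where
    T: PartialEq,
    A: IntoIterator<Item = T>,
    B: IntoIterator<Item = T>,
{
    a.into_iter()
        .zip(b)
        .take_while(|(x, y)| x == y)
        .count()
}

// Slice version: the caller must already have a contiguous buffer,
// but inside the function a[i] and b[i] are O(1).
fn common_prefix_len_slice<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    let mut i = 0;
    while i < a.len() && i < b.len() && a[i] == b[i] {
        i += 1;
    }
    i
}

With the slice version I can also start from an arbitrary offset, e.g. common_prefix_len_slice(&a[x..], &b[y..]), which is essentially what the while loop in the diff code further down needs to do; with the iterator version I'd have to re-walk from the start or keep extra iterator state around.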
I want to write a general function, but I also want efficient random access. So maybe I'm trying to have my cake and eat it too?
I think the exact opposite is true: both can be idiomatic; it just depends on how you want the interface to be used.
I was a little unclear... in fact I think I agree with this. Slices and iterators are both features of the language, and each is appropriate in some situations. I wanted to say that trying to use only one of them all the time, even when the other one would be a better fit, would be a bad idea.
The problem is that within the function, because of the nature of the algorithm I'm using, I want to use both slices and iterators: slices for random access, and iterators for cases like graphemes. It's good that slice::Iter::nth() is efficient, but my reason for wanting iterators is to accommodate iterators like Graphemes, which step over a variable number of bytes with each next(). I imagine that for Graphemes, nth() will almost certainly be slow, because there's no easy way to convert the instruction "jump 10 graphemes ahead" into a byte offset.
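For illustration, here is a tiny self-contained example of what I mean (it assumes the unicode-segmentation crate; the particular string is just made up):

use unicode_segmentation::UnicodeSegmentation;

fn main() {
    let s = "a\u{0301}bc"; // "a" + combining acute accent, then "b" and "c"
    let mut graphemes = s.graphemes(true);

    // Each next() consumes a variable number of bytes: the first grapheme
    // cluster is 3 bytes ("a" plus U+0301), the remaining ones are 1 byte each.
    assert_eq!(graphemes.next(), Some("a\u{0301}"));
    assert_eq!(graphemes.next(), Some("b"));

    // As far as I can tell, nth() here has no choice but to step through the
    // intervening clusters one at a time, unlike slice::Iter::nth(), which can
    // just advance its pointer by a constant amount.
    assert_eq!(s.graphemes(true).nth(2), Some("c"));
}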
Initially, I was trying to be concise, but I think I erred on the side of giving too little information. In the name of clarity, I've decided to risk giving too much information instead. Hopefully it won't be necessary to read through everything to understand what I'm trying to do; skim-reading it should be enough to get the gist. So, for reference, all the details are below.
I'm basically adapting/copying from C code which I found via the Wikipedia article on diff, and the algorithm in the C code is itself based on this research paper. In the paper, Figure 2 on page 6 contains the relevant pseudocode, while Figure 1 on page 3 and Figure 3 on page 7 together give visual illustrations of the strategy the code is using.
I was reluctant to post chunks of my own version of the code out of embarrassment about its possibly poor quality, but maybe it would be clearer to just paste the first half of the loop in the relevant function for context. The important thing to note is how the splice index variable x gets stored in the memoization array fv[] and then retrieved - this is the part that needs random access. The (very C-like) while loop is the only place where the contents of the two sequences a and b being compared are actually examined.
// fv / rv hold the furthest-reaching x for each forward / reverse diagonal,
// indexed by k + mid so the indices stay non-negative. (n, m, mid and delta
// are usize, odd is bool, and a and b are the sequences being compared; they
// are all set up earlier in the function, not shown here.)
let mut fv = vec![0; mid * 2 + 1];
let mut rv = vec![0; mid * 2 + 1];
fv[1 + mid] = 0;
rv[delta - 1 + mid] = n;
for d in 0..=mid {
    let mut x: usize;
    let mut y: usize;
    // The diagonal index k ranges over -d, -d + 2, ..., d and can be
    // negative, so it is iterated as isize.
    for k in (-(d as isize)..=d as isize).rev().step_by(2) {
        let km = (k + mid as isize) as usize;
        // Find the end of the furthest reaching
        // forward D-path in diagonal k.
        // Greedy algorithm in Figure 2
        x = if k == -(d as isize) || (k != d as isize && fv[km - 1] < fv[km + 1]) {
            fv[km + 1]
        } else {
            fv[km - 1] + 1
        };
        y = (x as isize - k) as usize;
        // Start point of middle snake
        let startx = x;
        let starty = y;
        // Follow the diagonal while the two sequences match.
        while x < n && y < m && a[x] == b[y] {
            x += 1;
            y += 1;
        }
        fv[km] = x; // end Figure 2
        if odd && k >= delta as isize - (d as isize - 1) && k <= delta as isize + (d as isize - 1) {
            // Check for overlap with the reverse path
            if x >= rv[km] {
                return MiddleSnake {
                    x: startx,
                    y: starty,
                    u: x,
                    v: y,
                    d: 2 * d - 1,
                };
            }
        // for loop continues...
I hope this makes things clearer. At the moment it looks like what I'm trying to do may not be possible, or at least not advisable. Is there any way to have random access but also account for complications like variable-length graphemes?
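One workaround that occurs to me is to collect the graphemes into a Vec up front, so that the algorithm itself only ever sees slices, at the cost of an extra allocation per input - something like the sketch below (again assuming the unicode-segmentation crate; collect_graphemes is just an illustrative name). But maybe there is a more idiomatic pattern?

use unicode_segmentation::UnicodeSegmentation;

// Illustrative only: materialize the grapheme clusters once, so that the
// slice-based algorithm can index them in O(1).
fn collect_graphemes(s: &str) -> Vec<&str> {
    s.graphemes(true).collect()
}

fn main() {
    let a = collect_graphemes("naïve");
    let b = collect_graphemes("naive");
    // A middle-snake function taking &[&str] (or a generic &[T]) could now
    // index a[x] and b[y] directly.
    assert_eq!(a[2], "ï");
    assert_eq!(b[2], "i");
}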