Force enumerate() to u64

Hi,

I've got a simple for statement:

for (i, n) in my_iterator().enumerate() {
    ...
}

i is usize; I tried enumerate::<u64>(), but it is not so simple. I did some research, found only some ad hoc solutions, none of them relevant to the enumerate() case. Is it really so difficult?

enumerate gives usize as an index because it's the pointer size* of your machine. If your program is for 64-bit machines, you can just do:

let i = i as u64;

at the start of your loop; on those machines usize and u64 have the same width.

* reality is different

Sure. But I have in mind the case of a 32-bit architecture. Suppose my (lazy) iterator has a real chance of breaking the 65535 limit? If this happens, a type cast is worse than useless.

There is no way to make enumerate return u64, but you can use zip:

for (n, i) in my_iterator().zip(0u64..) {
    // zip yields (item, counter): n is the item, i is the u64 index
    // ...
}

I guess this might be one of the "ad hoc solutions" you mentioned, but it is the best way to do it (afaik).
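
For anyone who wants to try it, here is a minimal self-contained sketch of the zip approach. The helper names `enumerate_u64` and `indexed_letters` are made up for this example and are not part of the standard library. Note that `a.zip(b)` yields `(a_item, b_item)`, so the counter lands on whichever side you put the range.

```rust
/// Pairs every item with a `u64` index, independent of the target's `usize` width.
/// `enumerate_u64` is a hypothetical helper name, not a std function.
fn enumerate_u64<I: Iterator>(iter: I) -> impl Iterator<Item = (u64, I::Item)> {
    // `0u64..` is an open-ended range; `zip` stops as soon as `iter` is exhausted.
    (0u64..).zip(iter)
}

/// Example: collect the indexed characters of a short string.
fn indexed_letters() -> Vec<(u64, char)> {
    enumerate_u64("abc".chars()).collect()
}
```

Putting the range first keeps the `(index, item)` order that enumerate() uses, so an existing loop body does not need to change.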

Can it safely clear the 65535 limit on a 32-bit architecture?

In that case why not use the usize as is? (btw 65535 is 2^16 - 1, not 2^32 - 1)

Yes

65535 is the 16-bit limit. The 32-bit limit is 4,294,967,295, but a u64 counter clears both.

Oh, my bad...

It's fine, no worries :)

Yes, but the question is still interesting.

The answer is in the documentation for enumerate: https://doc.rust-lang.org/stable/std/iter/trait.Iterator.html#method.enumerate

enumerate() keeps its count as a usize. If you want to count by a different sized integer, the zip function provides similar functionality.

If you have a 32-bit architecture, then usize is 32 bits (except for weird archs), and so casting usize to u64 should never impose any loss of information.

If the number of elements in your iterator exceeds usize::MAX, then enumerate() will misbehave anyway (per its documentation, it either produces the wrong result or panics), and so you've got a much worse problem to deal with than a useless cast.
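
To make the widening concrete, here is a small sketch that assumes a target where usize is at most 64 bits wide (which covers the usual 32-bit and 64-bit targets). The helper name `index_to_u64` is made up for illustration.

```rust
use std::convert::TryFrom;

/// Widens a `usize` index to `u64`.
/// Lossless wherever `usize` is at most 64 bits wide, which is assumed here.
fn index_to_u64(i: usize) -> u64 {
    // `u64` has no `From<usize>` impl, so `try_from` is used to make the
    // width assumption explicit instead of silently truncating.
    u64::try_from(i).expect("usize is wider than 64 bits on this target")
}
```

A plain `i as u64` gives the same value under the same assumption; the checked form just fails loudly if the assumption is ever wrong.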

Iterators don't always consume much space, so running a counter past the u32 range is not necessarily a problem, or even unusual, on a 32-bit system.

Consider:

for (timepoint, reading) in std::iter::repeat_with(read_sensor).enumerate() {
    println!("At timepoint {}: got reading {}", timepoint, reading);
}
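
A runnable variant of that loop with a u64 counter might look like the sketch below. `read_sensor` here is only a stand-in returning a fixed value (the real function is not shown in this thread), and `log_readings` is a name invented for the example.

```rust
/// Stand-in for the `read_sensor` function; it just returns a fixed value.
fn read_sensor() -> f64 {
    21.5
}

/// Formats the first `count` readings, numbering them with a `u64` counter
/// so the timepoint cannot overflow on a 32-bit target.
fn log_readings(count: usize) -> Vec<String> {
    (0u64..)
        .zip(std::iter::repeat_with(read_sensor))
        .take(count)
        .map(|(timepoint, reading)| {
            format!("At timepoint {}: got reading {}", timepoint, reading)
        })
        .collect()
}
```

The `take(count)` is only there so the example terminates; the endless version would drop it and loop directly over the zipped iterator.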

OP: It's pretty easy to make your own enumerator that will do this:

pub struct Enumerate64<I> {
    iter: I,
    counter: u64,
}
impl<I: Iterator> Iterator for Enumerate64<I> {
    type Item = (u64, I::Item);
    fn next(&mut self) -> Option<Self::Item> {
        // Fetch the item first so the counter only advances when one is yielded
        let item = self.iter.next()?;
        let prev = self.counter;
        // Wrapping just to be absolutely sure it will never panic
        // self.counter += 1 is probably fine
        self.counter = self.counter.wrapping_add(1);
        Some((prev, item))
    }
}

fn enumerate<I>(iter: I) -> Enumerate64<I> {
    Enumerate64 {
        iter,
        counter: 0,
    }
}
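
If you would rather call this as a method, like the built-in enumerate(), one option is an extension trait. This is only a sketch: the trait name `Enumerate64Ext`, the method name `enumerate64`, and the `indexed_words` example are all invented here, and the struct is repeated so the block compiles on its own.

```rust
/// Same idea as the `Enumerate64` adapter above, repeated for completeness.
pub struct Enumerate64<I> {
    iter: I,
    counter: u64,
}

impl<I: Iterator> Iterator for Enumerate64<I> {
    type Item = (u64, I::Item);

    fn next(&mut self) -> Option<Self::Item> {
        // Pull the item first so the counter only advances on success.
        let item = self.iter.next()?;
        let index = self.counter;
        self.counter = self.counter.wrapping_add(1);
        Some((index, item))
    }
}

/// Hypothetical extension trait that adds `.enumerate64()` to every iterator.
pub trait Enumerate64Ext: Iterator + Sized {
    fn enumerate64(self) -> Enumerate64<Self> {
        Enumerate64 { iter: self, counter: 0 }
    }
}

// Blanket impl: every iterator gets the method once the trait is in scope.
impl<I: Iterator> Enumerate64Ext for I {}

/// Example use: index the words of a sentence with `u64` positions.
pub fn indexed_words(text: &str) -> Vec<(u64, &str)> {
    text.split_whitespace().enumerate64().collect()
}
```

With the trait in scope, `my_iterator().enumerate64()` reads just like the original loop, at the cost of a bit more boilerplate than the zip one-liner.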

It's not about space. An iterator that produces so many elements that it overflows usize is problematic whether or not it tries to allocate 4 billion elements.

I think that iterators potentially longer than 32-bit usize are quite practical. It only takes a few seconds to count that many things if you can process them quickly.
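
As a rough back-of-the-envelope check, here is the arithmetic as code. The throughput of one billion items per second used in the note below is just an assumed figure for illustration, and `seconds_to_overflow` is a made-up helper name.

```rust
/// Seconds needed to count through every value of a `bits`-wide unsigned
/// counter at `items_per_second` (an assumed, illustrative throughput).
fn seconds_to_overflow(bits: u32, items_per_second: f64) -> f64 {
    // A `bits`-wide counter has 2^bits distinct values.
    2f64.powi(bits as i32) / items_per_second
}
```

At 1e9 items per second, `seconds_to_overflow(32, 1e9)` comes out to about 4.3 seconds, while `seconds_to_overflow(64, 1e9)` is roughly 584 years, which is why a u64 counter is effectively safe from overflow.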

People above seem to have missed the solution that @Cyborus04 and @scottmcm posted, so here it is more explicitly:

for (i, n) in (0u64..).zip(my_iterator()) {
}