Is giving Iter<_,u8> to a function more efficient than &[u8]?

Hi everyone,
I'm working on a crate to extract information from MIDI files (which are binary files). I made a first version of the crate where I pass a &[u8] (representing the data in the MIDI file) to different functions, one extracting the header, another the tracks.

Every time I call a different function I shift the starting point of the slice:

use std::fs;

pub fn open(path: &str) -> std::io::Result<()> {
    let data = fs::read(path)?;

    // The header is always 14 bytes long, so the tracks start at offset 14.
    let _header = Header::extract(&data);
    let _track = Track::extract(&data[14..]);

    drop(data);

    Ok(())
}

I was wondering whether passing an Iter<'_, u8> to these functions would be more efficient than passing a &[u8], since the data is only read once.

There's no difference. A slice::Iter is morally itself just like a slice, a pointer–length pair. The fact that iterators have a different and more restricted API than slices does not matter if you don't need to use slice APIs such as indexing.

A function taking an iterator, especially an iterator of a concrete type, is not idiomatic in Rust. If a function can accept anything iterable, it should take an impl IntoIterator<Item=T> instead. That's the most convenient option from the caller's point of view.
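
To illustrate the last point, here is a minimal sketch of a function that accepts anything iterable. The name `checksum` and the choice of `Item = u8` are assumptions made for the example; they are not taken from the original crate.

```rust
// Hypothetical helper: sums the bytes of anything that can be iterated as u8.
fn checksum(bytes: impl IntoIterator<Item = u8>) -> u32 {
    bytes.into_iter().map(u32::from).sum()
}

fn main() {
    let data: Vec<u8> = vec![1, 2, 3];

    // A borrowed slice iterator works once its items are copied out...
    let from_borrow = checksum(data.iter().copied());
    // ...an array works...
    let from_array = checksum([1u8, 2, 3]);
    // ...and so does an owned Vec.
    let from_vec = checksum(data);

    assert_eq!(from_borrow, from_array);
    assert_eq!(from_array, from_vec);
}
```

The caller never has to name a concrete iterator type, which is the convenience being described.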

I think iterators that own their data can be more performant in some cases, but iterators that access data through a borrow don't have any advantage over passing &data and having the called function create the iterator when it reads.

That's not entirely accurate: a slice::Iter is represented as a pair of pointers (no guarantees, obviously), which can have an impact on performance (in the picosecond range).

99% of the time, if you're taking a slice to read, you should take a &[T]. You can always get an iterator from it later if needed. And it's much easier on the caller.
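
As a rough sketch of that advice applied to the question, a function can take the whole &[u8] and hand back sub-slices, and the caller can still build an iterator afterwards. `split_header` and `HEADER_LEN` are hypothetical names for this example; the 14-byte length comes from the original post.

```rust
// Header length in bytes, as stated in the original post.
const HEADER_LEN: usize = 14;

// Hypothetical helper: splits the input into (header bytes, remaining bytes).
// Returns None instead of panicking when the input is too short.
fn split_header(data: &[u8]) -> Option<(&[u8], &[u8])> {
    if data.len() < HEADER_LEN {
        return None;
    }
    Some(data.split_at(HEADER_LEN))
}

fn main() {
    let data = vec![0u8; 20];
    let (header, rest) = split_header(&data).expect("input too short");
    println!("header: {} bytes, rest: {} bytes", header.len(), rest.len());

    // The caller can get an iterator from the slice whenever it needs one.
    let nonzero = rest.iter().filter(|&&b| b != 0).count();
    println!("non-zero bytes after the header: {nonzero}");
}
```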


Nuance:

The reason slice::Iter<'a, T> isn't just a &'a [T] internally is for a micro-optimization in Iterator::next where x = &x[1..] needs to update two words (the pointer goes forward and the length goes down by one) whereas slice::Iter::next needs to only move one pointer (as the end pointer stays in the same place).

So if you're taking &mut slice::Iter<'_, _> and reading only small amounts from it at a time and you're doing minimal extra work and you think it might not be inlined all the way to the overall loop, then it might be worth doing instead of just a &mut &[T].

But really that's quite rare.
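
For reference, the two cursor styles being compared might look roughly like this. Both function names are made up for the example, and whether one is measurably faster depends on inlining and the surrounding code, so treat it as an illustration of the shapes rather than a benchmark.

```rust
use std::slice;

// Cursor as `&mut &[u8]`: advancing replaces the slice with its tail,
// so both the pointer and the length are updated.
fn next_from_slice(input: &mut &[u8]) -> Option<u8> {
    let (&first, rest) = input.split_first()?;
    *input = rest;
    Some(first)
}

// Cursor as `&mut slice::Iter`: advancing is a single call to `next`.
fn next_from_iter(input: &mut slice::Iter<'_, u8>) -> Option<u8> {
    input.next().copied()
}

fn main() {
    let data: [u8; 4] = [0x4D, 0x54, 0x68, 0x64];

    let mut cursor: &[u8] = &data;
    let mut iter = data.iter();

    assert_eq!(next_from_slice(&mut cursor), Some(0x4D));
    assert_eq!(next_from_iter(&mut iter), Some(0x4D));

    // Both cursors now view the same remaining bytes.
    assert_eq!(cursor, iter.as_slice());
}
```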

Thanks, that's a good explanation!

Technically, a slice is easier to optimize, since all of its data is contiguously placed in memory. The compiler can e.g. use SIMD operations, or exploit data locality in some other way. It's also a smaller structure than an iterator, and has simpler operations.

Whether all of that matters in practice, I don't know. Iter::next is a very simple function which is likely to be inlined and optimized to the same level. But it's still a few extra steps, so I guess there will be edge cases where passing Iter won't optimize as well when inlining fails for some obscure reasons.

In any case, I wouldn't use Iter directly. Either just use a slice, which is easier technically and conceptually, or accept an arbitrary impl IntoIterator when you really intend to support non-contiguous and possibly streaming data. For &[u8] specifically, I struggle to think where that complication would be worthwhile. It's easier to stream the data via some buffer, or use something like a linked list of buffers.
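
One possible way to stream through a buffer, sketched with the standard `std::io::Read` trait. `read_header` is a hypothetical name, and the fixed 14-byte header is taken from the original post; this is only one option among several.

```rust
use std::io::{self, Read};

// Hypothetical helper: reads the 14-byte header from any byte source.
// Fails with an I/O error if the source ends before 14 bytes are read.
fn read_header(mut source: impl Read) -> io::Result<[u8; 14]> {
    let mut header = [0u8; 14];
    source.read_exact(&mut header)?;
    Ok(header)
}

fn main() -> io::Result<()> {
    let bytes = vec![1u8; 32];

    // `&[u8]` implements `Read`, so in-memory data works as a source;
    // a `File` or a `BufReader` wrapped around one would work as well.
    let header = read_header(bytes.as_slice())?;
    println!("read {} header bytes", header.len());
    Ok(())
}
```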

This is about slice::Iter<'a, T> vs &'a [T]. It's not about impl Iterator<Item = &'a T>.
