What's the type of an iterator over a Vec?

Ref:

Usually, you don't have to write out the full type of an iterator. But here, I need to.

This won't compile:

fn fetch_next_field(pos: &mut Vec::Iter<'_, u8>) -> Option<String> 

which yields the error message

error[E0107]: missing generics for struct `Vec`
help: add missing generic argument
|
| fn fetch_next_field(pos: &mut Vec<T>::Iter<'_, u8>) -> Option<String>{
|                                  +++

Which seems reasonable enough. But if I put u8 in for T, the error message changes to

  |
2 | fn fetch_next_field(pos: &mut Vec<u8>::Iter<'_, u8>) -> Option<String>{
  |                               ^^^^^^^^^^^^^^^^^^^^^
  |
help: if there were a trait named `Example` with associated type `Iter` implemented for `Vec<u8>`, you could use the fully-qualified path
  |
2 - fn fetch_next_field(pos: &mut Vec<u8>::Iter<'_, u8>) -> Option<String>{
2 + fn fetch_next_field(pos: &mut <Vec<u8> as Example>::Iter<'_, u8>) -> Option<String>{
  |

as if Vec doesn't implement Iter.

What is the correct type for that iterator?

Vec::iter returns std::slice::Iter:

use std::slice::Iter;

///  Decoder for string serialization format similar to that used with CGI/FCGI.
fn fetch_next_field(pos: &mut Iter<'_, u8>) -> Option<String>{
    if let Some(cnt) = pos.next() {
        println!("cnt = {}", cnt);
        let b: Vec<u8> = pos.take(*cnt as usize).copied().collect(); // next cnt bytes or bust
        Some(String::from_utf8_lossy(&b).to_string())
    } else {
        return None
    }
}

/// Test program
fn main() {
    //  Two fields, with a length byte
    let bytes: [u8;7] = [3,101,102,103, 2, 201, 202];
    let v = bytes.to_vec();
    let mut iv = v.iter();
    //  Fetch all the fields
    while let Some(s) = fetch_next_field(&mut iv) {
        println!("Field: {}", s);
    }
}

Playground.

I'd say making the function generic would be a more commonly applied solution though:

///  Decoder for string serialization format similar to that used with CGI/FCGI.
fn fetch_next_field<'a>(mut pos: impl Iterator<Item=&'a u8>) -> Option<String>{
    if let Some(cnt) = pos.next() {
        println!("cnt = {}", cnt);
        let b: Vec<u8> = pos.take(*cnt as usize).copied().collect(); // next cnt bytes or bust
        Some(String::from_utf8_lossy(&b).to_string())
    } else {
        return None
    }
}

/// Test program
fn main() {
    //  Two fields, with a length byte
    let bytes: [u8;7] = [3,101,102,103, 2, 201, 202];
    let v = bytes.to_vec();
    let mut iv = v.iter();
    //  Fetch all the fields
    while let Some(s) = fetch_next_field(&mut iv) {
        println!("Field: {}", s);
    }
}

Playground.

Types in Rust cannot (currently) have inherent associated types, only via a trait impl. And there’s no trait that has an Iter associated type (having an iter method is just a convention). For consuming iterators there’s Vec<_>::IntoIter via IntoIterator. For non-consuming ones I guess there’s <&mut Vec<_>>::IntoIter (which is what Vec::iter returns) but it’s not exactly obvious what it does.

Correction - there's <&mut Vec<_> as IntoIterator>::IntoIter, which is returned by Vec::iter_mut, and there's <&Vec<_> as IntoIterator>::IntoIter, which is returned by Vec::iter.
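To make the correction concrete, here is a small sketch showing that the trait-qualified path and std::slice::Iter name the same type for a Vec<u8>. The function names sum_concrete and sum_via_trait are made up for illustration only.

```rust
use std::slice::Iter;

/// Sums the bytes of a slice iterator, naming the iterator by its concrete type.
fn sum_concrete(pos: &mut Iter<'_, u8>) -> u32 {
    pos.map(|&b| u32::from(b)).sum()
}

/// Same parameter type, spelled through the `IntoIterator` impl for `&Vec<u8>`.
fn sum_via_trait<'a>(pos: &mut <&'a Vec<u8> as IntoIterator>::IntoIter) -> u32 {
    pos.map(|&b| u32::from(b)).sum()
}

fn main() {
    let v: Vec<u8> = vec![1, 2, 3];
    let mut a = v.iter();
    let mut b = v.iter();
    // Both calls accept the value returned by `Vec::iter`.
    assert_eq!(sum_concrete(&mut a), 6);
    assert_eq!(sum_via_trait(&mut b), 6);
}
```

In practice the std::slice::Iter spelling is shorter and probably what most readers expect to see.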

Indeed. Not sure where I got that mut from.

Thanks.

Just read that "take", despite its name, does not advance the original iterator. It creates a new iterator. Is there something else that does the equivalent of N ".next()" calls?
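For what it's worth, take consumes whatever iterator it is called on, but a mutable reference to an iterator is itself an iterator. So calling take through by_ref() (or on a &mut) does advance the original. A minimal sketch, where read_n is a hypothetical helper name:

```rust
/// Reads up to `n` bytes from `pos`, leaving `pos` positioned just after them.
fn read_n<'a>(pos: &mut impl Iterator<Item = &'a u8>, n: usize) -> Vec<u8> {
    // `by_ref()` borrows the iterator, so `take` consumes the borrow, not the iterator.
    pos.by_ref().take(n).copied().collect()
}

fn main() {
    let data = [3u8, 101, 102, 103, 2, 104, 105];
    let mut it = data.iter();
    let len = *it.next().unwrap() as usize;
    let field = read_n(&mut it, len);
    assert_eq!(field, vec![101, 102, 103]);
    // The original iterator has moved past the field.
    assert_eq!(it.next(), Some(&2));
}
```

Note that this returns fewer than n bytes if the input runs out, so a caller that needs exactly n should check the length of the result.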

I'm writing this stuff the long way. This is legit but seems ugly.

/// Fetch one encoded value.
/// 0..127 is one byte.
/// If the first byte is larger than 127, fetch 3 more bytes and convert to a usize
fn fetch_field_length<'a>(mut pos: impl Iterator<Item=&'a u8>) -> Result<Option<usize>, Error> {
    if let Some(b0) = pos.next() {
        if *b0 > 127 {
            //  Fetch 3 more bytes
            let b1 = pos.next().ok_or_else(|| anyhow!("EOF reading multi-byte param length"))?;
            let b2 = pos.next().ok_or_else(|| anyhow!("EOF reading multi-byte param length"))?;
            let b3 = pos.next().ok_or_else(|| anyhow!("EOF reading multi-byte param length"))?;
            //  Compute length per spec
            Ok(Some(
                (((*b3 & 0x7f) as usize) << 24) + ((*b2 as usize) << 16) + ((*b1 as usize) << 8) + *b0 as usize
            ))
        } else {
            Ok(Some(*b0 as usize))
        }
    } else {
        Ok(None)    // EOF
    }
}

/// Fetch FCGI param field of requested length. Read N bytes, convert to UTF-8. Error if bad UTF-8.
fn fetch_field<'a>(cnt: usize, mut pos: impl Iterator<Item=&'a u8>) -> Result<String, Error> {
    let mut b = Vec::with_capacity(cnt);
    for _ in 0..cnt {
        let ch = pos.next().ok_or_else(|| anyhow!("EOF reading param field"))?;
        b.push(*ch);
    }
    Ok(String::from_utf8(b)?.to_string())
}

advance_by(n) is available in nightly but not yet stabilized.

advance_by is a skip, not a bulk read.

Using map_while and enumerate might work, but it's uglier than what I've got.

The use case for this is mostly for old protocols that use length/value data items. So it's kind of niche.

Iterator::next_chunk should work (unstable).

An alternative (needed if you don't know N at const time) is passing &mut it to Itertools::chunks: it.by_ref().chunks(n).into_iter().next().unwrap().collect().

Edit: in that case, use iter.take(N).collect() (.by_ref() if necessary).

For constant N without allocating on stable, use Itertools::next_array::<N>() (or next_tuple if it makes more sense).

That's .take(N).collect():

/// Fetch FCGI param field of requested length. Read N bytes, convert to UTF-8. Error if bad UTF-8.
fn fetch_field<'a>(cnt: usize, pos: impl Iterator<Item=&'a u8>) -> Result<String, anyhow::Error> {
    let b: Vec<u8> = pos.copied().take(cnt).collect();
    anyhow::ensure!(b.len() == cnt, "EOF reading param field");
    Ok(String::from_utf8(b)?)
}

Aside: the length computation in fetch_field_length appears suspicious; did you mean for b0 to be the high byte instead?

Assuming that's the case, I'd probably refactor to something like this:

use anyhow::{self, Context};

fn fetch_field_length_or_eof<'a>(pos: impl Iterator<Item=&'a u8>) -> Result<Option<usize>, anyhow::Error> {
    let mut pos = pos.peekable();
    if pos.peek().is_none() { return Ok(None); }
    Ok(Some(fetch_field_length(pos).context("EOF reading multi-byte param length")?))
}

/// Fetch one encoded value.
/// 0..127 is one byte.
/// If the first byte is larger than 127, fetch 3 more bytes and convert to a usize
fn fetch_field_length<'a>(pos: impl Iterator<Item=&'a u8>) -> Option<usize> {
    let mut pos = pos.copied();
    let b0 = pos.next()?;
    if b0 < 128 {
        Some(b0 as usize)
    } else {
        Some(u32::from_be_bytes([b0 & 0x7f, pos.next()?, pos.next()?, pos.next()?]) as usize)
    }
}
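As a worked example of the arithmetic, here is a std-only sketch of the same decoding over a plain slice. It assumes, as the refactor above does, that the first byte is the high byte; decode_len is a hypothetical name.

```rust
/// Decodes a length: one byte if below 128, otherwise a 31-bit big-endian
/// value spread over four bytes (assumption: the first byte is the high byte).
fn decode_len(bytes: &[u8]) -> Option<usize> {
    let mut pos = bytes.iter().copied();
    let b0 = pos.next()?;
    if b0 < 128 {
        Some(b0 as usize)
    } else {
        // Clear the marker bit, then assemble the remaining 31 bits.
        let word = u32::from_be_bytes([b0 & 0x7f, pos.next()?, pos.next()?, pos.next()?]);
        Some(word as usize)
    }
}

fn main() {
    // Short form: the byte is the length.
    assert_eq!(decode_len(&[5]), Some(5));
    // Long form: 0x80 00 01 02 -> 0x0000_0102 = 258.
    assert_eq!(decode_len(&[0x80, 0x00, 0x01, 0x02]), Some(258));
    // Truncated long form yields None.
    assert_eq!(decode_len(&[0x80, 0x00]), None);
}
```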

take, like its opposite skip, is an iterator adapter, so it's lazy like all the others. The terms "take" and "skip" (and their "_while" variants) are fairly standard names for these adapters.

Slice iterators in particular could plausibly have additional methods, though, that return a subslice and advance the iterator.
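Something close to that can be approximated on stable today with std::slice::Iter::as_slice. This is only a sketch; take_slice is a made-up helper, not a std method.

```rust
use std::slice::Iter;

/// Returns the next `n` bytes as a subslice and advances `pos` past them,
/// or returns `None` (leaving `pos` untouched) if fewer than `n` bytes remain.
fn take_slice<'a>(pos: &mut Iter<'a, u8>, n: usize) -> Option<&'a [u8]> {
    let rest: &'a [u8] = pos.as_slice();
    let head = rest.get(..n)?;
    // Replace the iterator with one over whatever follows the subslice.
    *pos = rest[n..].iter();
    Some(head)
}

fn main() {
    let data = [3u8, 101, 102, 103, 2, 104, 105];
    let mut it = data.iter();
    let len = *it.next().unwrap() as usize;
    assert_eq!(take_slice(&mut it, len), Some(&[101u8, 102, 103][..]));
    assert_eq!(it.next(), Some(&2));
    // Asking for more than remains fails without consuming anything.
    assert_eq!(take_slice(&mut it, 5), None);
    assert_eq!(it.as_slice(), &[104u8, 105][..]);
}
```

Because the result borrows from the underlying data, no allocation is needed until the bytes are actually converted to a String.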

For bytes I tend to find wrapping the slice or Vec in my own type and using split_at() and friends directly more ergonomic currently. Mileage may vary, I suppose?
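A rough sketch of what such a wrapper might look like for the one-byte length-prefixed format from the start of the thread. ByteCursor and its methods are hypothetical names chosen for illustration.

```rust
/// Hypothetical cursor over a byte slice; the name and methods are illustrative.
struct ByteCursor<'a> {
    rest: &'a [u8],
}

impl<'a> ByteCursor<'a> {
    fn new(bytes: &'a [u8]) -> Self {
        ByteCursor { rest: bytes }
    }

    /// Splits off the next `n` bytes, or returns `None` if too few remain.
    fn read(&mut self, n: usize) -> Option<&'a [u8]> {
        if n > self.rest.len() {
            return None;
        }
        let (head, tail) = self.rest.split_at(n);
        self.rest = tail;
        Some(head)
    }

    /// Reads a one-byte length prefix followed by that many bytes of UTF-8.
    fn read_field(&mut self) -> Option<String> {
        let len = self.read(1)?[0] as usize;
        let body = self.read(len)?;
        String::from_utf8(body.to_vec()).ok()
    }
}

fn main() {
    let bytes = [3u8, b'a', b'b', b'c', 2, b'd', b'e'];
    let mut cur = ByteCursor::new(&bytes);
    while let Some(field) = cur.read_field() {
        println!("Field: {}", field);
    }
}
```

This version collapses EOF, truncated input, and bad UTF-8 into None; real code would likely want a Result to tell them apart.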

Oops, sorry - I meant to link to the unstable next_chunk, but got interrupted by something else while I was pulling up the std docs and mixed up what I was doing when I came back.
