Which seems reasonable enough. But if I put u8 in for T, the error message changes to
  |
2 | fn fetch_next_field(pos: &mut Vec<u8>::Iter<'_, u8>) -> Option<String>{
  |                               ^^^^^^^^^^^^^^^^^^^^^
  |
help: if there were a trait named Example with associated type Iter implemented for Vec<u8>, you could use the fully-qualified path
  |
2 - fn fetch_next_field(pos: &mut Vec<u8>::Iter<'_, u8>) -> Option<String>{
2 + fn fetch_next_field(pos: &mut <Vec<u8> as Example>::Iter<'_, u8>) -> Option<String>{
  |
use std::slice::Iter;

/// Decoder for string serialization format similar to that used with CGI/FCGI.
fn fetch_next_field(pos: &mut Iter<'_, u8>) -> Option<String> {
    if let Some(cnt) = pos.next() {
        println!("cnt = {}", cnt);
        let b: Vec<u8> = pos.take(*cnt as usize).copied().collect(); // next cnt bytes or bust
        Some(String::from_utf8_lossy(&b).into_owned())
    } else {
        None
    }
}

/// Test program
fn main() {
    // Two fields, each preceded by a length byte
    let bytes: [u8; 7] = [3, 101, 102, 103, 2, 201, 202];
    let v = bytes.to_vec();
    let mut iv = v.iter();
    // Fetch all the fields
    while let Some(s) = fetch_next_field(&mut iv) {
        println!("Field: {}", s);
    }
}
I'd say making the function generic would be a more commonly applied solution though:
/// Decoder for string serialization format similar to that used with CGI/FCGI.
fn fetch_next_field<'a>(mut pos: impl Iterator<Item = &'a u8>) -> Option<String> {
    if let Some(cnt) = pos.next() {
        println!("cnt = {}", cnt);
        let b: Vec<u8> = pos.take(*cnt as usize).copied().collect(); // next cnt bytes or bust
        Some(String::from_utf8_lossy(&b).into_owned())
    } else {
        None
    }
}

/// Test program
fn main() {
    // Two fields, each preceded by a length byte
    let bytes: [u8; 7] = [3, 101, 102, 103, 2, 201, 202];
    let v = bytes.to_vec();
    let mut iv = v.iter();
    // Fetch all the fields
    while let Some(s) = fetch_next_field(&mut iv) {
        println!("Field: {}", s);
    }
}
Types in Rust cannot (currently) have inherent associated types; a type only gets associated types via a trait impl. And there’s no trait that has an Iter associated type (having an iter method is just a convention). For consuming iterators there’s Vec<_>::IntoIter via IntoIterator. For non-consuming ones I guess there’s <&mut<&Vec<_>>::IntoIter (which is what Vec::iter returns) but it’s not exactly obvious what it does.
Correction - there's <&mut Vec<_> as IntoIterator>::IntoIter, which is returned by Vec::iter_mut, and there's <&Vec<_> as IntoIterator>::IntoIter, which is returned by Vec::iter.
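To make that concrete, here is a minimal sketch (the helper name first_byte is made up for illustration) showing that <&Vec<u8> as IntoIterator>::IntoIter names the same type as std::slice::Iter<'_, u8>, so either spelling should work in a signature:

```rust
use std::slice::Iter;

// Hypothetical helper: the parameter type is spelled through the
// IntoIterator impl for &Vec<u8> instead of naming slice::Iter directly.
fn first_byte<'a>(pos: &mut <&'a Vec<u8> as IntoIterator>::IntoIter) -> Option<u8> {
    pos.next().copied()
}

fn main() {
    let v: Vec<u8> = vec![3, 101, 102, 103];
    // v.iter() returns std::slice::Iter<'_, u8>; passing it to first_byte
    // type-checks because both paths name the same type.
    let mut it: Iter<'_, u8> = v.iter();
    assert_eq!(first_byte(&mut it), Some(3));
    assert_eq!(first_byte(&mut it), Some(101));
}
```

In practice writing std::slice::Iter directly is shorter; the fully-qualified path is mostly useful in generic code.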
Just read that "take", despite its name, does not advance the original iterator. It creates a new iterator. Is there something else that does the equivalent of N ".next()" calls?
I'm writing this stuff the long way. This is legit but seems ugly.
use anyhow::{anyhow, Error};

/// Fetch one encoded value.
/// 0..127 is one byte.
/// If the first byte is larger than 127, fetch 3 more bytes and convert to a usize
fn fetch_field_length<'a>(mut pos: impl Iterator<Item = &'a u8>) -> Result<Option<usize>, Error> {
    if let Some(b0) = pos.next() {
        if *b0 > 127 {
            // Fetch 3 more bytes
            let b1 = pos.next().ok_or_else(|| anyhow!("EOF reading multi-byte param length"))?;
            let b2 = pos.next().ok_or_else(|| anyhow!("EOF reading multi-byte param length"))?;
            let b3 = pos.next().ok_or_else(|| anyhow!("EOF reading multi-byte param length"))?;
            // Compute length per spec
            Ok(Some(
                (((*b3 & 0x7f) as usize) << 24) + ((*b2 as usize) << 16) + ((*b1 as usize) << 8) + *b0 as usize
            ))
        } else {
            Ok(Some(*b0 as usize))
        }
    } else {
        Ok(None) // EOF
    }
}

/// Fetch FCGI param field of requested length. Read N bytes, convert to UTF-8. Error if bad UTF-8.
fn fetch_field<'a>(cnt: usize, mut pos: impl Iterator<Item = &'a u8>) -> Result<String, Error> {
    let mut b = Vec::with_capacity(cnt);
    for _ in 0..cnt {
        let ch = pos.next().ok_or_else(|| anyhow!("EOF reading param field"))?;
        b.push(*ch);
    }
    // from_utf8 already yields an owned String, so no extra conversion is needed
    Ok(String::from_utf8(b)?)
}
An alternative (needed if you don't know N at compile time) is passing &mut it to Itertools::chunks: it.by_ref().chunks(n).into_iter().next().unwrap().collect().
Edit: in that case, use iter.take(N).collect() (.by_ref() if necessary).
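A minimal sketch of the by_ref().take(n) approach, in case it helps. The helper name take_n and its "None if short" behaviour are my own choices for illustration, not something from the code above:

```rust
/// Hypothetical helper: pull exactly `n` bytes from the iterator,
/// or return None if it runs out early.
fn take_n<'a>(pos: &mut impl Iterator<Item = &'a u8>, n: usize) -> Option<Vec<u8>> {
    // by_ref() borrows the iterator, so take() consumes items from `pos`
    // itself rather than from a moved copy.
    let out: Vec<u8> = pos.by_ref().take(n).copied().collect();
    if out.len() == n { Some(out) } else { None }
}

fn main() {
    let bytes = [3u8, 101, 102, 103, 2, 201, 202];
    let mut it = bytes.iter();
    let n = *it.next().unwrap() as usize;
    assert_eq!(take_n(&mut it, n), Some(vec![101, 102, 103]));
    // The original iterator has advanced past the first field.
    assert_eq!(it.next(), Some(&2));
    assert_eq!(take_n(&mut it, 2), Some(vec![201, 202]));
    assert_eq!(take_n(&mut it, 1), None);
}
```

The take adapter itself is lazy, but once it is driven (here by collect) through a mutable borrow, the underlying iterator ends up advanced by however many items were pulled.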
Aside: This appears suspicious; did you mean for b0 to be the high byte instead?
Assuming that's the case, I'd probably refactor to something like this:
use anyhow::{self, Context};

fn fetch_field_length_or_eof<'a>(pos: impl Iterator<Item = &'a u8>) -> Result<Option<usize>, anyhow::Error> {
    let mut pos = pos.peekable();
    // Nothing left at all means a clean EOF, not an error
    if pos.peek().is_none() { return Ok(None); }
    Ok(Some(fetch_field_length(pos).context("EOF reading multi-byte param length")?))
}

/// Fetch one encoded value.
/// 0..127 is one byte.
/// If the first byte is larger than 127, fetch 3 more bytes and convert to a usize
fn fetch_field_length<'a>(pos: impl Iterator<Item = &'a u8>) -> Option<usize> {
    let mut pos = pos.copied();
    let b0 = pos.next()?;
    if b0 < 128 {
        Some(b0 as usize)
    } else {
        // b0 is the high byte; its top bit is only the "4-byte length" flag
        Some(u32::from_be_bytes([b0 & 0x7f, pos.next()?, pos.next()?, pos.next()?]) as usize)
    }
}
take, like its opposite skip, is an iterator adapter, so it's lazy like all the others. The terms "take" and "skip" (and their "_while" variants) are fairly standard names for these adapters.
Slice iterators in particular could plausibly have additional methods, though, that return a subslice and advance the iterator.
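Something close to that can be approximated today with slice::Iter::as_slice plus nth. This is only a sketch; take_subslice is a hypothetical helper, not a std method:

```rust
use std::slice::Iter;

/// Hypothetical helper: return the next `n` bytes as a subslice and advance
/// the iterator past them. If fewer than `n` bytes remain, return None and
/// leave the iterator untouched.
fn take_subslice<'a>(pos: &mut Iter<'a, u8>, n: usize) -> Option<&'a [u8]> {
    // as_slice() views the not-yet-consumed part of the underlying slice.
    let rest: &'a [u8] = pos.as_slice();
    let head = rest.get(..n)?;
    if n > 0 {
        // nth(n - 1) consumes n items in total.
        let _ = pos.nth(n - 1);
    }
    Some(head)
}

fn main() {
    let bytes = [3u8, 101, 102, 103, 2, 201, 202];
    let mut it = bytes.iter();
    let n = *it.next().unwrap() as usize;
    assert_eq!(take_subslice(&mut it, n), Some(&[101u8, 102, 103][..]));
    assert_eq!(it.next(), Some(&2));
    // Asking for too many bytes fails without consuming anything.
    assert_eq!(take_subslice(&mut it, 5), None);
    assert_eq!(it.as_slice(), &[201u8, 202][..]);
}
```

This avoids the intermediate Vec, since the returned subslice borrows from the original buffer.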
For bytes I tend to find wrapping the slice or Vec in my own type and using split_at() and friends directly more ergonomic currently. Mileage may vary, I suppose?
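Roughly what I mean, as a sketch. ByteCursor and its methods are names invented for this example, and the format is the simple one-byte-length-prefix case from earlier in the thread:

```rust
/// Hypothetical cursor over a byte slice; name and methods are illustrative only.
struct ByteCursor<'a> {
    rest: &'a [u8],
}

impl<'a> ByteCursor<'a> {
    fn new(data: &'a [u8]) -> Self {
        ByteCursor { rest: data }
    }

    /// Return the next `n` bytes as a subslice and advance,
    /// or None if too few bytes remain.
    fn take(&mut self, n: usize) -> Option<&'a [u8]> {
        if n > self.rest.len() {
            return None;
        }
        let (head, tail) = self.rest.split_at(n);
        self.rest = tail;
        Some(head)
    }

    /// Read a one-byte length prefix followed by that many bytes of UTF-8.
    /// Returns None on EOF, short input, or invalid UTF-8.
    fn next_field(&mut self) -> Option<String> {
        let n = self.take(1)?[0] as usize;
        let body = self.take(n)?;
        String::from_utf8(body.to_vec()).ok()
    }
}

fn main() {
    let bytes = [3u8, 101, 102, 103, 2, 104, 105];
    let mut cur = ByteCursor::new(&bytes);
    assert_eq!(cur.next_field().as_deref(), Some("efg"));
    assert_eq!(cur.next_field().as_deref(), Some("hi"));
    assert_eq!(cur.next_field(), None);
}
```

A real decoder would probably want a Result with distinct errors instead of collapsing everything into None; that is left out here to keep the sketch short.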
Oops, sorry - I meant to link to the unstable next_chunk, but got interrupted by something else while I was pulling up the std docs and mixed up what I was doing when I came back.