Best option to read from slices with EOF? Cursor is bad I guess

This is one way to read from a slice:

use std::io::Read;
use std::io::Cursor;

fn main() {
    let slice = vec![0u8; 7];
    let mut cursor = Cursor::new(slice);
    let mut nal_lenght_bytes = [0u8; 4];
    let bytes_read = cursor.read(&mut nal_lenght_bytes);
    println!("bytes_read: {:?}", bytes_read);
    let mut nalu = [0u8; 3];
    let bytes_read = cursor.read(&mut nalu);
    println!("bytes_read: {:?}", bytes_read);//reached EOF, but I don't know
}

However, nothing tells me when I have reached EOF. Of course, if I try to read again I get Ok(0), but that prevents me from writing a nice loop like this:

while let Ok(bytes_read) = cursor.read(&mut nalu) {

}

I also have to handle std::io::Error, which never occurs for slices. So I guess Cursor is not the best option for reading from slices. I could of course use read_until, but it does not fit my case: I want to read an exact number of bytes, and I don't want to read into a Vec.

What should I use?

Maybe you want chunks or chunks_exact:

fn f(slice: &[u8]) {
    let (nal_length_bytes, rest) = slice.split_at(4);
    println!("length length: {}", nal_length_bytes.len()); // Always 4
    for nalu in rest.chunks(3) {
        println!("nalu length: {}", nalu.len()); // Could be 3 or less
    }
}

fn g(slice: &[u8]) {
    let (nal_length_bytes, rest) = slice.split_at(4);
    println!("length length: {}", nal_length_bytes.len()); // Always 4
    let mut chunks = rest.chunks_exact(3);
    for nalu in &mut chunks {
        println!("nalu length: {}", nalu.len()); // Always 3
    }
    
    let nalu = chunks.remainder();
    println!("remainder length: {}", nalu.len()) // Always < 3, could be 0
}

fn main() {
    let slice = vec![0u8; 8]; // Note: increased length to 8 for illustration
    f(&slice);
    println!();
    g(&slice);
}

Playground.


The problem is that with split_at I cannot use pattern matching to check whether nal_length_bytes is Some(&[u8; 4]).

This is what I was trying to do, actually:

use std::io::{Cursor, Read};

#[derive(Debug)]
pub enum AnnexConversionError {
    NalLenghtParseError,
    NalUnitExtendError,
    IoError(std::io::Error),
}

pub fn avcc_to_annex_b_cursor(
    data: &[u8],
    nal_units: &mut Vec<u8>,
) -> Result<(), AnnexConversionError> {
    let mut data_cursor = Cursor::new(data);
    let mut nal_lenght_bytes = [0u8; 4];
    while let Ok(bytes_read) = data_cursor.read(&mut nal_lenght_bytes) {
        if bytes_read == 0 {
            break;
        }
        // The zero case was already handled by the break above
        if bytes_read != nal_lenght_bytes.len() {
            return Err(AnnexConversionError::NalLenghtParseError);
        }
        let nal_length = u32::from_be_bytes(nal_lenght_bytes) as usize;
        nal_units.push(0);
        nal_units.push(0);
        nal_units.push(1);

        if nal_length == 0 {
            return Err(AnnexConversionError::NalLenghtParseError);
        }
        let mut nal_unit = vec![0u8; nal_length];
        let bytes_read = data_cursor.read(&mut nal_unit);
        match bytes_read {
            Ok(bytes_read) => {
                nal_units.extend_from_slice(&nal_unit[0..bytes_read]);
                //TODO: this is never called so we don't ever detect EOF
                if bytes_read == 0 {
                    break;
                } else if bytes_read < nal_unit.len() {
                    return Err(AnnexConversionError::NalUnitExtendError);
                }
            }
            Err(e) => return Err(AnnexConversionError::IoError(e)),
        };
    }
    Ok(())
}

It works, but I have to rely on if bytes_read == 0 { break; }, which basically defeats the purpose of writing while let Ok(bytes_read) = data_cursor.read(&mut nal_lenght_bytes): we never reach Err, because a slice has no I/O errors. Cursor is almost what I need, but I need EOF detection.

I guess I don't understand exactly.

Based on the code, if you...

  • Are at the end of input (0 bytes left), break and return
  • Can't read a length (1 -- 3 bytes left), that's an error
  • Read a length but it has value 0, that's an error (after pushing 0 0 1)
    • (Is it supposed to be or is this another attempt at detecting a read error or something? Since your initial value is 0 and you're returning a parse error here...)
  • Otherwise, continue (after pushing 0 0 1)

And then given the parsed length, if you...

  • Can read all length bytes, push the bytes and continue
  • Can read more than 0 - but not all - length bytes, that's an error (after pushing what we did read)
    • I would expect this error in the read 0 (EOF-immediately-after-length) case too
  • Cannot read any bytes (hit EOF immediately), that's okay (???), break and return
    • This -- immediately following your TODO -- is the main logic I don't understand
    • And the comment that "this is never called" isn't true

Ignoring my lack of understanding, here's a version without Cursor. If that part that confuses me is actually supposed to be an error, you could just delete that branch (same as in your code). And if being passed a length of 0 is actually ok, you could delete that branch too.
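For readers following along, a Cursor-free loop might look roughly like the sketch below. It is not the version linked above, only a minimal illustration under stated assumptions: a length of 0 is treated as an error, a payload shorter than its declared length (including no payload at all) is treated as an error, and the start code is pushed only after the unit has been validated. The function name avcc_to_annex_b is hypothetical, and the IoError variant is dropped because nothing here can produce one.

```rust
#[derive(Debug, PartialEq)]
pub enum AnnexConversionError {
    NalLenghtParseError, // spelling kept from the thread
    NalUnitExtendError,
}

// Hypothetical Cursor-free variant: `data` is re-sliced as input is consumed.
pub fn avcc_to_annex_b(
    mut data: &[u8],
    nal_units: &mut Vec<u8>,
) -> Result<(), AnnexConversionError> {
    // An empty remainder is the clean end-of-input case.
    while !data.is_empty() {
        // 1 to 3 bytes left: not enough for a 4-byte length prefix.
        if data.len() < 4 {
            return Err(AnnexConversionError::NalLenghtParseError);
        }
        let (len_bytes, rest) = data.split_at(4);
        let nal_length =
            u32::from_be_bytes([len_bytes[0], len_bytes[1], len_bytes[2], len_bytes[3]]) as usize;
        // Assumption: a zero-length NAL unit is invalid.
        if nal_length == 0 {
            return Err(AnnexConversionError::NalLenghtParseError);
        }
        // Assumption: a truncated payload is always an error, even if 0 bytes remain.
        if rest.len() < nal_length {
            return Err(AnnexConversionError::NalUnitExtendError);
        }
        let (nal_unit, rest) = rest.split_at(nal_length);
        // Start code followed by the NAL unit payload.
        nal_units.extend_from_slice(&[0, 0, 1]);
        nal_units.extend_from_slice(nal_unit);
        data = rest;
    }
    Ok(())
}
```

Because every split_at is guarded by a length check, it cannot panic here, and "end of input" is simply the slice becoming empty at a unit boundary.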

If you really want to use pattern matching instead of just checking the length, it's possible.

        // Expecting a length here
        let (nal_length, rest) = match data {
            &[l0, l1, l2, l3, ref rest @ .. ] => {
                (u32::from_be_bytes([l0, l1, l2, l3]) as usize, rest)
            }
            _ => return Err(AnnexConversionError::NalLenghtParseError),
        };
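
The same slice pattern can be wrapped in a small helper that returns an Option, which gives the Some(...)-style matching asked for earlier. This is only a sketch; the name split_length_prefix is made up for illustration.

```rust
// Hypothetical helper: splits a 4-byte big-endian length prefix off the front.
// Returns None when fewer than 4 bytes are available.
fn split_length_prefix(data: &[u8]) -> Option<(usize, &[u8])> {
    match data {
        // Binds the first four bytes individually and the remainder as a slice.
        [l0, l1, l2, l3, rest @ ..] => {
            Some((u32::from_be_bytes([*l0, *l1, *l2, *l3]) as usize, rest))
        }
        _ => None,
    }
}
```

A caller could then write `while let Some((nal_length, rest)) = split_length_prefix(data)` and distinguish "empty" from "1 to 3 stray bytes" by checking data.is_empty() once the loop ends.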

A return of Ok(0) is EOF detection when using the Read trait with a non-zero-length buffer. (But I agree Cursor isn't a good fit here.)
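
As a small illustration of that point, the sketch below reads through a Cursor with a 4-byte buffer and stops at the first Ok(0). The function name read_sizes is hypothetical and exists only to make the behaviour observable.

```rust
use std::io::{Cursor, Read};

// Records how many bytes each read() call returned, stopping at EOF.
fn read_sizes(data: &[u8]) -> Vec<usize> {
    let mut cursor = Cursor::new(data);
    let mut buf = [0u8; 4]; // non-zero-length buffer, so Ok(0) can only mean EOF
    let mut sizes = Vec::new();
    loop {
        match cursor.read(&mut buf) {
            Ok(0) => break,         // EOF
            Ok(n) => sizes.push(n), // may be shorter than buf.len() near the end
            Err(_) => break,        // not expected for an in-memory slice
        }
    }
    sizes
}
```

For a 7-byte input this yields reads of 4 and then 3 bytes, and the third call returns Ok(0).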


I haven't looked into the rest of your actual use case in detail, so there may be better solutions for your use case, but for this exact remark I'd do:

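use std::num::NonZeroUsize;

while let Ok(Some(bytes_read)) = cursor.read(&mut nalu).map(NonZeroUsize::new) {
    // bytes_read is a NonZeroUsize here; call bytes_read.get() for the usize.
    // The loop ends on Ok(0) (EOF) or on Err.
}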
