Working with slices

Good day all. I am having some trouble extracting a slice from a byte array in a form that I can then use. Defining the slice is easy enough but I cannot find the correct types to assign it to. The function takes a byte buffer and looks for a string of dashes as a start marker for the data. It then needs to extract the bytes/chars/str(s)/strings at various offsets from the first dash. In one case I need to get an indexable type as I need to further process it byte by byte, and in the other cases a string or str is required. The code has some of my attempts at it. Any suggestions would be appreciated. Thanks

fn zebra_format(in_buf: &[u8]) -> String {
    let mut done: bool = false;
    let mut position: usize = 0;
    while done == false {
        while position < in_buf.len() - 5 {
            if in_buf[position] == 0x2D && in_buf[position + 1] == 0x2D && in_buf[position + 2] == 0x2D
               && in_buf[position + 3] == 0x2D && in_buf[position + 4] == 0x2D {
                break;
            }
            position += 1;   
        } 
        let mut date = String::new();
        date = date + in_buf[position + 49..11];
        let trans: [u8] = in_buf[position + 121..8];
        let trans: [u8] = &in_buf[position + 121..8];
        let acct: str = in_buf[position + 247..40];
        let prod: [u8; 38] = in_buf[position + 348..38];
          //some logic to determine if at end of in_data. 
        done = true;
    }
    "abc\r\n".to_string()
}

A slice type (e.g. [u8]) has no size known at compile time, so it always needs to be behind some kind of reference. You should be writing things like let trans = &in_buf[2..5]. It's the same with a str: you always write &str because it represents a pointer to some string data elsewhere (in_buf in this case).
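To make the borrowed and owned forms concrete, here is a minimal sketch. The function name views and the 4-byte width are made up for illustration; only std::str::from_utf8, to_string and copy_from_slice come from the standard library.

```rust
/// Returns four views of the first 4 bytes of `buf`.
/// Assumes `buf` holds at least 4 bytes of valid UTF-8; panics otherwise.
fn views(buf: &[u8]) -> (&[u8], &str, String, [u8; 4]) {
    // Borrowed, indexable bytes: no copy, just a pointer into `buf`.
    let bytes: &[u8] = &buf[0..4];
    // Borrowed text: also no copy, but checked to be valid UTF-8 first.
    let text: &str = std::str::from_utf8(bytes).expect("not UTF-8");
    // Owned text: copies the bytes, so it no longer depends on `buf`.
    let owned: String = text.to_string();
    // Fixed-size array: copies exactly 4 bytes out of the slice.
    let mut array = [0u8; 4];
    array.copy_from_slice(bytes);
    (bytes, text, owned, array)
}
```

The &[u8] form is the one to use when you need to walk the data byte by byte (bytes[1], bytes.iter(), and so on), and the &str or String forms are the ones to use when you need text.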

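Also, I don't think expressions like position + 49..11 do what you think they do. The .. operator binds more loosely than +, so this is the range (position + 49)..11, which starts at index position + 49 and stops just before index 11. The second number is an end index, not a length, so here the end is before the start. Such a range is empty, and using it to index a slice panics at runtime.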
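If you think in terms of "start plus length", one option is a small helper that does the conversion to start..end for you. This is only a sketch; field is a hypothetical name, and it uses the standard slice::get so that a bad range gives None rather than a panic.

```rust
/// Returns `len` bytes beginning at `start`, or `None` if that region
/// does not fit inside `buf`.
fn field(buf: &[u8], start: usize, len: usize) -> Option<&[u8]> {
    // A range is `start..end`, so the end index is start + len.
    // `checked_add` guards against overflow on absurdly large inputs.
    let end = start.checked_add(len)?;
    // `get` returns None instead of panicking when the range is invalid.
    buf.get(start..end)
}
```

With this, the date field from the original code would be something like field(in_buf, position + 49, 11).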

Your code could probably be written a lot more expressively, but without knowing what all those byte offsets mean and how they were determined I can't show you how I'd write the parser.

One way you could change your code to fix the compile errors:

fn zebra_format(data: &[u8]) -> String {    
    if let Some(payload) = find_marker(data) {
        // from_utf8 borrows the bytes as text and fails if they are not valid UTF-8.
        let date: &str = std::str::from_utf8(&payload[49..49 + 11]).expect("bad date");
        let trans: &[u8] = &payload[121..121+8];
        let acct: &[u8] = &payload[247..247+40];
        let prod: &[u8] = &payload[348..348 + 38];
        
        todo!("Put the parsed data into a struct so we can do something with it")
    }
    
    todo!("Handle the possibility that we couldn't find a marker")
}

/// Scans through `data` to find an occurrence of our marker, then return 
/// a pointer to the bytes after it.
fn find_marker(data: &[u8]) -> Option<&[u8]> {
    let marker = b"-----";

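    // `windows` yields every run of `marker.len()` consecutive bytes and
    // never reads past the end of `data`, so short inputs cannot panic.
    let start = data.windows(marker.len()).position(|window| window == marker)?;

    // Offsets into the returned slice count from the end of the marker,
    // not from its first dash.
    Some(&data[start + marker.len()..])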
}
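As a rough idea of what the first todo!() could turn into, here is a sketch of a struct that borrows its fields from the payload. Record and parse_record are hypothetical names; the offsets and lengths are copied from the original question, and treating date and acct as text is an assumption based on the types attempted there.

```rust
/// Hypothetical record type; the fields mirror the variables in the question.
#[derive(Debug, PartialEq)]
struct Record<'a> {
    date: &'a str,
    trans: &'a [u8],
    acct: &'a str,
    prod: &'a [u8],
}

/// Returns None if the payload is too short or a text field is not UTF-8.
fn parse_record(payload: &[u8]) -> Option<Record<'_>> {
    Some(Record {
        date: std::str::from_utf8(payload.get(49..49 + 11)?).ok()?,
        trans: payload.get(121..121 + 8)?,
        acct: std::str::from_utf8(payload.get(247..247 + 40)?).ok()?,
        prod: payload.get(348..348 + 38)?,
    })
}
```

Nothing is copied here: every field is a view into the original buffer, so the Record cannot outlive the data it was parsed from.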

@Michael-F-Bryan Thank you for the reply. As I read the slice documentation in the book, it said that the second value was the length, not the ending index, e.g. [start_idx..length]. I will change up the slice calls and reformat some things as you suggest and see what happens. Have a good day.

It was probably saying that a reference to a slice (i.e. &[u8]) consists of a pointer to some bytes and the length. You can think of &[T] as being roughly equivalent to this:

struct Slice<T> {
  first_element: *const T,
  length: usize,
}

Then when you index into a &[T] you pass it a range expression indicating the start and end indices you want. It'll then do the pointer arithmetic to return a new slice with an updated first_element pointer and length field (with bounds checks, of course).
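You can observe the pointer and length from safe code. The sketch below uses a made-up helper, describe, together with the standard as_ptr and len methods; the Slice struct above is a mental model, not the real definition.

```rust
/// Reports where the sub-slice `data[start..end]` begins and how long it is.
/// Panics if the range is out of bounds, like any slice indexing.
fn describe(data: &[u8], start: usize, end: usize) -> (*const u8, usize) {
    // Indexing with a range produces a new slice into the same memory.
    let sub = &data[start..end];
    // The pointer is the address of data[start]; the length is end - start.
    (sub.as_ptr(), sub.len())
}
```

For a 5-byte array, describe(&data, 1, 4) gives back the address of data[1] and a length of 3: the bytes themselves are never copied.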

The syntax a..b is shorthand for creating a std::ops::Range<usize>, which is defined as:

pub struct Range<Idx> {
    pub start: Idx,
    pub end: Idx,
}
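Because a range is an ordinary value, you can build one yourself and only later use it to index. A small sketch, where span is a hypothetical helper name:

```rust
use std::ops::Range;

/// Builds the range covering `len` items starting at `start`.
fn span(start: usize, len: usize) -> Range<usize> {
    // Exactly the same value as writing `start..start + len`.
    Range { start, end: start + len }
}
```

So span(49, 11) is the same value as 49..60, and data[span(49, 11)] selects 11 elements.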

@Michael-F-Bryan That makes sense now. I hadn't clued in to the Range piece. I have changed the calls and they are now working as expected. [(position + 49)..60]. Thanks again.


The important thing is that a slice is not a range. [T] and Range<T> are two entirely distinct types.
