Creating a slice of [u8]

I am new to Rust and new to programming, and I am writing some simple programs to get a feel for the language. I am trying to read a file, in this case an XML file, and I want to write match expressions that find the angle brackets in each line. My idea was to take a slice of the [u8] that as_bytes() returns. I'm sure there is a better way to do this, but I get a "mismatched types" error when I compare values in the match arms.


use std::fs::File;
use std::io::{BufRead, BufReader};
use std::path::Path;

fn main() {
    let file_path = Path::new(r"C:\Users\user\RustProjects\myproject\test.xml");
    let reader = get_buf_reader(file_path);
    for line in reader.lines() {
        // read lines as bytes
        let line = &line.unwrap().as_bytes();
        // get slice
        match lines[..0] {
            // mismatched type error expected [u8] found u8 
            60u8 => println!("Open angle bracket found"),
            _ => println!("No bracket found"),
        }
    }
}

fn get_buf_reader(file_path: &Path) -> BufReader<File> {
    let file = File::open(file_path);
    let file_success = match file {
        Ok(file) => file,
        Err(error) => panic!("Problem opening file: {error:?}"),
    };
    let reader = BufReader::new(file_success);
    reader
}

A slice of bytes contains zero or more bytes. But it looks like you want to match on a single byte. To do that you would index into the slice of bytes like this: lines[0].
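As a rough sketch of matching on a single byte: indexing with [0] panics on an empty line, so the version below uses first() instead, which returns an Option. The helper name starts_with_open_bracket and the sample lines are made up for illustration and are not from the original program.

```rust
/// Hypothetical helper: reports whether the first byte of a line is `<`.
fn starts_with_open_bracket(line: &str) -> bool {
    let bytes: &[u8] = line.as_bytes();
    // `bytes[0]` would panic on an empty line, so take the first byte
    // as an `Option<u8>` instead.
    match bytes.first().copied() {
        Some(60u8) => true, // 60 is the ASCII code for '<'
        _ => false,
    }
}

fn main() {
    // Sample lines, including an empty one.
    for line in ["<root>", "text", ""] {
        if starts_with_open_bracket(line) {
            println!("Open angle bracket found");
        } else {
            println!("No bracket found");
        }
    }
}
```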

But since Strings contain Unicode chars, it would be better to match on the first char, not the first byte. You can use the chars method to get an iterator over the chars in the string, and then call next to get the first char. Note that the string may be empty, in which case next returns None.

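Also, lines[..0] will create an empty slice, since the end of the range is exclusive. Ranges written with .. are half-open: they include the first index and stop just before the end index (the ..= form includes the end). To specify a slice of one byte use: lines[..1]. Keep in mind that a slice has to be matched with a slice pattern such as [60], not with a bare byte.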
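To make the range behaviour concrete, here is a small sketch. The helpers first_n and describe are hypothetical names used only for this illustration; describe shows that a slice is matched with a slice pattern rather than a bare byte value.

```rust
/// Hypothetical helper: returns the first `n` bytes of a slice.
/// Panics if `n` is larger than the slice length.
fn first_n(bytes: &[u8], n: usize) -> &[u8] {
    &bytes[..n]
}

/// Hypothetical helper: matches a slice with a slice pattern.
fn describe(slice: &[u8]) -> &'static str {
    match slice {
        [] => "empty slice",
        [60] => "open angle bracket", // exactly one byte, equal to '<'
        _ => "something else",
    }
}

fn main() {
    let bytes = "<root>".as_bytes();
    // `..0` stops before index 0, so nothing is included.
    println!("{}", describe(first_n(bytes, 0)));
    // `..1` includes index 0 only.
    println!("{}", describe(first_n(bytes, 1)));
}
```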

Here is a version with these changes that compiles:

    for line in reader.lines() {
        // ignore errors for now
        let line = line.unwrap();
        // match on first char
        match line.chars().next() {
            Some('<') => println!("Open angle bracket found"),
            _ => println!("No bracket found"),
        }
    }

So it would be better to compare the char rather than the ASCII decimal value that represents the char?

A single Unicode char can be represented by more than one byte, so the first byte of a line is not necessarily its first char. A byte is an ASCII character only when its value is below 128.

Rust strings are in UTF-8 format, where each char is represented by one to four bytes.
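A quick way to see the one-to-four-byte range is char::len_utf8. The sketch below uses a few sample chars of my own choosing, and the helper first_byte_and_char is a hypothetical name; it shows that the first byte and the first char of a string can differ.

```rust
/// Hypothetical helper: returns the first byte and the first char of a string.
fn first_byte_and_char(s: &str) -> (Option<u8>, Option<char>) {
    (s.bytes().next(), s.chars().next())
}

fn main() {
    // '<' (U+003C), 'é' (U+00E9), '€' (U+20AC), '😀' (U+1F600)
    for c in ['<', '\u{e9}', '\u{20ac}', '\u{1f600}'] {
        println!("{c:?} takes {} byte(s) in UTF-8", c.len_utf8());
    }

    // For a line that starts with a non-ASCII char, the first byte is only
    // the first part of that char's encoding.
    let (byte, ch) = first_byte_and_char("\u{e9}<");
    println!("first byte: {byte:?}, first char: {ch:?}");
}
```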

EDIT:
ASCII is a subset of Unicode. So you can compare char to ASCII values, and in fact '<' is an ASCII char. I just used the char literal syntax: '<'. I could have also used the hex syntax: '\x3c'.

My original plan was to use the decimal or even hex syntax in my match arms. So I could use \x3c, 60, or '<'?

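All of those work except for 60. A bare integer such as 60 is not a char, so it cannot be used in a match on a char (it does work when matching on a u8), and '60' is not a valid char literal. Inside a char literal, the only numeric ASCII escape is hex, as in '\x3c'.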
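For reference, here is a small sketch of how these spellings relate. The function names are hypothetical; the point is that a match on a char takes char literals, while a match on a u8 takes integer or byte literals.

```rust
/// Hypothetical helper: matches on a char, so the arms use char literals.
fn is_open_bracket_char(c: char) -> bool {
    match c {
        '\x3c' => true, // same char as '<'
        _ => false,
    }
}

/// Hypothetical helper: matches on a byte, so the arms use integer literals.
fn is_open_bracket_byte(b: u8) -> bool {
    match b {
        60 => true, // same value as 0x3c
        _ => false,
    }
}

fn main() {
    // The char literal and its hex escape are the same value.
    assert_eq!('<', '\x3c');
    // Converting the char to an integer gives its code point, 60.
    assert_eq!('<' as u32, 60);
    // Converting the byte 60 to a char gives '<'.
    assert_eq!(char::from(60u8), '<');
    println!("{} {}", is_open_bracket_char('<'), is_open_bracket_byte(0x3c));
}
```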

Thanks for the help!


You're welcome!

Another neat option, which doesn't require you to look up the ASCII number, is to use the byte literal b'<'.
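As a sketch of how byte literals could be used for the original goal of finding angle brackets in a line: the function count_brackets is a hypothetical name, not something from the thread. Scanning bytes is safe here because, in UTF-8, bytes below 128 never occur inside the encoding of a multi-byte char.

```rust
/// Hypothetical helper: counts `<` and `>` bytes in a line.
/// Returns (open_count, close_count).
fn count_brackets(line: &str) -> (usize, usize) {
    let mut open = 0;
    let mut close = 0;
    for byte in line.bytes() {
        match byte {
            b'<' => open += 1,  // byte literal, same value as 60u8
            b'>' => close += 1, // same value as 62u8
            _ => {}
        }
    }
    (open, close)
}

fn main() {
    let (open, close) = count_brackets("<a><b/></a>");
    println!("open: {open}, close: {close}");
}
```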

