Result from BufRead.read_line() vs .lines()

RustyOldMan · August 10, 2020, 11:25pm

As I am learning Rust I am converting some applications from Java. One application is to read a CSV file. I ran into an interesting scenario that I am trying to understand.

My original code using .read_line():

 let file = File::open(filename)
            .expect("FILE NOT FOUND!!");

    let mut file_buffer = BufReader::new(file);
    let mut file_header = String::new();

    file_buffer
            .read_line(&mut file_header)
            .expect("NO DATA!!");

Yields the same println!() as this code that uses .lines():

let mut file_iter = file_buffer.lines();
let data = file_iter
        .next()
        .unwrap()
        .unwrap();

Both code return a String, but I'm trying to understand the need for the .unwrap().unwrap() when iterating. I am suspecting it's because read_line() returns io::Result and lines() returns an iterator?

Also, is unwrap().unwrap() idiomatic, and if it's not, what would be proper?

steffahn · August 11, 2020, 12:13am

Yes, one of the unwrap is because of the io::Result and one because of the iterator. Going into detail, looking at lines(), it returns a custom iterator type, Lines. That one has an Iterator implementation that according to the docs has the following type for next():

fn next(&mut self) -> Option<std::io::Result<String>>

So your first unwrap turns this return type of next() into std::io::Result<String>, crashing the program if there “was no line” (next() returned None, i.e. the lines() iterator was empty), and the second unwrap turns that into String, crashing the program if there was any error while trying to read the file.

Well, kind-of unidiomatic in the sense that you might not want your program to crash that easily. Unless you’re building an application and see no way to recover anyways in which case it can be fine. Well, except you may still want to improve the error message, so those unwraps are rather uncommon in practice, I guess.

In general, the time when unwrap() is most appropriate is in situations where the unwrap actually cannot fail, in the sense that failure would indicate a bug in your program, since when hitting a bug a crash with a not-too-fancy error message is probably okay.

The most straightforward alternative to unwrap is pattern matching to handle the error case differently.

There also is the ? operator for error propagation, but that doesn’t really work too well on its own yet for handling both Option and Result in the same function. You could probably do things like e.g. ...next()?.ok()? in a function returning Option<...> or ...next().ok_or("empty file")?? in a Function returning Result<..., Box<dyn Error>>.

Also if you’re wondering about whether there are any extra differences between the read_line approach and the lines approach, you can look at the source code:

impl<B: BufRead> Iterator for Lines<B> {
    type Item = Result<String>;

    fn next(&mut self) -> Option<Result<String>> {
        let mut buf = String::new();
        match self.buf.read_line(&mut buf) {
            Ok(0) => None,
            Ok(_n) => {
                if buf.ends_with('\n') {
                    buf.pop();
                    if buf.ends_with('\r') {
                        buf.pop();
                    }
                }
                Some(Ok(buf))
            }
            Err(e) => Some(Err(e)),
        }
    }
}

What we see is: The lines iterator uses read_line. Only an empty file is considered to have “no lines” (next() returning None), also the lines iterator strips the line break, while read_line doesn’t.

RustyOldMan · August 11, 2020, 12:34am

Thanks for the reply! I println!({:?}) and started to see exactly what you described. Thanks for confirming that I'm on the right track of understanding.

I also understand your meaning the method I was using is unidiomatic. I have to admit, I've let laziness while learning, set in by not writing in a match structure for my Option and Result.

I didn't even think to look at the code source Sharing that excerpt has helped clear up a few things. Thanks again!

RustyOldMan · August 11, 2020, 1:24am

Thought I'd share my solution example for other noobs:

let file = File::open(filename)
.expect("FILE NOT FOUND!!");
let mut file_buffer = BufReader::new(file);
let mut file_iter = file_buffer.lines();
let mut file_header = String::new();

match file_iter.next(){
    Some(t) => match t {
        Ok(line) => file_header.push_str(&line),
        Err(_) => panic!("COULD NOT READ LINE"),
    }
    None => panic!("NO DATA"),
};

carey · August 11, 2020, 2:05am

Something that I initially missed with how BufRead iterators return Option<Result<String>> is how this interacts with collect(), which can collect one of those iterators into Result<Vec<String>>. This can then be followed by a single ? or .expect() to conveniently get all of the lines in a file.

steffahn · August 11, 2020, 10:05am

Well, if all you do is panic, .expect("NO DATA").expect("COULD NOT READ LINE") would be a bit cleaner.

You could also try to incorporate the actual error into your error message, like

let file_header = file_iter
    .next()
    .expect("NO DATA")
    .unwrap_or_else(|e| panic!("COULD NOT READ LINE: {}", e));

If you want to use match, which is also fine I guess as an alternative to long unwrap_or_else expressions, you don’t need to nest them.

let file_header = match file_iter.next() {
    None => panic!("NO DATA"),
    Some(Err(e)) => panic!("COULD NOT READ LINE: {}", e),
    Some(Ok(line)) => line,
};

In case you prefer assigning to a variable instead of using match like an expression (the latter being more ideomatic though), please note that there’s also the option of splitting declaration and initialization of a variable. You don’t need to use mut and a String::new() with String::push_str() to do this.

let file_header;
match file_iter.next() {
    None => panic!("NO DATA"),
    Some(Err(e)) => panic!("COULD NOT READ LINE: {}", e),
    Some(Ok(line)) => file_header = line,
};
// file_header is now initialized and usable

RustyOldMan · August 12, 2020, 6:25pm

Yes, I like the cleaner code using expect() and unwrap(). But, I was thinking about your previous post about un-idiomatic.

Yes, right now, I just panic! and do not divert a way to truly capture the errors. But, since I am coming from Java and learning, I'm trying to force my brain into utilizing "match" statements.

It's kinda funny....the cleaner code is simpler....but somewhat "lazy" for an initial learner....but it's more idiomatic. I'm a firm believer in fundamentals of learning: getting the basics down first and understanding "the why", so the abbreviated or shortcuts then makes sense why you can use and should use it. "Teach me to fish, so I can enjoy the meal you serve me.... "

system · November 10, 2020, 6:25pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
Understanding BufReader and iteration through lines help	3	1763	January 12, 2023
Confusion with "Result" type	3	1012	January 12, 2023
Is it possible to parse the file line by line without doing an allocation per line help	4	1193	March 11, 2022
Read a file line by line help	10	40763	July 3, 2022
Unpack BufReader line into variables of different data types help	5	600	April 12, 2022

Result from BufRead.read_line() vs .lines()

Related Topics