How to handle match with irrelevant Ok(_)?


#1

Hi, I have a simple program which reads a file. (It is a little bit verbose, because I'm trying to learn some things.) Anyway… I have a match where I’m interested in the Err case, but not the Ok case. What is the most idiomatic way to handle this?

use std::error::Error;
use std::fs;
use std::fs::File;
use std::io::Read;
use std::str;

fn main() {
    let mut file = match File::open("hello.txt") {
        Err(err) => panic!("Couldn't open: {}", err.description()),
        Ok(file) => file,
    };

    let metadata = match fs::metadata("hello.txt") {
        Err(err) => panic!("Couldn't get metadata: {}", err.description()),
        Ok(metadata) => metadata,
    };

    let mut buffer = vec![0; metadata.len() as usize];
    match file.read(&mut buffer) {
        Err(err) => panic!("Couldn't read: {}", err.description()),
        Ok(_) => print!(""),
    }

    let data = match str::from_utf8(&buffer) {
        Err(err) => panic!("Couldn't convert buffer to string: {}", err.description()),
        Ok(value) => value,
    };

    print!("Content is: {}", data);
}

I’m talking about this line: Ok(_) => print!(""). Writing Ok(_) => _ doesn’t work :) Can I use some built-in no-op for this?

(Side question: Why must file be mut? I don’t see why file.read mutates file. Shouldn’t it just mutate buffer?)


#2
if let Err(err) = file.read(&mut buffer) {
    panic!(...);
}

However, it’s usually a bad idea to use panic! for this kind of thing at all.
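One common alternative to panicking is to hand the error back to the caller and let it decide. Here is a minimal sketch, assuming a Rust version that has the `?` operator; `read_text` is a hypothetical helper name, not something from the original program.

```rust
use std::fs::File;
use std::io::{self, Read};

// Hypothetical helper: reads a whole file into a String and returns any
// I/O error (including invalid UTF-8) to the caller instead of panicking.
fn read_text(path: &str) -> io::Result<String> {
    let mut file = File::open(path)?;
    let mut text = String::new();
    file.read_to_string(&mut text)?;
    Ok(text)
}

fn main() {
    // The caller decides what a failure means; here it only reports it.
    match read_text("hello.txt") {
        Ok(text) => println!("Content is: {}", text),
        Err(err) => eprintln!("Couldn't read hello.txt: {}", err),
    }
}
```

Only main knows whether a missing file is fatal, so the helper stays reusable.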


#3

What would be better than panic!? (I’m still learning Rust.)


#4

You can return the unit value: Ok(_) => (). In example/test code, it is more idiomatic to use file.read(&mut buffer).expect("Couldn't read").


#5

The file handle must be mutable because its position in the file changes when reading from it.


#6

There must be something I’m missing, because you can just use .expect(), can’t you?

use std::fs;
use std::fs::File;
use std::io::Read;
use std::str;

fn main() {
    let mut file = File::open("hello.txt").expect("Couldn't open");
    let metadata = fs::metadata("hello.txt").expect("Couldn't get metadata");
    let mut buffer = vec![0; metadata.len() as usize];
    file.read(&mut buffer).expect("Couldn't read");
    let data = str::from_utf8(&buffer).expect("Couldn't convert buffer to string");
    println!("Content is: {}", data);
}

But if you really need it, there are at least two ways to get a no-op. The key idea is to use some expression with no side effects. The important thing is that the two match arms must have the same type. In this case, the panic! arm is accepted where a () is expected, so the other arm must be a () too. This is both the empty tuple type and a literal for its only value. It’s used in place of C’s void, that is, wherever there is no meaningful value. This includes functions that return nothing and simple statements.

So, to achieve the no-op you can either specify a () value directly or use an empty block {}.
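Both spellings can be sketched side by side. The two function names below are made up for illustration; each counts the errors in a slice of results and does nothing in the Ok arm.

```rust
// Hypothetical helper: the Ok arm is the unit value `()`.
fn count_errors_unit(results: &[Result<usize, String>]) -> usize {
    let mut failures = 0;
    for result in results {
        match result {
            Err(err) => {
                eprintln!("failed: {}", err);
                failures += 1;
            }
            Ok(_) => (), // explicit unit value, does nothing
        }
    }
    failures
}

// Hypothetical helper: the Ok arm is an empty block, which also has type `()`.
fn count_errors_block(results: &[Result<usize, String>]) -> usize {
    let mut failures = 0;
    for result in results {
        match result {
            Err(err) => {
                eprintln!("failed: {}", err);
                failures += 1;
            }
            Ok(_) => {} // empty block, does nothing
        }
    }
    failures
}
```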

By the way, I’m surprised that panic! isn’t marked as diverging. Or is it?

Sure, reading from a file mutates the object that represents it, in particular it changes the reading position. Imagine several threads trying to simultaneously read from a single file.


#7

[quote="donaldpipowitch, post:1, topic:6291"]
let metadata = match fs::metadata("hello.txt")
[/quote]
File has a metadata method; by calling file.metadata() on the handle you already opened, you avoid a race condition between the two separate path lookups.
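A minimal sketch of that suggestion, assuming a Rust version with the `?` operator; `read_with_handle_metadata` is a hypothetical name. The length comes from the already opened handle, so it describes the same file that is then read.

```rust
use std::fs::File;
use std::io::{self, Read};

// Hypothetical helper: sizes the buffer from the open handle's metadata
// rather than looking the path up a second time.
fn read_with_handle_metadata(path: &str) -> io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    let len = file.metadata()?.len() as usize;
    // The length is only a capacity hint; read_to_end grows the Vec if needed.
    let mut buffer = Vec::with_capacity(len);
    file.read_to_end(&mut buffer)?;
    Ok(buffer)
}
```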

There’s no inherent need for the file handle to be mutable; nothing is mutated in userspace. The error is caused by read taking &mut self, but Read is implemented for &File too.
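To illustrate the last point, here is a small sketch (the function name `read_via_shared_ref` is made up) that reads through a shared reference. Only the binding that holds the `&File` is mutable, not the File itself.

```rust
use std::fs::File;
use std::io::{self, Read};

// Hypothetical helper: reads the rest of the file through `&File`.
fn read_via_shared_ref(file: &File) -> io::Result<Vec<u8>> {
    let mut buffer = Vec::new();
    // `Read` is implemented for `&File`, so the method needs `&mut &File`:
    // a mutable binding to the reference, while the File stays immutable.
    let mut reader = file;
    reader.read_to_end(&mut buffer)?;
    Ok(buffer)
}
```

Calling it twice on the same handle returns the contents the first time and an empty Vec the second time, because both calls share one file position.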


#8

Thanks.


#9

Yeah, this seems totally strange to me. As in, you’re really mutating the file by reading it. Maybe there’s no additional metadata stored in the userspace, but you’re making a syscall, and a mutating one. It mutates the state of the file as a kernel structure that you refer to by the fd.

Having Files readable via a shared reference seems like asking for data races, so I don’t… get it. Can somebody please explain why it is implemented this way?


#10

Awesome. Thanks. Actually… I looked for that, but couldn’t find anything like it. Now I searched for it again and immediately found it: https://doc.rust-lang.org/std/fs/struct.File.html#examples-6.


#11

What do you mean by “diverging”? A specific compiler warning? I didn’t get anything like that.

Actually I haven’t seen the .expect() style before. I tried to follow http://rustbyexample.com/std_misc/file/open.html so far. Is this outdated? Thank you for your help.


#12

https://doc.rust-lang.org/book/functions.html#diverging-functions

There is this type ! (that is kind of not a real type, but there is a proposal to make it a real one) that is used to signify that this function or expression will never ever return to the environment it’s been called from. Some functions, like, presumably, panic! (oh it’s a macro, but who cares) return this type, and statements like return and break do, too – because they jump to some other piece of code, unlike most functions that return the result back to where they were called.
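A small sketch of a diverging function, with made-up names `fail` and `parse_or_die`. Declaring a return type of `!` in a function signature works on stable Rust, and a call to such a function is accepted in a match arm whatever type the other arms have.

```rust
// Hypothetical helper: it never returns, so its return type is `!`.
fn fail(msg: &str) -> ! {
    panic!("fatal: {}", msg)
}

// Hypothetical helper: the Ok arm has type i32, and the Err arm diverges,
// so the match as a whole still has type i32.
fn parse_or_die(input: &str) -> i32 {
    match input.parse::<i32>() {
        Ok(n) => n,
        Err(_) => fail("not a number"),
    }
}
```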

Now, I surely expect panic!() to have the type !, but the match statement doesn’t think so: https://is.gd/GSEkX7


#13

I don’t think it’s outdated; there is a whole section called “Error handling” before the section you’re reading, and it does tell you about .expect(). There’s also a giant essay about error handling in The Book: https://doc.rust-lang.org/book/error-handling.html


#14

[quote="bugaevc, post:9, topic:6291"]
It mutates the state of the file as a kernel structure that you refer to by the fd.

Having Files readable via a shared reference seems like asking for data races
[/quote]
The kernel must protect itself from misuse regardless, and on the application side, how would that happen (assuming this data race definition)?


#15

Sure, each exact read call is atomic or whatever, so it’s all clear as far as the kernel is concerned.

But it’s not right from the program’s perspective. Each read call mutates the file as the program sees it (namely, the current position). One thread might not expect the file [position] to change between two read() calls because of some other thread trying to read it simultaneously. It’s the exact same situation as advancing iterators via the .next() call that takes a mutable reference, ensuring it’s the only one who gets to advance it.

Now yeah, a read() skipping some data here and there because of some other thread reading the same file won’t exactly corrupt your program and lead to segfaults the way the usual memory-related data races do. Still, it may badly affect the behaviour. And it’s as hard to debug as most concurrency-related bugs are.

The Nomicon says that Rust does not prevent general race conditions, and sure that’s true, that would be an awful interface. But since files and streams are so similar to iterators, I think it would make a lot of sense to require a unique reference to read them.


#16

The problem is that main returns the result of the match expression, so the whole expression is inferred to have the type (). If you add a semicolon after the match expression, it compiles.
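I can't reproduce the linked snippet here, so this is only a guess at its shape; `check` is a made-up function. Without the trailing semicolon the match would be the function's return value and would have to be (), which the Ok arm's integer is not. With the semicolon the match is an ordinary statement and its value is discarded.

```rust
// Hypothetical reconstruction: a function returning () that ends in a match
// whose Ok arm yields an integer.
fn check(result: Result<i32, String>) {
    match result {
        Ok(_) => 5,
        // panic! diverges, so this arm is compatible with the integer arm.
        Err(err) => panic!("failed: {}", err),
    }; // removing this semicolon produces a mismatched-types error
}
```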


#17

Oh, thanks. That was embarrassing. Maybe the error message could be improved? Probably. Or not.


#18

Let me rephrase that.

Reading a file is mutating the object that represents the file stream, namely altering the position. The very data being mutated is stored in kernel space, and simultaneous mutation won’t cause the kernel any trouble because it is smart enough to synchronize accesses.
Still, two threads simultaneously reading the same file stream will break internal logic. (I guess even a single read() call on BufReader could yield a non-contiguous slice.)

Let’s take a step back and consider Rust’s shared/unique const/mutable reference rules. It seems to me that they prevent two different kinds of problems with concurrent access:

  • Accessing a data structure while someone else is mutating it can lead to its corruption, segfaults and stuff. Think iterator and reference invalidation. This is not the problem for file streams, since the kernel already uses some kind of mutex to ensure everything is smooth.
  • Accessing something while someone else is mutating it can lead to internal logic breakage, i.e. some internal invariant being broken. The memory will be fine in this case, but the program flow can go wrong.

When working with iterators, which file streams are closely related to, both of these are prevented by the Rust referencing rules. However, being able to mutate a file structure’s state via a shared reference will cause the second issue.

Most sane programs will never need to simultaneously read the same file stream as far as the logic goes. Yet providing the ability to do it via shared references could lead to many subtle bugs.

The right way to handle the multithreaded case is with the usual stuff: Mutexes and Arcs.
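A rough sketch of that approach, assuming each worker wants one fixed-size chunk; `read_chunks` and its parameters are invented for illustration. Every worker takes the lock before reading, so each chunk is contiguous even though the order in which workers get their chunks is not fixed.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: `workers` threads each read `chunk_len` bytes
// from one shared file handle.
fn read_chunks(file: File, workers: usize, chunk_len: usize) -> io::Result<Vec<Vec<u8>>> {
    let shared = Arc::new(Mutex::new(file));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || -> io::Result<Vec<u8>> {
                let mut chunk = vec![0u8; chunk_len];
                // While the lock is held, no other thread can move the position.
                let mut file = shared.lock().unwrap();
                file.read_exact(&mut chunk)?;
                Ok(chunk)
            })
        })
        .collect();

    let mut chunks = Vec::new();
    for handle in handles {
        chunks.push(handle.join().expect("worker panicked")?);
    }
    Ok(chunks)
}
```

The Mutex hands out a unique reference to the File, so the plain `Read for File` implementation is used and the borrow rules hold across threads.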