Alternative to std::fs::File

I am wondering about an alternative way of accessing files.

I am no expert on the topic, but I believe Linux (or rather POSIX?) has the primitives pread and pwrite, which take a file position, avoiding the need to seek to the correct position first (saving a kernel call). It is also, IMO, awkward to have to mutate state to read a file, and std::fs::File is not Sync when I think an alternative could be. So I was wondering about the practicality of making a struct for this, or maybe a crate already exists?

(Some 14-year-old Stack Overflow discussion here.)

Those are available on a platform-dependent extension trait.
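As a rough illustration of that extension trait, here is a minimal sketch of a positional read. The helper name `read_at_offset`, the function `demo`, and the temporary file name are made up for this example; the calls themselves are `FileExt::read_at` on Unix and `FileExt::seek_read` on Windows. One difference worth keeping in mind: `read_at` leaves the file cursor alone, while `seek_read` moves it to the end of the read.

```rust
use std::fs::{self, File};
use std::io;

// Hypothetical helper: read up to buf.len() bytes starting at `offset`,
// without a separate seek call.
#[cfg(unix)]
fn read_at_offset(file: &File, buf: &mut [u8], offset: u64) -> io::Result<usize> {
    use std::os::unix::fs::FileExt;
    file.read_at(buf, offset)
}

#[cfg(windows)]
fn read_at_offset(file: &File, buf: &mut [u8], offset: u64) -> io::Result<usize> {
    use std::os::windows::fs::FileExt;
    file.seek_read(buf, offset)
}

// Writes a small temporary file, then reads 5 bytes starting at offset 6.
fn demo() -> io::Result<Vec<u8>> {
    let path = std::env::temp_dir()
        .join(format!("positional_read_demo_{}.txt", std::process::id()));
    fs::write(&path, b"hello world")?;

    let file = File::open(&path)?; // note: no `mut` needed
    let mut buf = [0u8; 5];
    let n = read_at_offset(&file, &mut buf, 6)?;

    fs::remove_file(&path)?;
    Ok(buf[..n].to_vec())
}

fn main() -> io::Result<()> {
    let bytes = demo()?;
    println!("{}", String::from_utf8_lossy(&bytes));
    Ok(())
}
```

Like an ordinary read, a positional read may return fewer bytes than requested, so real code would loop or use `read_exact_at` on Unix.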

&File implements Read, Write, and Seek, so you don't have to. (There's still mutated state going on somewhere (in the OS) -- how else would position be tracked? -- but it basically looks like shared/interior mutability from the Rust side.)
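A small sketch of what that looks like in practice, assuming a throwaway temporary file (the function names and file name are invented for the example). The function only receives a shared `&File`, yet it can seek and read, because the `Read` and `Seek` impls used here are the ones for `&File`.

```rust
use std::fs::{self, File};
use std::io::{self, Read, Seek, SeekFrom};

// Takes only a shared reference to the file.
fn read_from_offset_6(file: &File) -> io::Result<String> {
    // The variable holding the reference is mutable; the File itself is not
    // borrowed mutably.
    let mut reader: &File = file;
    reader.seek(SeekFrom::Start(6))?;
    let mut text = String::new();
    reader.read_to_string(&mut text)?;
    Ok(text)
}

fn demo() -> io::Result<String> {
    let path = std::env::temp_dir()
        .join(format!("shared_ref_read_demo_{}.txt", std::process::id()));
    fs::write(&path, b"hello world")?;

    let file = File::open(&path)?;
    let text = read_from_offset_6(&file)?;

    fs::remove_file(&path)?;
    Ok(text)
}

fn main() -> io::Result<()> {
    println!("{}", demo()?);
    Ok(())
}
```

The seek and the read are still two separate operations on one shared cursor, so this is fine on a single thread but does not by itself make concurrent use correct.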

File is Sync.

All that being said, if there are more operations you want that are not in std yet, you could write your own extension trait. Here's how you extract the file descriptor.

Ah, thanks, I see File is Sync, but read takes &mut self; I think it is that which is awkward.
And for multi-threading, you have the problem of something happening between the seek and the read? Anyway, I have found it problematic, or at least I have needed all kinds of wrappers to make things work as I want when multiple threads are accessing a file.

I also see there is a FileExt for Windows, but with different methods:

This is a bit complicated, I will look into it more.

This is no problem as Read is implemented for &File, not just File directly. So you can have two non-mutable references to the same file and use both for reading at the same time. You only need to produce a mutable reference to the non-mutable reference when passing it to read, but that's no problem, because you can just make more non-mutable references if you need to.

I did it here for Write, but the API is the same.

Hmm, not sure I understand that yet, and I think there would be lifetime problems, but can you seek and read (or write) from two threads using that without having problems? I am VERY confused (which is normal for me!).

The lifetime problems are solved by either using Arc<File>, or using scoped threads to share &File within the scope. As to the correctness problems, that depends on what you are doing:

  • It is possible and sound to use read() on a file from multiple threads — just usually not correct, since you don’t know which thread will end up with which part of the file.

  • It is possible and sometimes correct to use write() on a file that has been opened append-only, because append mode guarantees that writes don’t overwrite each other (doc). However, it’s not guaranteed exactly how much non-interleaving you get, because operating systems make different promises here, so you need to either check your specific OS or tolerate occasional interleaving (e.g. in a diagnostic log file).

  • It is possible and normal to read() and write() the same file, each from a different thread, when the file is actually a device that offers bidirectional communication, like a TTY/PTY, Unix domain socket, or serial port.

  • You could mix read() and write_at(), or write() and read_at(), so that only one thread is using the current position and all other threads are not.
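The last point can be sketched with scoped threads, which also sidesteps the lifetime question because both threads borrow the same `&File`. This is only an illustration and it assumes Unix (it uses `std::os::unix::fs::FileExt`); the file contents and names are made up. One thread reads sequentially through the shared cursor, the other reads at a fixed offset and never touches the cursor.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::os::unix::fs::FileExt; // Unix-only: provides read_exact_at
use std::thread;

fn demo() -> io::Result<(String, String)> {
    let path = std::env::temp_dir()
        .join(format!("mixed_read_demo_{}.txt", std::process::id()));
    fs::write(&path, b"headerPAYLOAD")?;
    let file = File::open(&path)?;

    let (sequential, positional) = thread::scope(|s| {
        // Only this thread uses the current position.
        let sequential = s.spawn(|| -> io::Result<String> {
            let mut buf = [0u8; 6];
            (&file).read_exact(&mut buf)?;
            Ok(String::from_utf8_lossy(&buf).into_owned())
        });
        // This thread passes an explicit offset and ignores the cursor.
        let positional = s.spawn(|| -> io::Result<String> {
            let mut buf = [0u8; 7];
            file.read_exact_at(&mut buf, 6)?;
            Ok(String::from_utf8_lossy(&buf).into_owned())
        });
        (sequential.join().unwrap(), positional.join().unwrap())
    });

    fs::remove_file(&path)?;
    Ok((sequential?, positional?))
}

fn main() -> io::Result<()> {
    let (head, body) = demo()?;
    println!("{head} {body}");
    Ok(())
}
```

On Windows the equivalent `seek_read` does move the cursor, so this particular mix would need more care there.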

I get it ( I think ), thanks.

Meanwhile I seem to have something working on Linux using std::os::unix::fs::FileExt which does what I need. Will continue tomorrow.

Well, after many blunders, I think I have something that works:

It uses the windows or unix FileExt if available, if not it falls back to an Arc/Mutex solution.
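The poster's actual code is not shown in the thread, so the following is only a guess at the general shape of such a wrapper, not a reconstruction of it. The type name `PosFile` and its methods are hypothetical. It dispatches to the platform `FileExt` where one exists and otherwise falls back to a `Mutex` that makes seek plus read one critical section.

```rust
use std::fs::{self, File};
use std::io;

// Hypothetical wrapper: a plain File where positional I/O is available...
#[cfg(any(unix, windows))]
pub struct PosFile {
    file: File,
}

// ...and a Mutex-guarded File everywhere else.
#[cfg(not(any(unix, windows)))]
pub struct PosFile {
    file: std::sync::Mutex<File>,
}

impl PosFile {
    #[cfg(any(unix, windows))]
    pub fn new(file: File) -> Self {
        PosFile { file }
    }

    #[cfg(not(any(unix, windows)))]
    pub fn new(file: File) -> Self {
        PosFile { file: std::sync::Mutex::new(file) }
    }

    #[cfg(unix)]
    pub fn read_at(&self, buf: &mut [u8], offset: u64) -> io::Result<usize> {
        use std::os::unix::fs::FileExt;
        self.file.read_at(buf, offset)
    }

    #[cfg(windows)]
    pub fn read_at(&self, buf: &mut [u8], offset: u64) -> io::Result<usize> {
        use std::os::windows::fs::FileExt;
        self.file.seek_read(buf, offset)
    }

    // Fallback: hold the lock across seek + read so no other thread can
    // move the cursor in between.
    #[cfg(not(any(unix, windows)))]
    pub fn read_at(&self, buf: &mut [u8], offset: u64) -> io::Result<usize> {
        use std::io::{Read, Seek, SeekFrom};
        let mut guard = self.file.lock().unwrap();
        guard.seek(SeekFrom::Start(offset))?;
        guard.read(buf)
    }
}

fn demo() -> io::Result<Vec<u8>> {
    let path = std::env::temp_dir()
        .join(format!("pos_file_demo_{}.txt", std::process::id()));
    fs::write(&path, b"0123456789")?;

    let file = PosFile::new(File::open(&path)?);
    let mut buf = [0u8; 4];
    let n = file.read_at(&mut buf, 3)?;

    fs::remove_file(&path)?;
    Ok(buf[..n].to_vec())
}

fn main() -> io::Result<()> {
    println!("{}", String::from_utf8_lossy(&demo()?));
    Ok(())
}
```

Because every method takes `&self`, a value like this can be shared through `Arc` or a plain reference without any outer lock.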

It seems most platforms support windows or unix ( which I didn’t realise before ).

...and there is: File::try_clone
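One thing to be aware of with `File::try_clone`: the clone shares the same underlying handle, so reads, writes, and seeks on one instance affect the other. A small sketch of that behaviour (file name and contents invented for the example):

```rust
use std::fs::{self, File};
use std::io::{self, Read};

fn demo() -> io::Result<(String, String)> {
    let path = std::env::temp_dir()
        .join(format!("try_clone_demo_{}.txt", std::process::id()));
    fs::write(&path, b"hello world")?;

    let mut original = File::open(&path)?;
    let mut clone = original.try_clone()?;

    // Reading 5 bytes through the original advances the shared cursor...
    let mut first = [0u8; 5];
    original.read_exact(&mut first)?;

    // ...so the clone continues from offset 5 rather than from the start.
    let mut rest = String::new();
    clone.read_to_string(&mut rest)?;

    fs::remove_file(&path)?;
    Ok((String::from_utf8_lossy(&first).into_owned(), rest))
}

fn main() -> io::Result<()> {
    let (first, rest) = demo()?;
    println!("{first:?} then {rest:?}");
    Ok(())
}
```

So `try_clone` gives each thread its own `File` value to call `&mut self` methods on, but it does not give each thread an independent position.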

Yes, I ended up using that. I don’t think Arc by itself does any good, as to use read you need exclusive access to File. But the FileExt trait impls seem ideal, at least for windows and unix ( is there anything else in reality? ).

No, you don’t. You can get &File from Arc<File>, and as @increasing already mentioned, Read and Write are implemented for &File. You just have to get the &File explicitly with an &* or a coercion.

use std::fs;
use std::io::Write;
use std::sync::Arc;

fn main() {
    let write1: Arc<fs::File> = Arc::new(
        fs::OpenOptions::new()
            .create(true)
            .append(true)
            .open("test.txt")
            .unwrap(),
    );
    let write2 = write1.clone();

    // We could also use std::thread::scope here, but in that case, there
    // would be no point in using Arc, and & by itself would do.
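    // write_all keeps writing until the whole buffer is written; a bare
    // write() may write only part of it.
    let join1 = std::thread::spawn(move || {
        (&*write1).write_all(b"hello").unwrap();
    });
    let join2 = std::thread::spawn(move || {
        (&*write2).write_all(b"world").unwrap();
    });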

    join1.join().unwrap();
    join2.join().unwrap();

    // will print either "helloworld" or "worldhello", + previous contents
    println!("{}", std::fs::read_to_string("test.txt").unwrap());
}

Ok, but then you have the seek/read data race issue, so that is no use for me. I do think that is a bit of a trap, considering Rust’s “fearless concurrency” “no data races” approach. It doesn’t seem to mention it very explicitly in the documentation either, unless I am missing it.

Sure. For your original question, read_at() and write_at() are what you need, and those methods take &self so you don't even need the (&*file) trick. I was only responding to “I don’t think Arc by itself does any good”.
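For completeness, a minimal sketch of that approach, assuming Unix (`std::os::unix::fs::FileExt`; on Windows the counterparts are `seek_read` and `seek_write`). The file name and the data are made up. Each thread writes to its own offset through a shared `Arc<File>`, with no seek and no lock.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::os::unix::fs::FileExt; // Unix-only: write_all_at, read_exact_at
use std::sync::Arc;
use std::thread;

fn demo() -> io::Result<String> {
    let path = std::env::temp_dir()
        .join(format!("arc_write_at_demo_{}.txt", std::process::id()));
    let file = Arc::new(
        OpenOptions::new()
            .read(true)
            .write(true)
            .create(true)
            .truncate(true)
            .open(&path)?,
    );

    // Each chunk goes to a fixed, non-overlapping offset.
    let chunks: Vec<(u64, &'static [u8])> = vec![(0, &b"hello"[..]), (5, &b"world"[..])];
    let handles: Vec<_> = chunks
        .into_iter()
        .map(|(offset, data)| {
            let file = Arc::clone(&file);
            // write_all_at takes &self, so no (&*file) trick is needed.
            thread::spawn(move || file.write_all_at(data, offset))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap()?;
    }

    let mut buf = [0u8; 10];
    file.read_exact_at(&mut buf, 0)?;

    fs::remove_file(&path)?;
    Ok(String::from_utf8_lossy(&buf).into_owned())
}

fn main() -> io::Result<()> {
    println!("{}", demo()?);
    Ok(())
}
```

Unlike the append-mode example earlier in the thread, the result here does not depend on which thread runs first, because each write names its own offset.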
