What is the best way in Rust to append/send records to a single file from multiple processes and threads?

I am reading conflicting advice on this, both about what Linux guarantees and about what is right in Rust, and would appreciate some help.

I will be intercepting file-system libc calls such as open with an LD_PRELOAD library, and it needs to eventually write tracking information (which files are read, written, symlinked, etc.) to a wisktrak.file.

In a very large -jN build, multiple processes can be active in parallel and writing to this file. Some of these build tools may themselves be multi-threaded and make file open calls in parallel, so they may also want to write file tracking data in parallel within a single process.

Options I am considering:
1. Create a FIFO (named pipe) and have every spawned process write to it. Keep each write under 4096 bytes, which I read is the atomic write guarantee for pipes on Linux. I implemented this in the C version of my program and it has worked fairly well over a fair bit of testing, with no records mangled together.
2. I recently started rewriting in Rust and switched to writing directly to a single file, again within the 4096-byte limit. It seems to be working and I haven't seen any mangling. But it hasn't been long, and it only takes one mangled record to kill this or the pipe idea.
3. Each process writes to its own unique file. This can generate a lot of files, and we still have the problem of multiple threads and whether they are safe from mangling when writing to the same file/fd.
4. The last option, preferably avoided, is to start a daemon/listener process that writes the records, with each process or thread talking to it over IPC/sockets.

Lastly, are there any Rust packages that people are familiar with that would simplify this process?

You could spawn a thread and then use crossbeam-channel to send the data to it. (Preferably avoid std::sync::mpsc; it has several flaws.)
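A minimal sketch of that writer-thread idea follows. It uses std::sync::mpsc only so the example has no external dependencies; crossbeam-channel exposes a similar sender/receiver pair and could be swapped in. The function name `spawn_writer` and the one-record-per-line format are assumptions made for illustration, not part of the original program. Note that this only serializes writes within one process; separate processes still rely on append-mode behavior of the shared file.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::PathBuf;
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Spawn a thread that owns the tracking file and writes every record
/// it receives. Returns the sending half plus the thread's join handle.
/// (`spawn_writer` is a hypothetical helper name.)
fn spawn_writer(path: PathBuf) -> io::Result<(Sender<String>, JoinHandle<io::Result<()>>)> {
    // Append mode, so other processes writing to the same file add to the end.
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    let (tx, rx) = mpsc::channel::<String>();

    let handle = thread::spawn(move || -> io::Result<()> {
        // The loop ends once every Sender clone has been dropped.
        for mut record in rx {
            // Terminate the record first so it goes out as one buffer.
            record.push('\n');
            file.write_all(record.as_bytes())?;
        }
        Ok(())
    });

    Ok((tx, handle))
}
```

Each worker thread would hold a clone of the returned sender and call `send` with a finished record. Dropping all senders and joining the handle flushes the remaining records before the process exits.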

According to its manual page, the write syscall will prevent interleaving if all the processes' file descriptors are copies of each other (i.e. the file was opened once in the parent process and then everything inherited it through fork or similar). So the only thing you should need to do is verify that the higher-level libraries you're using issue each record as a single write call.
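As a rough illustration of "one record, one write call" in Rust: `std::fs::File` is unbuffered, so a single `Write::write` call on it corresponds to a single write syscall, whereas `write_all` may loop if a write comes back short. The sketch below opens the file in append mode (O_APPEND on Unix) rather than inheriting a descriptor, which is an assumption that differs from the inherited-descriptor case described above. The names `open_tracker` and `append_record` are hypothetical, and behavior on network filesystems such as NFS may differ.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Open the shared tracking file in append mode (O_APPEND on Unix).
/// (`open_tracker` is a hypothetical helper name.)
fn open_tracker(path: &Path) -> io::Result<File> {
    OpenOptions::new().create(true).append(true).open(path)
}

/// Emit one newline-terminated record with exactly one `write` call.
/// `Write` is implemented for `&File`, so threads can share the handle.
fn append_record(mut file: &File, record: &str) -> io::Result<()> {
    // Build the complete line up front so nothing is written piecemeal.
    let mut line = String::with_capacity(record.len() + 1);
    line.push_str(record);
    line.push('\n');

    let written = file.write(line.as_bytes())?;
    if written != line.len() {
        // A short write means the record may have been split; report it
        // instead of silently issuing a second write.
        return Err(io::Error::new(io::ErrorKind::WriteZero, "short write"));
    }
    Ok(())
}
```

Avoid wrapping the file in a `BufWriter` or formatting directly into it with `write!`, since either can split or merge records across syscalls.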

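4096 bytes is PIPE_BUF on Linux, the largest write to a pipe that is guaranteed to be atomic (the pipe's default capacity is larger, 65,536 bytes). Writes to regular files are allowed to be arbitrarily sized.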
