Atomic file / write-ahead logging, crate?

Is there a Rust crate that provides atomic operations for large files? That is, you read/write the file as normal, but the writes are saved up (not written to permanent storage) until a commit/flush operation, at which point either they all happen or none happen ( hence "atomic" ). This implies the writes have to first be saved in a temporary file, before they are applied to the main file.
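To make the idea concrete, something like this hypothetical interface is what I have in mind ( the names are made up, not from any existing crate ):

```rust
use std::io;

/// Hypothetical interface, just to illustrate what I'm after.
/// Writes are buffered until `commit`, which applies all of them
/// to the underlying file, or none of them.
trait AtomicFile {
    /// Read, seeing any pending (uncommitted) writes.
    fn read(&self, offset: u64, buf: &mut [u8]) -> io::Result<usize>;
    /// Buffer a write; nothing reaches permanent storage yet.
    fn write(&mut self, offset: u64, data: &[u8]);
    /// Apply all buffered writes atomically (all or nothing).
    fn commit(&mut self) -> io::Result<()>;
    /// Discard all buffered writes.
    fn rollback(&mut self);
}
```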

I started writing my own implementation today, but if there is an existing implementation it could save me some work, and maybe someone knows how to do it better than I do. I think it's also known as "write-ahead logging".

My implementation is now working ( and I even wrote a unit test...! ). It's not entirely clear to me what guarantees std::fs::File::sync_all gives.

For example, if I have a file of size zero, I perform a series of writes to the file, and then call sync_all.
After the call completes, I know the state of the file. But what about before? Is the file size set last?
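For concreteness, this is the scenario I mean ( a minimal sketch; the file name is arbitrary ):

```rust
use std::fs::OpenOptions;
use std::io::Write;

fn main() -> std::io::Result<()> {
    // Start with an empty file...
    let mut f = OpenOptions::new()
        .create(true)
        .truncate(true)
        .write(true)
        .open("data.bin")?;

    // ...perform a series of writes...
    f.write_all(&[1u8; 4096])?;
    f.write_all(&[2u8; 4096])?;

    // ...then sync. After this returns, the file state is known.
    // The question is what the on-disk state can look like between
    // these writes and the completion of sync_all, e.g. after a crash.
    f.sync_all()?;
    Ok(())
}
```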

I have tried to take a precautionary approach: write a zero at the start of the file ( meaning there is nothing valid in the file yet ), write the other data to the file, call sync_all, then write a value at the start of the file which indicates it is now valid ( the file size ), then call sync_all again.
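In code, the scheme I'm describing looks roughly like this ( a sketch rather than my actual implementation; the 8-byte little-endian size header is just an illustrative choice ):

```rust
use std::fs::File;
use std::io::{Seek, SeekFrom, Write};

/// Sketch of the scheme described above: the first 8 bytes of the
/// log file hold a "valid" marker (here, the payload size).
/// Zero means "nothing valid here yet".
fn write_log(file: &mut File, payload: &[u8]) -> std::io::Result<()> {
    // Step 1: mark the file invalid before touching anything else.
    file.seek(SeekFrom::Start(0))?;
    file.write_all(&0u64.to_le_bytes())?;

    // Step 2: write the actual data after the header.
    file.write_all(payload)?;

    // Step 3: make sure the data (and the zero marker) are durable.
    file.sync_all()?;

    // Step 4: now write the real size into the header; a non-zero
    // value means the log contents are complete and usable.
    file.seek(SeekFrom::Start(0))?;
    file.write_all(&(payload.len() as u64).to_le_bytes())?;

    // Step 5: flush the header itself.
    file.sync_all()?;
    Ok(())
}
```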

It's hard to imagine this going wrong, but it's not clear to me whether it is optimal, or even entirely correct. But I don't see what else can be done, although I have seen suggestions of using rename ( but that seems like a nasty kludge to me ).
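For reference, the rename suggestion usually amounts to something like this sketch; it only really works if you can afford to rewrite the whole file, which is part of why it seems wrong for large files:

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

/// The rename-based alternative, for comparison: write a complete
/// replacement file, flush it, then rename it over the original.
/// On POSIX filesystems the rename replaces the target atomically.
fn replace_file(path: &Path, contents: &[u8]) -> std::io::Result<()> {
    let tmp = path.with_extension("tmp");
    let mut f = File::create(&tmp)?;
    f.write_all(contents)?;
    f.sync_all()?;
    fs::rename(&tmp, path)?;
    // Strictly, the parent directory should also be synced so the
    // rename itself survives a crash; std has no dedicated API for
    // that, though on Unix you can open the directory with File::open
    // and call sync_all on it.
    Ok(())
}
```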

File sync is really tricky.

I have been trying to figure out what I am expecting from the filesystem. One way of looking at it is that I am assuming the filesystem will not produce "undefined values from nowhere". So if I start from an empty file, write data to it ( sequentially if it matters, zeroes if it matters ), and later read the values back, either it reports some kind of failure ( that's ok ), or it reports nothing was read ( that's also ok ), or it gives me the zeroes I wrote ( that's also ok ). But not some random non-zero value I never wrote at all! This seems reasonable...

Edit:

Have been reading here. Quote:

"SQLite assumes that when a file grows in length that the new file space originally contains garbage and then later is filled in with the data actually written. In other words, SQLite assumes that the file size is updated before the file content."

Which seems a bit back-to-front to me....

It also says:

"The first flush writes out the base rollback journal content. Then the header of the rollback journal is modified to show the number of pages in the rollback journal. Then the header is flushed to disk."

which is exactly what I am doing, so that's reassuring. Well, except I am rolling forward, not back, but that's not pertinent.
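To spell out the roll-forward side of that, recovery just inspects the header written earlier ( same hypothetical layout as the sketch above ):

```rust
use std::fs::File;
use std::io::{Read, Seek, SeekFrom};

/// Sketch of the recovery side: on startup, check the log header.
/// A zero header means the log was never completed, so it is ignored;
/// a non-zero header means the logged writes should be rolled forward
/// into the main file.
fn read_log(file: &mut File) -> std::io::Result<Option<Vec<u8>>> {
    let mut header = [0u8; 8];
    file.seek(SeekFrom::Start(0))?;
    if file.read_exact(&mut header).is_err() {
        return Ok(None); // too short to even hold a header
    }
    let len = u64::from_le_bytes(header);
    if len == 0 {
        return Ok(None); // incomplete log: discard it
    }
    let mut payload = vec![0u8; len as usize];
    file.read_exact(&mut payload)?;
    Ok(Some(payload))
}
```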

"Flush to disk" is the operative phrase. The author of the paper I linked found that the disk often returned before the data was actually on the disk even after turning off all optimizations. The result was lost data if the disk failed at an inopportune instant. I don't recall that he found any problems in the absence of disk failure, e.g., lost or out of order writes.

Yes, that's not really a problem.** What would be a problem is re-ordering of writes across a sync_all boundary. At some point, if the OS is broken, there is nothing that can be done ( except recover from other logs, etc. ).

** Well, it might be a problem in some applications, but it doesn't lead to data corruption.
