Correct way to save a file atomically without interfering with performance

Up until now I've been using fs::write(TMP, data); fs::rename(TMP, DEST); to save data for my app, but I recently learned that the rename is allowed to reach disk before the data is fully written, which after a crash could leave the resulting file containing only part of the data.

I understand the recommendation is to call File::sync_all before closing the temporary file, but it's also my understanding that File::sync_all asks the kernel to write the data as soon as possible. So I wonder: is there a middle ground where I can guarantee atomicity (the file is in either the old or the new state) but let the kernel decide how long it takes? For example, it would be fine if it took a few minutes to commit to disk.

For my purposes this runs on a Linux server, although if there's a cross-platform solution that would be preferred.

What is it that you're trying to avoid? The data is going to have to get written to disk eventually, so you're not typically losing much by asking for it to happen sooner rather than later.

Please provide a reference.

Are you concerned about power failure?

It's commonly accepted that the high-performance answer here is write-ahead logs plus checkpointing. Additionally, doing this correctly is highly OS-dependent.

  1. The WAL can take a single-file disk sync before you report success to the client;
  2. Production of a new full database checkpoint can be offloaded to a background/latency-tolerant process;
  3. During recovery, you can be assured that any corrupted WAL segments did NOT have a success code reported to the client and so they can be ignored.
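The three points above can be sketched as a tiny append-only log. This is only an illustration under stated assumptions, not how Postgres, etcd, or sqlite actually lay out their logs: the record format (length, checksum, payload), the function names wal_open, wal_append, and wal_recover, and the use of FNV-1a as a stand-in checksum are all hypothetical choices made for the example.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Write};
use std::path::Path;

/// FNV-1a, a simple non-cryptographic hash used here as a stand-in
/// for a real checksum such as CRC32.
fn checksum(bytes: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5;
    for &b in bytes {
        hash ^= u32::from(b);
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

/// Open (or create) the log file in append mode.
fn wal_open(path: &Path) -> io::Result<File> {
    OpenOptions::new().create(true).append(true).open(path)
}

/// Append one record: [len: u32 LE][checksum: u32 LE][payload].
/// Returns Ok only after sync_data succeeds, so the caller can wait
/// for this before reporting success to a client.
fn wal_append(log: &mut File, payload: &[u8]) -> io::Result<()> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "record too large"));
    }
    let mut record = Vec::with_capacity(8 + payload.len());
    record.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    record.extend_from_slice(&checksum(payload).to_le_bytes());
    record.extend_from_slice(payload);
    log.write_all(&record)?;
    log.sync_data()
}

/// Read a little-endian u32 at `pos`; the caller guarantees 4 bytes exist.
fn read_u32(bytes: &[u8], pos: usize) -> u32 {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(&bytes[pos..pos + 4]);
    u32::from_le_bytes(buf)
}

/// Recovery: return every intact record and stop at the first truncated
/// or corrupted one, which by construction was never acknowledged.
fn wal_recover(path: &Path) -> io::Result<Vec<Vec<u8>>> {
    let mut bytes = Vec::new();
    File::open(path)?.read_to_end(&mut bytes)?;
    let mut records = Vec::new();
    let mut pos = 0;
    while bytes.len() - pos >= 8 {
        let len = read_u32(&bytes, pos) as usize;
        let sum = read_u32(&bytes, pos + 4);
        let start = pos + 8;
        let end = match start.checked_add(len) {
            Some(end) if end <= bytes.len() => end,
            _ => break, // record was cut short
        };
        let payload = &bytes[start..end];
        if checksum(payload) != sum {
            break; // record is corrupted
        }
        records.push(payload.to_vec());
        pos = end;
    }
    Ok(records)
}
```

Producing the full checkpoint from the recovered records (point 2) is left out; in this sketch it would be a separate background step that is free to take as long as it needs.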

You'll want to look into how databases like Postgres, etcd, and sqlite do their file syncing.

Optimistically, I would imagine that letting the kernel defer the write would make more efficient use of disk spin-ups (more changes can be buffered before spinning up the disk) and avoid interfering with the latency of other, higher-priority IO on the same machine.

It sounds like there isn't a trivial way to achieve such a thing though, so I'm content to accept that adding the File::sync_all is probably the best solution.
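For reference, the write, sync, then rename sequence being discussed could look roughly like this. It is a minimal sketch rather than a definitive implementation: the function name save_atomically and its tmp and dest parameters are hypothetical, and it assumes both paths live on the same filesystem.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Write `data` to `tmp`, flush it to disk, then rename it over `dest`.
/// (Hypothetical helper; `tmp` and `dest` are assumed to be on the same filesystem.)
fn save_atomically(tmp: &Path, dest: &Path, data: &[u8]) -> io::Result<()> {
    let mut file = File::create(tmp)?;
    file.write_all(data)?;
    // Ask the OS to flush the file's data and metadata before we rename.
    file.sync_all()?;
    drop(file);
    // Only now replace the destination, so `dest` never names a partial file.
    fs::rename(tmp, dest)?;
    Ok(())
}
```

A call such as save_atomically(Path::new("state.tmp"), Path::new("state.json"), &bytes) would then replace the two-step fs::write and fs::rename pair from the original post.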


My conclusion was based on these two answers from Stack Exchange:

https://unix.stackexchange.com/questions/464382/which-filesystems-require-fsync-for-crash-safety-when-replacing-an-existing-fi
https://stackoverflow.com/questions/7433057/is-rename-without-fsync-safe

From those I concluded that, in the best case, the filesystem commits the newly created file's data before the rename, but its developers regard this pattern as an application bug that they are begrudgingly working around. In the worst case, as stated in the original post, the data of the newly created file is not guaranteed to be committed before the rename, and you can end up with a partial file after a power loss or other system crash.


You typically only need File::sync_data() here, though the practical difference is usually going to be zero. sync_data() does enough to ensure that you can read back all the data that was written, but doesn't bother updating any metadata about the file that isn't required to read the data. This usually means that if the file's length has changed then it's equivalent to sync_all(), and if you're creating a new file then the length will have changed (unless you wrote zero bytes)... but it's only actually required to use sync_all() if you care about other metadata of the file (like timestamps or permissions).
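As a rough illustration of the difference, here is the temporary-file write using sync_data() instead of sync_all(). The helper name write_and_sync_data is made up for the example; the std calls themselves (File::create, write_all, sync_data) are the real API.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;

/// Write `data` to `path` and wait until the contents are readable after a crash.
/// (Hypothetical helper name, for illustration only.)
fn write_and_sync_data(path: &Path, data: &[u8]) -> io::Result<()> {
    let mut file = File::create(path)?;
    file.write_all(data)?;
    // sync_data flushes the file contents plus any metadata needed to read
    // them back (such as the length), but may skip things like timestamps.
    // Swap in file.sync_all() if that other metadata matters to you.
    file.sync_data()
}
```

The rename to the final destination would follow this call exactly as before.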

But, yeah, your general conclusion is correct: you can't safely avoid explicitly syncing the data to disk without relying on specific knowledge of which filesystem is being used and how it's configured, even just on Linux, let alone cross-platform. No platform's file APIs that I know of make any promises about whether the order of operations will be preserved in the case of a system failure.


I believe the correct way would be to write a wrapper that implements Drop, and to scope the instance.

I’m not sure I understood what’s going on here correctly though…

