I need to write and replace existing HTML static files on a very frequent basis. These static files are being read by nginx web server running on FreeBSD. There should be no circumstances where a 404 not found error could occur. That means, while the file is being replaced, nginx should either read the old version or the newly replaced version, and never a 404 or file corruption problem.
My problem is similar to the one described in this thread.
What is the best way to implement this using Rust?
Yes, rename is atomic in the sense that you need it to be, at least on Linux (I have no idea what Windows does). You can see this in the man page.
If newpath already exists, it will be atomically replaced, so that there is no point at which another process attempting to access newpath will find it missing. However, there will probably be a window in which both oldpath and newpath refer to the file being renamed.
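In Rust this maps directly onto `std::fs::rename`. A minimal sketch of the write-then-rename pattern (the helper name and the fixed temp-file name are mine; production code would want a unique temp name, e.g. via the `tempfile` crate):

```rust
use std::fs;
use std::io::Write;
use std::path::Path;

// Hypothetical helper: write `contents` to a temp file in the same
// directory as `dest` (rename only works within one filesystem),
// then atomically rename it over `dest`. Readers always see either
// the old file or the new one, never a missing or partial file.
fn replace_atomically(dest: &Path, contents: &[u8]) -> std::io::Result<()> {
    let dir = dest.parent().unwrap_or_else(|| Path::new("."));
    let tmp = dir.join(".new-version.tmp"); // illustrative fixed name
    {
        let mut f = fs::File::create(&tmp)?;
        f.write_all(contents)?;
    }
    // rename(2) atomically replaces an existing destination.
    fs::rename(&tmp, dest)
}
```

Note this sketch doesn't yet fsync anything, so it's atomic with respect to concurrent readers but not with respect to a crash (more on that below in the thread).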
I don't think fast_rsync even provides functionality for writing to files. It's a library for using rsync's protocol.
So, you should use it if you want rsync's protocol. Even if you do, fast_rsync does not implement writing to files, so if you want atomic replacement for that, you will also need to use rename.
If you're using nginx, I'd assume you're on a Unix-like system (Linux or, as in your case, FreeBSD). POSIX requires rename to atomically replace an existing destination on both, so I don't think there should be any issues?
Thank you guys! Looks like "rename" is the way to go. The reason I'm so concerned is because of the difficulty of testing it. I just have to deploy and 'pray' that nginx would never return a 404.
A common practice is to deploy the new version at a separate document root, then update a symlink to point to the new directory. Note that you still need to use rename to change the symlink atomically, see this answer. You may also need to change the nginx configuration to enable following symlinks. Another benefit is that you can quickly change the symlink back to an older version if you need a quick rollback. For example, this article describes this approach. You can also take a look at capistrano that also uses symlinks for atomic deployment.
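Flipping the symlink itself also comes down to rename, since a symlink can't be modified in place: create the new link under a temporary name, then rename it over the old one. A sketch (function and path names are mine):

```rust
use std::fs;
use std::os::unix::fs::symlink;
use std::path::Path;

// Hypothetical helper: atomically repoint `link` (e.g. "current")
// at `new_target` (e.g. "releases/v2"). The link is created under a
// temporary name first, because symlink(2) cannot overwrite, and
// then rename(2) swaps it into place atomically.
fn flip_symlink(link: &Path, new_target: &Path) -> std::io::Result<()> {
    let tmp = link.with_extension("tmp");
    let _ = fs::remove_file(&tmp); // discard any stale temp link
    symlink(new_target, &tmp)?;
    fs::rename(&tmp, link) // readers see the old or the new target
}
```

Rolling back is then just another `flip_symlink` call pointing at the previous release directory.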
If you care (... and you may not! :), things are a lot more complicated than they first appear. While rename is operationally atomic, that doesn't necessarily give you atomicity in the face of a system reboot or crash.
From this article:
Similarly, if you encounter a system failure (such as power loss, ENOSPC or an I/O error) while overwriting a file, it can result in the loss of existing data. To avoid this problem, it is common practice (and advisable) to write the updated data to a temporary file, ensure that it is safe on stable storage, then rename the temporary file to the original file name (thus replacing the contents). This ensures an atomic update of the file, so that other readers get one copy of the data or another. The following steps are required to perform this type of update:
create a new temp file (on the same file system!)
write data to the temp file
fsync() the temp file
rename the temp file to the appropriate name
fsync() the containing directory
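The five steps above can be sketched in Rust like this (names are illustrative; as before, real code would use a unique temp-file name):

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

// Hypothetical crash-safe replacement following the dance above:
// temp file on the same filesystem, write, fsync the file, rename,
// then fsync the containing directory so the rename itself is durable.
fn durable_replace(dest: &Path, contents: &[u8]) -> std::io::Result<()> {
    let dir = dest.parent().unwrap_or_else(|| Path::new("."));
    let tmp = dir.join(".durable.tmp");

    let mut f = File::create(&tmp)?; // 1. create temp file (same fs!)
    f.write_all(contents)?;          // 2. write data to the temp file
    f.sync_all()?;                   // 3. fsync() the temp file
    drop(f);
    fs::rename(&tmp, dest)?;         // 4. rename over the target
    File::open(dir)?.sync_all()      // 5. fsync() the containing directory
}
```

Opening a directory read-only and calling `sync_all` on it is how you fsync a directory from Rust on Linux and FreeBSD; there's no dedicated std API for it.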
One of the exciting outcomes (that I debugged in the past year...) if you do not do this dance can be: the destructive rename succeeds, but the resulting file ends up zero-length.
Thanks for your advice! I'm now considering serving the file directly (cached HTML string) instead of writing it to the file system for nginx to pick it up.
You may not need to go down the crash-consistent filesystem I/O rabbit hole: if what you really need is atomically updated data as observed by nginx, destructive rename is a fantastic option (and possibly the only filesystem-level behavior you can rely on).
What you can focus on instead is guaranteeing a couple of things about your deployment operations:
That they are transactional: there's no way to observe an in-progress (or partially applied) deployment.
That they are idempotent: the fix for a partial or failed deployment is to re-deploy.
Using a symlink to atomically flip from old to new is a great approach... on the other hand, it really depends on what the consumers of the data will do if it's wrong or inconsistent -- if each static asset is viewed in isolation, it might be ok to individually update each file.
After learning about all the ways things can go wrong, I wouldn't blame you for wanting to avoid a filesystem entirely!