Linux x86_64: mmap: correct way to enlarge file, overwrite page

Context: write ahead logs, append-only btrees, database recovery

On x86_64 Linux with mmap, can anyone share 'correct' code (along with guarantees during OS crashes) for:

  1. enlarging a mmap-ed file

  2. overwriting an existing page?

'Correct' here means being able to make non-trivial, useful statements about which pages, if the OS crashes:

  • are guaranteed to be uncorrupted
  • may be corrupted

Thanks!

I don’t have any code of the kind you requested to share, and I suspect that no one else will either unless you are more specific about what kind of crashes you care about.

Unless there is some mechanism to prevent this that I am unaware of, I would think that a root user would be able to arbitrarily corrupt the OS code on disk, such that on the next boot some arbitrary condition could cause arbitrary memory to be corrupted arbitrarily, with or without a crash.

Presumably, cases like that are uninteresting. But which interesting cases are left after those are removed?

I am unaware of any mechanism that limits the ability of bugs in Linux itself to corrupt memory, though there may be some I don’t know about.

Assuming we leave out Linux bugs, and assume that you are running non-corrupted OS code, I am not sure what remaining sources of OS crashes there are.

Theoretically, even things like OOM conditions, for example, would be recoverable because a random high memory process would be killed. Is that the kind of thing you were referring to when you said “OS crashes”?

Valid criticism. Let me try to clarify:

I am operating under the assumption that: files not opened stay safe; files opened in read-only mode stay safe; and the only things that may get corrupted are pages of memory that we open in rw-mode and modify (anywhere on the page).

I am assuming that the kernel is bug-free (unrealistic, but this is part of the model).

Not worried about OOM. I am only concerned about modified pages of mmapped memory: when they are written out, a page may be partially written, it may be corrupted during the write, ...
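
To make the two operations concrete, here is a minimal sketch in C under this model. It is not a definitive answer. The helper names (grow_mapping, overwrite_page, demo) are made up for illustration, and the sketch assumes a MAP_SHARED mapping, lengths that are multiples of the system page size, and page_size equal to the system page size. The only statement it tries to support is this: once msync with MS_SYNC has returned 0 for a page, the kernel has reported that page as written. If the OS crashes before that, the page may hold the old contents, the new contents, or (under the model above) a partial mix. Detecting a partially written page would need something extra, such as a per-page checksum, which is not shown here.

```c
#define _GNU_SOURCE                 /* for mremap() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: grow the file behind fd from old_len to new_len
 * bytes and return a mapping of the whole file (MAP_FAILED on error).
 * The mapping may move, so old pointers into it must not be reused. */
void *grow_mapping(int fd, void *old_map, size_t old_len, size_t new_len)
{
    assert(new_len > old_len);
    /* Extend the file first: touching mapped pages beyond EOF raises SIGBUS. */
    if (ftruncate(fd, (off_t)new_len) != 0)
        return MAP_FAILED;
    /* Ask the kernel to persist the new size before the new pages are used. */
    if (fsync(fd) != 0)
        return MAP_FAILED;
    return mremap(old_map, old_len, new_len, MREMAP_MAYMOVE);
}

/* Hypothetical helper: overwrite one page in place and block until the
 * kernel reports it written. Returns 0 on success, -1 on error. */
int overwrite_page(void *map, size_t page_no, const void *src, size_t page_size)
{
    char *dst = (char *)map + page_no * page_size;
    memcpy(dst, src, page_size);
    return msync(dst, page_size, MS_SYNC);
}

/* Usage sketch: create a one-page temp file, grow it to two pages,
 * overwrite the new page, and read it back through the file descriptor.
 * Returns 0 if the bytes read back match what was written. */
int demo(void)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";
    size_t ps = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = ps;
    void *map = MAP_FAILED, *bigger = MAP_FAILED;
    char *page = NULL, *check = NULL;
    int rc = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                   /* the file disappears once fd is closed */

    page = malloc(ps);
    check = malloc(ps);
    if (!page || !check || ftruncate(fd, (off_t)ps) != 0)
        goto out;
    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out;

    bigger = grow_mapping(fd, map, len, 2 * ps);
    if (bigger == MAP_FAILED)
        goto out;
    map = bigger;
    len = 2 * ps;

    memset(page, 0xAB, ps);
    if (overwrite_page(map, 1, page, ps) != 0)
        goto out;
    if (pread(fd, check, ps, (off_t)ps) == (ssize_t)ps &&
        memcmp(page, check, ps) == 0)
        rc = 0;
out:
    if (map != MAP_FAILED)
        munmap(map, len);
    free(page);
    free(check);
    close(fd);
    return rc;
}
```

Calling demo() exercises both helpers against a throwaway file. Note that nothing here makes a page-sized write atomic, so this only narrows down which pages could have been in flight at the time of a crash: the ones modified since the last msync that returned successfully.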
