What advantages does
&[AtomicU8] provide in comparison to the opaque slice type proposed here?
Well, as discussed upthread, using volatile loads is unnecessary; atomics should be enough. Thus there’s no need for a new type:
&[AtomicU8] already has the required semantics.
Also, as I mentioned in my last post, a “possibly aliased byte slice” type can be useful even without
mmap, if you just want to share a buffer between different objects in the same program. So if we want to add convenience functions, e.g. a
memcpy wrapper that copies to an
&mut [u8], it would be better to have them work on a more general type rather than something mmap-specific.
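A convenience wrapper of the kind described might look roughly like this (a sketch only; `copy_from_shared` is a hypothetical name, not a proposed API):

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Copy bytes out of a possibly-aliased buffer into a private one.
/// (`copy_from_shared` is a hypothetical name, not a proposed API.)
fn copy_from_shared(src: &[AtomicU8], dst: &mut [u8]) {
    for (s, d) in src.iter().zip(dst.iter_mut()) {
        // Relaxed is enough here: we only need each load to be a real,
        // non-elidable memory access, not any cross-thread ordering.
        *d = s.load(Ordering::Relaxed);
    }
}

fn main() {
    let shared = [AtomicU8::new(1), AtomicU8::new(2), AtomicU8::new(3)];
    let mut private = [0u8; 3];
    copy_from_shared(&shared, &mut private);
    assert_eq!(private, [1, 2, 3]);
}
```

Nothing here is specific to mmap, which is the point: the same function works for any shared byte buffer.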
Does AtomicU8 with relaxed operations still work properly if another concurrent thread (or, in this case, process) modifies the memory as
*mut u8 without using any atomic operations whatsoever?
If a concurrent process is doing unsynchronized writes to the memory, it seems that
&[AtomicU8] is roughly equivalent to
&[UnsafeCell<u8>]. At least the optimizer will not assume that the data is immutable while borrowed, but data races are still possible and still UB according to the rules. I’m not sure how this would manifest in practice, however.
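For illustration, the reinterpretation under discussion can be written like this (a sketch; it relies on AtomicU8 being documented to have the same size and alignment as u8, and the helper name `as_atomic` is made up here):

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// View a uniquely-borrowed byte slice as a slice of atomics.
/// Sound because `AtomicU8` has the same size and alignment as `u8`,
/// and the `&mut` borrow guarantees no other aliases exist meanwhile.
fn as_atomic(buf: &mut [u8]) -> &[AtomicU8] {
    unsafe {
        std::slice::from_raw_parts(buf.as_mut_ptr() as *const AtomicU8, buf.len())
    }
}

fn main() {
    let mut buf = [0u8; 4];
    let shared = as_atomic(&mut buf);
    // Interior mutability: we can store through the shared reference.
    shared[0].store(42, Ordering::Relaxed);
    assert_eq!(shared[0].load(Ordering::Relaxed), 42);
}
```

This only covers in-process sharing; it says nothing about a misbehaving peer process doing unsynchronized writes.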
If that other thread is in the same process, then that other thread is just wrong and we got UB – this is not about
mmap, just about normal rules for memory shared across threads.
If that other thread is in a different process, I do not think there is any potential for problems in our process. We might observe some strange things because the other process is essentially misbehaving and causing UB internally, but UB stops at process boundaries – so for us, these are just “strange writes”. The same writes might actually occur from a well-behaved process.
If you care about correctness, of course you need to make sure to use proper atomic reads and writes in all processes. But just from a UB perspective, it is impossible for one process to cause UB in another – or rather, it had better be. (As usual, this excludes
root going ahead and modifying
/dev/mem and similar shenanigans…)
If that other thread is in a different process, I do not think there is any potential for problems in our process. We might observe some strange things because the other process is essentially misbehaving and causing UB internally, but UB stops at process boundaries
If we were reading from an I/O port instead of an
mmaped file, a volatile ptr read and write would be enough to avoid UB. Do we really need to use relaxed atomic load-stores to avoid UB here in a single-threaded process? If so, why?
I think volatile reads would also be sufficient to avoid UB in a single-threaded application, you just do not get atomicity guarantees, i.e. your reads could be torn. And you have no control over visibility w.r.t. other processes’ modifications, i.e. no memory orderings.
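As a single-threaded illustration of what volatile gives you (a sketch): the compiler must actually emit the load and may not cache or elide it, but for accesses wider than a byte there is no guarantee against tearing if another process writes concurrently.

```rust
use std::ptr;

fn main() {
    let word: u64 = 0x1122_3344_5566_7788;
    let p = &word as *const u64;
    // The compiler must emit this load and may not elide or cache it.
    // If another *process* were writing to this memory concurrently,
    // an 8-byte volatile load could still observe a torn value
    // (half old bytes, half new) – volatile grants no atomicity.
    let v = unsafe { ptr::read_volatile(p) };
    assert_eq!(v, 0x1122_3344_5566_7788);
}
```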
Personally, I think that atomicity is pretty useless when we talk about accessing byte buffers here, i.e. I am not sure being able to copy bytes out of
&[AtomicU8] will give me anything useful from an
mmap interface usability perspective. Either I get a consistent snapshot of the whole file contents, or I could read from a completely different version with every byte I load and have to plan accordingly, i.e. never load the same offset twice and assume it has not changed in the surrounding code. That reduces me to a
fgetc-like interface, and hence I am probably better off with
read-like from a usability point of view, especially for stuff like
This is why I do not think that
&[AtomicU8] is a useful type for
mmap to return. If you want to use shared mappings to communicate with other processes, you would probably start with
*mut u8 and transmute that into atomic types that implement primitives like spin locks or mailboxes.
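Such a primitive might start out roughly like this (a sketch under assumptions: the pointer targets a live, suitably initialized byte of the shared mapping, and all participating processes access it only atomically; `spin_lock`/`spin_unlock` are hypothetical names):

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Treat one byte of a shared mapping as a spin-lock flag: 0 = unlocked,
/// 1 = locked. Safety: `p` must point to a live, initialized byte that
/// every participating process accesses only via atomic operations.
unsafe fn spin_lock(p: *mut u8) {
    let flag = &*(p as *const AtomicU8);
    // Acquire on success pairs with the Release store in `spin_unlock`.
    while flag
        .compare_exchange(0, 1, Ordering::Acquire, Ordering::Relaxed)
        .is_err()
    {
        std::hint::spin_loop();
    }
}

unsafe fn spin_unlock(p: *mut u8) {
    (&*(p as *const AtomicU8)).store(0, Ordering::Release);
}

fn main() {
    // Stand-in for a byte inside a MAP_SHARED mapping.
    let mut byte = 0u8;
    let p = &mut byte as *mut u8;
    unsafe {
        spin_lock(p);
        // ...critical section over the shared region...
        spin_unlock(p);
    }
    assert_eq!(byte, 0);
}
```

Real cross-process code additionally has to agree on who initializes the flag, which is exactly the kind of protocol design that starts from `*mut u8` rather than from a borrowed slice.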
We don’t have to. But since relaxed atomic load/stores are the same as regular load/stores at an assembly level, we might as well. Compared to volatile load/stores, they’re theoretically more optimizable, and at worst produce the same assembly.
Edit: That is, whereas both you and adamreichold seem to be assuming that volatile accesses have less overhead or lower semantic burden than atomics, it’s really the other way around.
So why are atomics enough, as opposed to this requiring volatile reads and writes? How does reading from memory that can change beneath you differ between an
mmaped file and an I/O port? (Or why does reading from an I/O port need a volatile read instead of a relaxed read?)
As discussed above, it is not as simple as stronger/weaker or more/less overhead. Both access types have different semantics. They both prohibit certain compiler optimizations but enable others. In the single-threaded case, we are basically interested in making sure that any load from the mapped region has to be explicit which both types of access should provide.
I think the main point is that it is not actually the file content that is somehow mapped into the process address space, i.e. your load will not really turn into disk I/O; rather, pages in the kernel’s page cache will be accessed, i.e. ordinary system memory. In the case of an I/O port you actually want your load to become a request to some peripheral (or the bus connecting that peripheral), hence the volatile access to force the system to actually access the port.
Any changes to the file contents will only manifest in the memory mappings when the kernel changes the pages residing in its cache, i.e. normal system memory is changed. This could very well all be happening e.g. in the L2 cache without ever hitting the memory subsystem, therefore the level of memory consistency provided by atomic access is sufficient.