How expensive is Arc::make_mut?

I am thinking of trying to avoid calls to Arc::make_mut, as it appears to me that each make_mut call involves inter-process synchronisation operations, but I am also wondering whether this is actually worthwhile and whether it makes sense. Has anyone tried this and done performance measurements?

Arc::make_mut makes no inter-process calls because Arcs are not shared between processes. Arcs are shared between threads, but it also doesn't do any cross-thread synchronization (i.e. it doesn't lock). The main logic of make_mut uses atomic comparisons to check whether any other thread can view the allocation, so under most circumstances there's no overhead. If you're constantly hammering the atomics, however, the underlying memory synchronization will cause some slowdowns, but that's not an issue for most applications.
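For example, a minimal sketch of the two paths make_mut can take; Arc::as_ptr is used here only to observe whether the allocation changed:

```rust
use std::sync::Arc;

fn main() {
    // Uniquely owned: make_mut hands back &mut to the existing allocation.
    // No clone happens, only an atomic check of the reference counts.
    let mut unique = Arc::new(vec![1, 2, 3]);
    let before = Arc::as_ptr(&unique);
    Arc::make_mut(&mut unique).push(4);
    assert_eq!(before, Arc::as_ptr(&unique)); // same allocation

    // Shared: another Arc points at the allocation, so make_mut must
    // first clone the inner value into a fresh allocation.
    let mut shared = Arc::new(vec![1, 2, 3]);
    let other = Arc::clone(&shared);
    let before = Arc::as_ptr(&shared);
    Arc::make_mut(&mut shared).push(4);
    assert_ne!(before, Arc::as_ptr(&shared)); // new allocation
    assert_eq!(*other, vec![1, 2, 3]);        // the old copy is untouched
}
```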


Yes, and I think atomics can be quite slow, although perhaps not in the case of Arc::make_mut when there are no other Arc pointers to the same allocation.

Well, what exactly do you expect? Atomics are the fastest kind of synchronization you can get (and if you are using threads, you need synchronization, as data races are UB). If you need make_mut(), call it and move on. If you find by measurement that you are spending too much time in make_mut, then you'll need to rethink the architecture so that it inherently requires less synchronization in the first place; the solution isn't going to be some sort of magic that still synchronizes but is somehow faster than atomics.


OP, are you aware that (from the Arc::make_mut documentation):

If there are other Arc pointers to the same allocation, then make_mut will clone the inner value to a new allocation to ensure unique ownership.

If you see that Arc::make_mut is slow, it might be that you're using it on an Arc that's not unique, so it must spend time cloning the inner value.


The change I am trying is to call make_mut less often: call make_mut once, then use the resulting &mut say 50 times (which might be typical), rather than calling make_mut 50 times.

It does mean refactoring my code somewhat, but hopefully it will work out nicely! Something along these lines, with a made-up Page type just for illustration:
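```rust
use std::sync::Arc;

// Hypothetical page type, just for illustration.
#[derive(Clone, Default)]
struct Page {
    records: Vec<u64>,
}

// Before: make_mut is called on every mutation, so the uniqueness
// check (and, in the worst case, a clone) happens for each record.
fn insert_all_naive(page: &mut Arc<Page>, records: &[u64]) {
    for &r in records {
        Arc::make_mut(page).records.push(r);
    }
}

// After: make_mut is called once, and the resulting &mut Page is
// reused for every mutation.
fn insert_all_hoisted(page: &mut Arc<Page>, records: &[u64]) {
    let page = Arc::make_mut(page);
    for &r in records {
        page.records.push(r);
    }
}
```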

Great example of why “magical languages” don't work. A “magical language” may use make_mut and &mut where feasible, but it couldn't “refactor your code somewhat”, because very often such refactorings depend on things not included in the code itself; they are in the “problem specification” document which, more often than not, only exists in the developer's head.

Note, BTW, that make_mut is about 15-20 times more expensive than a single dereference of &mut (and more expensive still on some old architectures like Piledriver), which means there are limits to how much you may win from the transition from make_mut to &mut. But that's definitely a win a developer may organize, and a “magical language” couldn't.

Because the conversion away from make_mut may make your code both significantly faster and significantly slower! “Call make_mut once, use &mut 50 times” reduces the number of expensive make_mut operations, but it also makes “critical sections” longer and increases the chances of hitting the contention case (where the cost of make_mut goes through the roof).

AGI may eliminate the need for strict, formal programming languages, as it would be able to take a vague description of a program in English or Klingon and write a program in Rust or some other strict language, but that's not even remotely close to how your “dream language” was supposed to operate, is it?


Well, it is done: I have a new struct MutPage which is used to avoid large numbers of calls to make_mut. Whether it will actually make any measurable difference to performance is pretty doubtful; my guess is that other factors dominate. I might try to measure and see, though it will probably need a contrived test to demonstrate it. Roughly this kind of shape (the Page type and methods here are simplified placeholders, not my real code):
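```rust
use std::sync::Arc;

// Simplified stand-in for the real page type.
#[derive(Clone, Default)]
struct Page {
    records: Vec<u64>,
}

// One possible shape for the wrapper: it calls make_mut once on
// construction and then hands out plain &mut access for the rest of
// its lifetime, so repeated mutations pay no further atomic checks.
struct MutPage<'a> {
    page: &'a mut Page,
}

impl<'a> MutPage<'a> {
    fn new(arc: &'a mut Arc<Page>) -> Self {
        MutPage { page: Arc::make_mut(arc) }
    }

    fn insert(&mut self, record: u64) {
        self.page.records.push(record);
    }
}

fn main() {
    let mut page = Arc::new(Page::default());
    let mut mut_page = MutPage::new(&mut page);
    for r in 0..10 {
        mut_page.insert(r);
    }
}
```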

Edit: I tried to do some measurements... I am not at all sure about my methodology, but it seemed like make_mut takes about 11 nanoseconds on my PC, whereas a typical instruction takes maybe 0.5 nanoseconds. I think!!! This may all be very wrong. Anyway, it seems it is very fast in absolute terms, but relatively expensive compared to a typical operation. As above, it is probably impossible to measure any speed-up without some sophisticated benchmarking tools; it would probably be a fraction of a percent.
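For the curious, a rough microbenchmark along these lines (the exact numbers will vary by machine and compiler settings; something like Criterion would give more trustworthy results):

```rust
use std::hint::black_box;
use std::sync::Arc;
use std::time::Instant;

fn main() {
    const ITERS: u32 = 10_000_000;
    let mut arc = Arc::new(0u64);

    let start = Instant::now();
    for _ in 0..ITERS {
        // Uniquely owned, so this is the fast path: an atomic check,
        // then a plain mutable reference.
        *Arc::make_mut(black_box(&mut arc)) += 1;
    }
    let elapsed = start.elapsed();

    println!(
        "~{:.1} ns per make_mut (fast path), total {:?}",
        elapsed.as_nanos() as f64 / ITERS as f64,
        elapsed
    );
    black_box(&arc);
}
```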

Well, it will have to clone the inner value the first time a "page" is modified, but typically it will then make a bunch of mutations. The 50 I quoted earlier is probably more than the average; I guess in the average case it would be more like 10 mutations or so, if a single record is being added to a page.
