Passing an Arc to a function then returning it as function result

I seem to have got into a style of having functions that take an Arc (moved in), and then return it as the function result, in order to avoid cloning.

It would also be possible to pass it by reference, but my guess is the extra indirection would result in some minor inefficiency.

But still, this style leaves me feeling a little uneasy. The function isn't computing anything, so having it return simply what it was given in the first place seems a little odd, and perhaps confusing.

Questions:

(1) Does it even make any sense in the first place? Is it actually more efficient?

(2) Is there another way of doing this without incurring any performance overhead?

( There is an example here )

Using the same logic as &String arguments vs &str, could you just pass in a &T reference instead of the Arc<T>? You still have a single level of indirection, but by dereferencing the Arc before passing it in, you can avoid the cost of incrementing the reference count.

That said, looking from the outside this feels like a premature optimisation. The atomic increment is probably negligible considering your function is already locking a mutex, which involves blocking when the lock is contested and even the uncontested case will probably require an atomic load.

2 Likes

It turns out in this case the mutex lock is never contested, because there is a higher level lock. I guess I could have a "cheating unsafe dummy Mutex" or something!

For sure, the optimisation is extremely minimal, but, well... why go slower when you can go faster?

Opportunity cost. The effort you’re spending to eke out a tiny performance improvement here can likely be spent more effectively on improving quality¹ elsewhere.

¹ For whatever metric of quality you care to use: Mean performance, p95 performance, reliability, feature completeness, etc.

5 Likes

I like to refer to Donald Knuth's full comment about premature optimisation:

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

In this case, I'd say the performance benefits don't outweight the impact on ergonomics. That said, my priorities when programming tend to be ergonomics/maintainability and complete correctness (i.e. covering every edge case and providing detailed errors), with performance coming in 3rd place.

I remember back when actix-web used a similar sort of "cheating unsafe dummy Cell" to avoid the overhead from RefCell (among other things)... When it was brought to the public's attention and people looked closer at how the project approached unsafe, the project author was absolutely crucified by the community. I'm not going to link to the original Reddit/GitHub threads because they're pretty toxic, but Steve Klabnik wrote up some reflections on the event.

4 Likes

I think there will always be tension between the desire for safety versus the desire for speed.

I suppose a possible compromise would be a Mutex that panicked if there was contention. That would not be unsafe. I am not advocating for this though, I think the wide-spread adoption of parking_lot is unfortunate.

1 Like

Coming back to the original question made me think a bit more about how "passing by reference" is implemented. My assumption has always been that the parameter is stored in memory ( where the original & operation occurs ) and the address of that memory is then passed around.

Some experiments seem to indicate this is how it works in practice with the current compiler.

However, I don't see why (under suitable conditions) pass by reference should not be implemented by "moving then moving back", as my example function does explicitly, as an optimisation to avoid storing the original Arc at all and taking an address. I don't think the current compiler does this optimisation though.

2 Likes

Detecting those "suitable conditions" likely has some subtle edge cases that make the analysis difficult for the compiler. In particular, shared references are allowed to alias and the move-then-move-back strategy risks an inconsistent state being visible elsewhere in the program.

If your function gets inlined and you're passing a reference to a locally-owned Arc, I suspect that the compiler will optimize away the reference-then-dereference operation.

I can see for mutable references, where there are multiple copies of the reference, it could get messy as they would all have to be updated when the underlying value is mutated. [ Edit: but hold on... maybe the rules don't allow that! ]

But for an immutable reference, if the referred to value is "small" (such as an Arc), it seems valid to pass it around "by value". The "borrowed" value referred to doesn't change, so why store it in memory and introduce an indirection? Perhaps I am missing something.

Note that if the function accepts &T, you have arc: Arc<T>, and you pass &*arc, then at runtime this will be simply "take the pointer which already exists inside the Arc and use it", no "reference-then-dereference".

I cannot use that in my example, because the Arc may be cloned ( in the uncommon branch ).

    fn used(&mut self, p: PageInfoPtr) -> PageInfoPtr {
        let (mut pos, counter) = {
            let mut p = p.u.lock().unwrap();
            p.counter += 1;
            (p.heap_pos, p.counter)
        };
        if pos == usize::MAX {
            pos = self.v.len();
            self.v.push(p.clone());
            self.move_up(pos, counter);
        } else {
            self.move_down(pos, counter);
        }
        p
    }

I did make a minimal test, and the disassembly appears to show a difference in the code.
[ Edit: but perhaps not in release mode....I forgot to check that! In release mode it looks like the code is identical... So I think that answers question (1), it isn't more efficient, in release mode the code will be identical ]

Arc moved and moved back:

Arc passed by reference:

Rust's "immutable" references are really just shared references. Your logic holds only when there are no UnsafeCells in the picture, but Arcs often contain Mutexes— From the type system's point of view, there's no way to tell that the Mutexes are really on the heap and won't be affected by moving.

The compiler also needs to ensure that no reference to the temporarily-moved value is kept beyond the point that the value is returned to its proper place, as that would leave a dangling pointer.

It's certainly possible to make the compiler do these checks, but validating correctness will be a significant amount of work. This is straying into IRLO territory, though; you should consider bringing up this idea over there.

Per my post above, it turns out that in release mode (which I forgot to check) the compiler does appear to do the optimisation I was coding "by hand". So I think that clears the matter up, and I will revert to passing by reference.

4 Likes

Interior mutability isn't reflected in the type system from a user's perspective, but the compiler does have perfect knowledge of which types have interior mutability. Optimizations like this are exactly why all interior mutability in Rust has to go through UnsafeCell to be sound.

2 Likes