Arc - zero cost abstraction - optimization

Hi,

I was playing with some code in Rust and trying to see how much overhead comes from Arc, especially since it is supposed to be a zero-cost abstraction.

In the first example it is obvious that Arc is useless, so I was checking whether Rust would optimize it away as a zero-cost abstraction. However, here is what I got in the assembly:

    movq    $21, 16(%rax)
    movq    $22, 24(%rax)

    movq    %rax, (%rsp)
    movq    $21, 56(%rsp)
    leaq    64(%rsp), %rax
    movq    $23, 64(%rsp)

The first two instructions are obviously redundant here, while the same thing without Arc is optimized nicely.

Is there a way to set up compile-time optimization for such scenarios?

use std::sync::Arc;

struct S {
    f1: u64,
    f2: u64,
}

fn main() {
    let s1 = S { f1: 5, f2: 6 };
    // print_s(&s1);
    // s1.f1 = 8;
    //   print_s(&s1);

    let s2 = S { f2: 12, ..s1 };
    print_s(&s2);

    let s3 = Arc::new(S { f1: 21, f2: 22 });
    let s4 = S { f2: 23, ..*s3 };
    print_s(&s4);
}

fn print_s(s: &S) {
    println!("S: {}, {}", s.f1, s.f2)
}


Output:

S: 5, 12
S: 21, 23

Errors:

   Compiling playground v0.0.1 (/playground)
    Finished `release` profile [optimized] target(s) in 0.60s
     Running `target/release/playground`

on the memory side, Arc, like any reference-counted smart pointer type including C++'s shared_ptr, can NEVER have zero memory overhead, since the bookkeeping information (the "control block" in C++ terms, mainly the strong and weak counts) must be stored somewhere.

rust's implementation stores them alongside the data (a.k.a. colocated), so Arc is a thin pointer [1]. C++'s shared_ptr, by contrast, is a wide pointer: the control block generally lives in a separate allocation, and although make_shared() typically performs a single colocated heap allocation, std::shared_ptr still stores the data pointer and the control block pointer separately.
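The "thin pointer" claim can be checked with `size_of`. This is a minimal sketch that assumes the default global allocator (see the footnote) and a sized `T`; the helper name `arc_is_thin` is made up for illustration.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Hypothetical helper: true if `Arc<T>` occupies exactly one pointer.
/// Only meaningful for sized `T` with the default allocator.
fn arc_is_thin<T>() -> bool {
    size_of::<Arc<T>>() == size_of::<usize>()
}

fn main() {
    // The counts live in the heap allocation next to the data,
    // so the handle itself is a single pointer.
    println!("Arc<u64> is thin: {}", arc_is_thin::<u64>());
    println!("Arc<[u8; 64]> is thin: {}", arc_is_thin::<[u8; 64]>());

    // Unsized payloads such as `str` still need a length next to the
    // pointer, so those handles are wide regardless of the counts.
    println!("Arc<str> is {} bytes", size_of::<Arc<str>>());
}
```

The payload size does not affect the handle size; only unsized payloads make the handle wider.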

for the runtime overhead, dereferencing an Arc, i.e. converting from Arc<T> to &T, just needs to offset the pointer by a small constant, which is basically free on modern CPUs (it usually folds into the addressing mode of the load or store, like the 16(%rax) above). once you have the data pointer, it's indistinguishable from any other data pointer of the same type.
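A small sketch of that point, reusing the struct `S` from the question: the `&S` obtained by dereferencing an `Arc<S>` is an ordinary reference to the payload inside the allocation. The helper names `sum` and `deref_matches_as_ptr` are made up for illustration.

```rust
use std::sync::Arc;

struct S {
    f1: u64,
    f2: u64,
}

/// Works on any `&S`, no matter where the `S` lives.
fn sum(s: &S) -> u64 {
    s.f1 + s.f2
}

/// Hypothetical helper: checks that dereferencing the Arc yields the same
/// address that `Arc::as_ptr` reports for the payload.
fn deref_matches_as_ptr(a: &Arc<S>) -> bool {
    let r: &S = &**a; // &Arc<S> -> Arc<S> -> S -> &S
    std::ptr::eq(r, Arc::as_ptr(a))
}

fn main() {
    let on_stack = S { f1: 5, f2: 6 };
    let shared = Arc::new(S { f1: 21, f2: 22 });

    println!("{}", sum(&on_stack)); // prints 11
    println!("{}", sum(&shared)); // deref coercion from &Arc<S>; prints 43
    println!("{}", deref_matches_as_ptr(&shared)); // prints true
}
```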

the only real runtime "overhead" of Arc comes from updating the reference counts when you clone or drop it, especially under contention between threads. but you usually aren't doing that on critical paths anyway.
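To make the counter updates visible, here is a minimal sketch using `Arc::strong_count`; the function name `counts_around_clone` is made up for illustration. Only `Arc::clone` and `drop` touch the counter, while reading through the Arc does not.

```rust
use std::sync::Arc;

/// Hypothetical helper: returns the strong count before a clone,
/// while the clone is alive, and after the clone is dropped.
fn counts_around_clone() -> (usize, usize, usize) {
    let a = Arc::new(5u64);
    let before = Arc::strong_count(&a);

    let b = Arc::clone(&a); // atomic increment of the strong count
    let during = Arc::strong_count(&a);

    let _value = *b; // plain read of the payload, counter untouched
    drop(b); // atomic decrement of the strong count
    let after = Arc::strong_count(&a);

    (before, during, after)
}

fn main() {
    println!("{:?}", counts_around_clone()); // prints (1, 2, 1)
}
```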

I don't know what your expectation or definition of "zero cost abstraction" is, but as a heap-allocated smart pointer, Arc is practically zero cost to me.


  1. ignoring stateful allocators


and there is even work going on to remove that: "Make `Rc<T>::deref` and `Arc<T>::deref` zero-cost" by EFanZh, Pull Request #132553 in rust-lang/rust on GitHub.
This PR changes the layout so that Arc (and Rc) point directly to T, so dereferencing them is actually zero-cost. Accessing the control block wouldn't be zero-cost anymore, but the PR assumes that this happens less often.


That's the case if you have an Arc<T>. If you have a &Arc<T>, that's basically a pointer to a pointer, so you have to perform a read through the first one in order to get a &T.
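A sketch of the difference at the signature level, reusing `S` from the question; the function names are made up for illustration. Whether the extra load survives in the generated code depends on inlining and optimization, so treat this as an illustration of the types rather than a benchmark.

```rust
use std::sync::Arc;

struct S {
    f1: u64,
    f2: u64,
}

/// Takes a reference to the Arc handle: conceptually this loads the
/// Arc's pointer first and then the field through it.
fn total_via_arc_ref(s: &Arc<S>) -> u64 {
    s.f1 + s.f2
}

/// Takes a plain reference to the payload: a single level of indirection.
fn total_via_ref(s: &S) -> u64 {
    s.f1 + s.f2
}

fn main() {
    let shared = Arc::new(S { f1: 21, f2: 22 });

    println!("{}", total_via_arc_ref(&shared)); // prints 43

    // Dereference once at the call site and pass the resulting &S along.
    let plain: &S = &shared;
    println!("{}", total_via_ref(plain)); // prints 43
}
```

Taking `&S` in function signatures where shared ownership is not needed also keeps those functions usable with values that are not in an Arc at all.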

