Curious about Arc implementation details

wyang5 · March 6, 2024, 10:03pm

ArcInner is defined as:

#[repr(C)]
struct ArcInner<T: ?Sized> {
    strong: atomic::AtomicUsize,
    weak: atomic::AtomicUsize,
    data: T,
}

Why is data T and not *T, a pointer to a T elsewhere on the heap? I could be wrong but I believe C++'s shared_ptr uses the latter approach.

As I understand it, when the strong count hits zero, drop_in_place is called on data. Assuming there are weak pointers keeping this ArcInner alive, no memory is freed (unless data's destructor frees stuff elsewhere). So what's the point of calling drop_in_place, to comply with Arc semantics?

Why is data not declared as the first field? It seems to me that there is some unnecessary complexity in calculating alignment and byte offsets in Arc::from_raw. If data were the first field, wouldn't it be possible to simply cast *mut T as *mut ArcInner<T>?

kpreid · March 6, 2024, 10:12pm

Why is data T and not *T, a pointer to a T elsewhere on the heap?

If you put a pointer there, then every access to the data would require following two pointers: one to the ArcInner and then another to the T. That'd be significantly slower.
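To make the two-pointer cost concrete, here is a minimal sketch that uses Arc<Box<u64>> as a stand-in for a hypothetical Arc that stored a pointer to a T allocated elsewhere. The function names are made up for illustration; only Arc::as_ptr and plain dereferencing come from the standard library.

```rust
use std::sync::Arc;

/// One hop: the u64 lives inside the Arc's own heap allocation,
/// right after the two counters.
fn read_inline(a: &Arc<u64>) -> u64 {
    **a
}

/// Two hops: Arc allocation -> Box pointer -> a separate heap allocation.
/// This approximates what a `data: *T` layout would cost on every access.
fn read_indirect(a: &Arc<Box<u64>>) -> u64 {
    ***a
}

/// Returns true when the payload sits at a different address from the
/// pointer slot stored inside the Arc allocation.
fn payload_is_elsewhere(a: &Arc<Box<u64>>) -> bool {
    let slot: *const Box<u64> = Arc::as_ptr(a); // the Box inside the Arc allocation
    let payload: *const u64 = &***a;            // the u64 the Box points to
    slot as usize != payload as usize
}
```

With the inline layout, the counters and the value also share one allocation, so creating an Arc costs one allocation rather than two.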

So what's the point of calling drop_in_place, to comply with Arc semantics?

Well, the T has to be dropped some way, or there would be a memory leak. If you mean some other implementation of dropping the T, can you say more?
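A small sketch of the observable behavior, assuming a made-up Noisy type whose destructor sets a flag: the value is dropped as soon as the last strong reference goes away, even though an outstanding Weak still keeps the allocation holding the counters alive.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Weak};

/// Hypothetical payload that records when its destructor runs.
struct Noisy(Arc<AtomicBool>);

impl Drop for Noisy {
    fn drop(&mut self) {
        self.0.store(true, Ordering::SeqCst);
    }
}

/// Returns (destructor_ran, upgrade_failed), both observed while a Weak
/// is still alive.
fn dropped_while_weak_alive() -> (bool, bool) {
    let flag = Arc::new(AtomicBool::new(false));
    let strong = Arc::new(Noisy(Arc::clone(&flag)));
    let weak: Weak<Noisy> = Arc::downgrade(&strong);

    drop(strong); // strong count reaches zero: the Noisy value is dropped here

    let destructor_ran = flag.load(Ordering::SeqCst);
    let upgrade_failed = weak.upgrade().is_none();
    (destructor_ran, upgrade_failed)
}
```

If T owned heap memory of its own (a Vec or String, say), that memory would be released at the same point; only the block containing the counters and the now-dead bytes of T has to wait for the last Weak.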

Why is data not declared as the first field?

It can be dynamically sized, and the compiler requires such fields to be the last field. But note that there isn't necessarily any significant added cost to having the ArcInner fields at the front; if you have Arc<MyStruct> then accessing fields of MyStruct is itself adding a constant offset to the pointer, and the optimizer will easily be able to combine the two offsets into one.
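As a rough illustration, here is a sketch with a hypothetical Inner struct that mirrors the layout quoted above (it is not the real ArcInner, and the helper names are assumptions). It shows the unsized payload living in the last field, and that the data offset is a fixed function of T's alignment.

```rust
use std::mem::size_of;
use std::sync::atomic::AtomicUsize;

/// Hypothetical mirror of the quoted layout, for illustration only.
#[allow(dead_code)]
#[repr(C)]
struct Inner<T: ?Sized> {
    strong: AtomicUsize,
    weak: AtomicUsize,
    data: T, // an unsized field is only allowed in last position
}

impl<T> Inner<T> {
    fn new(data: T) -> Self {
        Inner { strong: AtomicUsize::new(1), weak: AtomicUsize::new(1), data }
    }
}

/// Size of the two counters that precede the data.
fn header_size() -> usize {
    2 * size_of::<usize>()
}

/// Byte offset of `data` from the start of the struct.
fn data_offset<T>(inner: &Inner<T>) -> usize {
    let base = inner as *const Inner<T> as usize;
    let field = &inner.data as *const T as usize;
    field - base
}

/// A payload with a large alignment pushes `data` further back.
#[allow(dead_code)]
#[repr(align(64))]
struct Overaligned(u8);

/// Unsizing works because the dynamically sized part is the last field.
fn unsized_len() -> usize {
    let sized: Box<Inner<[u8; 3]>> = Box::new(Inner::new([1, 2, 3]));
    let unsized_box: Box<Inner<[u8]>> = sized;
    unsized_box.data.len()
}
```

For payloads whose alignment does not exceed that of usize, the offset is just the header size; for over-aligned payloads it is rounded up to the payload's alignment. Either way it is cheap to compute and can be folded into any later field offset.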

wyang5 · March 6, 2024, 10:30pm

I thought we could just wait for the weak pointers to be dropped, but I just realized that makes no sense as it brings back the problem of reference cycles.
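The cycle problem can be sketched with a value that holds a Weak to itself, built with Arc::new_cyclic. The Node type and function name below are made up for illustration. If the value were only dropped once all Weak pointers were gone, this node could never be dropped, because the Weak it is waiting on lives inside the node.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Weak};

/// Hypothetical node that keeps a weak handle to itself.
#[allow(dead_code)]
struct Node {
    me: Weak<Node>,
    dropped: Arc<AtomicBool>,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.dropped.store(true, Ordering::SeqCst);
    }
}

/// Returns (weak count before the drop, whether the destructor ran).
fn self_weak_node_is_dropped() -> (usize, bool) {
    let dropped = Arc::new(AtomicBool::new(false));
    let node = Arc::new_cyclic(|me| Node {
        me: me.clone(),
        dropped: Arc::clone(&dropped),
    });
    let weak_before = Arc::weak_count(&node); // the Weak stored in `me`

    // Dropping the only strong reference drops the Node, which in turn
    // drops the Weak it contains, so the allocation can then be freed.
    drop(node);

    (weak_before, dropped.load(Ordering::SeqCst))
}
```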

It can be dynamically sized

Ah, totally forgot about that.

Thanks!

CAD97 · March 6, 2024, 11:19pm

Furthermore, consider that even if Arc stores a pointer to data instead of a pointer to the data block, offsets from one to the other are still required, just on clone and drop instead of {into|from}_raw and deref.
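For reference, a minimal sketch of the {into|from}_raw round trip using only standard library calls (the function name is hypothetical). The raw pointer points at the data itself, and from_raw is what has to step back to the counters.

```rust
use std::sync::Arc;

/// Returns (raw pointer equals Arc::as_ptr, recovered value, strong count).
fn raw_round_trip() -> (bool, String, usize) {
    let a = Arc::new(String::from("hello"));
    let data_ptr: *const String = Arc::as_ptr(&a);

    // into_raw hands out a pointer to the data, not to the counters.
    let raw: *const String = Arc::into_raw(a);
    let same = std::ptr::eq(raw, data_ptr);

    // SAFETY: `raw` came from Arc::into_raw and is converted back exactly once.
    let back = unsafe { Arc::from_raw(raw) };
    (same, (*back).clone(), Arc::strong_count(&back))
}
```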

Also, fwiw, C++ shared_ptr often isn't Arc<*mut T>, but more like (Arc<()>, *mut T). The control block is allocated separately specifically because you can make a shared_ptr from a new T. But if you directly create the shared_ptr (e.g. with make_shared), it does use an inline allocation like (Arc<T>, *mut T); that this remains somewhat reasonable is because free and delete[] mandate the capability to deallocate without knowing the size.
