Create vector of AtomicUsize, etc

vec![AtomicUsize::new(0); word_count]

is not allowed, because AtomicUsize does not implement Clone. What's the right way to create a Vec of AtomicUsize?

Also, what's the proper way to find out how many bits are in an AtomicUsize, regardless of platform? And is there a way to get the largest atomic unsigned integer type that is fast on the current platform?

(Use case: allocator for slots in a large Vulkan descriptor array.)

You can use Vec::resize_with() like this:

let mut v: Vec<AtomicUsize> = Vec::new();
v.resize_with(count, || AtomicUsize::new(0));
// or alternatively, AtomicUsize implements `Default`, so you can do:
v.resize_with(count, Default::default);

AtomicUsize has exactly the same size as usize, so you can use usize::BITS.
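
For both parts of the question, a minimal sketch (the Word alias and the printouts are just for illustration): usize::BITS gives the width, and the target_has_atomic cfg tells you whether the target has native, lock-free atomics of a given width, which is a reasonable proxy for "works fast on this platform":

// AtomicUsize is guaranteed to have the same size and bit validity as usize,
// so usize::BITS answers the "how many bits" question.
const WORD_BITS: u32 = usize::BITS;

// `target_has_atomic = "64"` is set when the target has native (lock-free)
// 64-bit atomics; fall back to AtomicUsize otherwise. `Word` is a
// hypothetical alias for this example, not a std item.
#[cfg(target_has_atomic = "64")]
type Word = std::sync::atomic::AtomicU64;
#[cfg(not(target_has_atomic = "64"))]
type Word = std::sync::atomic::AtomicUsize;

fn main() {
    println!("AtomicUsize is {WORD_BITS} bits wide");
    println!("Word is {} bytes", std::mem::size_of::<Word>());
}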


Since AtomicUsize::new is a const fn, we're in this funny situation where arrays do work[1]:

[const { AtomicUsize::new(0) }; 5]

I thought Vec would try to clone it and fail, but it actually errors in the macro matcher, which means it should be possible to make this work without breaking anything:

vec![const { AtomicUsize::new(0) }; 5];

error: no rules expected the token `const`
 --> src/main.rs:4:10
  |
4 |     vec![const { AtomicUsize::new(0) }; 5];
  |          ^^^^^ no rules expected this token in macro call
  |
  = note: while trying to match end of macro

  1. assuming the length is const, which means this probably doesn't help you
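
When the length is a compile-time constant, though, the array form is directly usable, for example as a static bitmap (a minimal sketch; the names and length are illustrative):

use std::sync::atomic::{AtomicUsize, Ordering};

// Works because both the repeat operand (an inline const block) and the
// length are const.
const WORDS: usize = 4;
static SLOT_BITS: [AtomicUsize; WORDS] = [const { AtomicUsize::new(0) }; WORDS];

fn main() {
    // Mark slot 3 as allocated.
    SLOT_BITS[0].fetch_or(1 << 3, Ordering::Relaxed);
    println!("{:b}", SLOT_BITS[0].load(Ordering::Relaxed));
}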


You can also create an iterator and collect it:

use std::iter::repeat_with;
use std::sync::atomic::AtomicUsize;

fn main() {
    let items: Vec<_> = repeat_with(|| AtomicUsize::new(0)).take(5).collect();
    
    for item in items {
        println!("{item:?}");
    }
}

This function compiles down to a single call to __rust_alloc_zeroed, so it is more efficient than anything else proposed in this thread.

pub fn zeroed_atomic(n: usize) -> Vec<AtomicUsize> {
    vec![0usize; n].into_iter().map(AtomicUsize::new).collect()
}

view on godbolt
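
As a usage sketch for the descriptor-slot use case from the question (the slot count and the bit fiddling below are illustrative, not from the original post):

use std::sync::atomic::{AtomicUsize, Ordering};

// Same function as above, repeated so the snippet is self-contained.
pub fn zeroed_atomic(n: usize) -> Vec<AtomicUsize> {
    vec![0usize; n].into_iter().map(AtomicUsize::new).collect()
}

fn main() {
    // One bit per descriptor slot, so divide the slot count by the word width.
    let slot_count = 1 << 20;
    let bitmap = zeroed_atomic(slot_count / usize::BITS as usize);

    // Claim slot 42: set its bit and check that it wasn't already taken.
    let word = 42 / usize::BITS as usize;
    let bit = 42 % usize::BITS as usize;
    let prev = bitmap[word].fetch_or(1 << bit, Ordering::Relaxed);
    assert_eq!(prev & (1 << bit), 0, "slot already taken");
}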


There's a proposal to add support for that: https://github.com/rust-lang/libs-team/issues/484


I'd like to understand why this is considered faster than my approach.

My assembly is a bit rusty (pun intended).

Your approach calls __rust_alloc to create uninitialized memory and then uses memset to set all of the bytes to zero. My approach uses __rust_alloc_zeroed to ask for zeroed memory directly. The latter is often faster because the allocator can skip the zeroing when it knows the memory it hands back is already full of zeros (for example, fresh pages from the OS).


I believe this is true only for largish allocations (128 KiB in glibc; not sure about jemalloc), where the allocator requests pages directly from the OS, which in turn maps a single page of zeros copy-on-write over the entire block.

That could be detrimental in this case: the first write to each page of the block triggers a page fault, which can introduce jitter, and jitter is presumably not what you want when you're reaching for atomics.
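
If that first-touch jitter matters, a common mitigation (just a sketch under the assumption of 4 KiB pages, not something from this thread) is to touch each page once during setup, so the faults are paid before the allocator is on a hot path:

use std::sync::atomic::{AtomicUsize, Ordering};

fn prefault(slots: &[AtomicUsize]) {
    // Assume 4 KiB pages; one write per page is enough to make the kernel
    // back the copy-on-write zero page with real memory up front.
    let words_per_page = 4096 / std::mem::size_of::<AtomicUsize>();
    for word in slots.iter().step_by(words_per_page.max(1)) {
        // An atomic RMW counts as a write, and adding 0 leaves the value unchanged.
        word.fetch_add(0, Ordering::Relaxed);
    }
}

fn main() {
    let slots: Vec<AtomicUsize> = (0..4096).map(|_| AtomicUsize::new(0)).collect();
    prefault(&slots);
}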