How to access a vector from multiple threads without a mutex?

How can I access a vector from multiple threads without a mutex? I know what I am doing and I don't have any meaningful race conditions.
For example, here is my C code using OpenCilk for multithreading:

void graph_trim(graph g) {
  bool *has_in = calloc(g->v, sizeof(bool));
  if (!has_in) {
    fprintf(stderr, "Couldn't allocate has_in array\n");
    exit(EXIT_FAILURE);
  }

  bool *has_out = calloc(g->v, sizeof(bool));
  if (!has_out) {
    fprintf(stderr, "Couldn't allocate has_out array\n");
    exit(EXIT_FAILURE);
  }

  // run for in parallel
  cilk_for(size_t v = 0; v < g->v; v++) {
    for (size_t j = g->in->com[v]; j < g->in->com[v + 1]; j++) {
      if (!g->removed[v] && g->in->unc[j] != v) {
        has_in[v] = true;
        has_out[g->in->unc[j]] = true;
      }
    }
  }

  cilk_for(size_t v = 0; v < g->v; v++) {
    if (!has_in[v] || !has_out[v]) {
      g->removed[v] = true;
      *(g->n_trimmed) += 1;
    }
  }

  free(has_in);
  free(has_out);
}

If you merely want to run some code in parallel on every element of an array/vector/slice, you can use Rayon's parallel iterators.


That could be a race condition depending on what values g->in->unc[j] takes on.

The increment is definitely a race condition.

In general, prefer iteration to indexing, and see if you can get this to work with rayon's par_iter_mut.

Incidentally, anything involving pointer-heavy graph-based data structures is a painful way to attempt learning Rust.


For every "I know what I am doing", there's a corresponding "well, actually". :grin:


The first race you mentioned is meaningless (as I said, "I don't have any meaningful races"), because I don't care which thread turns has_out[g->in->unc[j]] to true.

For the second race there is actually no race condition because I am using OpenCilk's reducer there.

The full context of this code can be found here.

That doesn't matter. Race conditions are undefined behavior (in both languages), regardless of whether you care about them. (I.e., if you have a race condition, the compiler has the right to turn your program into code that launches the nukes, etc.)


You can use atomics to avoid Mutex without UB for simple things like booleans.


So you are telling me that there is a case where this program prints i=0 instead of i=1?

#include <pthread.h>
#include <stdio.h>

int i = 0;

void *thread_function(void *arg) {
  i = 1;
  return NULL;
}
int main() {
  pthread_t thread1, thread2;

  pthread_create(&thread1, NULL, thread_function, NULL);
  pthread_create(&thread2, NULL, thread_function, NULL);

  pthread_join(thread1, NULL);
  pthread_join(thread2, NULL);

  printf("i = %d\n", i);

  return 0;
}
Yes, that'd be a conforming outcome (and so would be literally anything else). Given that the assignment i = 1 in thread_function is racy, the compiler can assume it never happens as-is (without synchronization, e.g. as in your code). So it might as well remove it altogether.


I will be glad if you can tell me what optimization flags to add to clang to get this code to not print i=1.

You probably can't get clang to optimize that program to not print i = 1, but the point is that if clang did so, then it would not be a bug in clang.

The correct way to do that kind of thing is to use relaxed atomic operations instead. Those compile down to exactly the same instructions as what you're currently doing, and it would not be a data race.


"Technically UB but I know it compiles ok [asterisk today on my machines with my compiler]" is a stance which is culturally rejected across practically all of the Rust community. UB is defined at the language level and UB (and unsoundness, where UB is possible to trigger) is to be avoided.

Moreover, if you have a Rust program that does not contain unsafe but contains UB (even if "just" technically), that means

  • One of your dependencies has unsound unsafe, and it's their fault, or
  • The compiler has a bug, and it's the compiler's fault

As a corollary, there is no way to have "harmless UB" (like a "meaningless data race") in safe Rust.

(And no one's going to want to "help" you do so with unsafe, due to the cultural rejection of "if it compiled ok this time we're good". [1] We'd rather help you write sound code.)

  1. Also there's a lot less people knowledgeable of how to achieve such things reliably, because almost no one is interested in doing so. ↩︎


When I'm using a lock, I guess I'm locking the entire vector. Could I lock a specific index of the vector, so that two threads wait only if they try to write to the same place?

use std::sync::{Arc, Mutex};
use std::thread;

fn main() -> std::io::Result<()> {
    let max_threads: usize = thread::available_parallelism()?.into();
    let has_in: Arc<Mutex<Vec<u8>>> = Arc::new(Mutex::new(vec![0; max_threads]));

    let mut threads: Vec<thread::JoinHandle<()>> = vec![];
    for i in 0..max_threads {
        let has_in = Arc::clone(&has_in);

        threads.push(thread::spawn(move || {
            let mut has_in = has_in.lock().unwrap();
            has_in[i] = 1;
            println!("{:?}", has_in);
        }));
    }

    for thread in threads {
        thread.join().unwrap();
    }

    Ok(())
}

Benign Data Races: What Could Possibly Go Wrong? | Intel® Software


If you don't want to use rayon, you can use thread::scope here to avoid the locking, and I'd chunk up the work instead of operating on individual u8.

With the version where you send the Arc, it works effectively serially due to the locking, which was probably your point.

Atomics for has_out still looks like a better fit to me for your OP.


Why do I have to drop the lock at the end? Isn't this done automatically?

It's a best-practice thing (or at least a habit of mine to be explicit about the scope of locks). The lock does drop automatically (like any other variable) at the end of the function, and in this case it didn't really matter. But in a more complicated function it can matter and/or be less obvious, e.g. as described in this issue or the last example in the documentation.

Note that you don't really need the Arc+Mutex for that. In fact you're just wrapping in a Mutex, then immediately calling lock to get a reference which you'll hold until the end of the program.


Is it ok to have Vec<AtomicBool>?

Yes, it is. This is more in line with what other commenters have been suggesting.
