Parallel increments to an array

What is the most efficient way to atomically increment cells of an array from many threads under heavy concurrency, given this hypothetical use case:

Imagine a directed graph with 2^32 nodes. Node zero gets "infected". Every node with an incoming edge from an infected node becomes infected too. For each node, you want to count the incoming edges that originate from infected nodes. A possible algorithm is to keep a list of nodes to visit and, on each visit, add the node's outgoing neighbors to the list if they have not already been visited. Bottom line: many threads are incrementing cells in an array. You can assume that one array of 2^32 Vecs stores the outgoing edges and another stores the incoming edges, but feel free to imagine a more efficient storage layout if you so choose.

If possible, it would be great if you could say which Rust keywords/functions to use.
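
To make the setup concrete, here is a minimal sketch of the algorithm described above, not a tuned implementation. It assumes the counters fit in a `Vec<AtomicU32>` (at 2^32 nodes that alone is 16 GiB), that `outgoing[u]` holds the targets of node `u`'s outgoing edges, and that the visit list is processed one level at a time. The names `infected_in_degree`, `outgoing`, and `workers` are made up for illustration. It uses `AtomicU32::fetch_add` for the increments, `AtomicBool::swap` to claim a node exactly once, and `std::thread::scope` so the threads can borrow the arrays.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::thread;

/// For every node, counts the incoming edges that originate from infected nodes.
/// Node 0 is the initial infection. `outgoing` and `workers` are hypothetical names.
fn infected_in_degree(outgoing: &[Vec<u32>], workers: usize) -> Vec<u32> {
    let n = outgoing.len();
    if n == 0 {
        return Vec::new();
    }
    let counts: Vec<AtomicU32> = (0..n).map(|_| AtomicU32::new(0)).collect();
    let visited: Vec<AtomicBool> = (0..n).map(|_| AtomicBool::new(false)).collect();
    visited[0].store(true, Ordering::Relaxed);
    let mut frontier: Vec<u32> = vec![0];

    while !frontier.is_empty() {
        // Split the current level into roughly equal chunks, one per worker.
        let w = workers.max(1);
        let chunk_len = (frontier.len() + w - 1) / w;
        let next: Vec<u32> = thread::scope(|s| {
            let handles: Vec<_> = frontier
                .chunks(chunk_len)
                .map(|chunk| {
                    let counts = &counts;
                    let visited = &visited;
                    s.spawn(move || {
                        let mut local = Vec::new();
                        for &u in chunk {
                            for &v in &outgoing[u as usize] {
                                // The increment itself: one atomic read-modify-write.
                                counts[v as usize].fetch_add(1, Ordering::Relaxed);
                                // swap returns the old value, so only one thread
                                // ever sees `false` and queues the node.
                                if !visited[v as usize].swap(true, Ordering::Relaxed) {
                                    local.push(v);
                                }
                            }
                        }
                        local
                    })
                })
                .collect();
            // Joining the threads makes all their writes visible here.
            handles
                .into_iter()
                .flat_map(|h| h.join().unwrap())
                .collect::<Vec<u32>>()
        });
        frontier = next;
    }
    counts.into_iter().map(AtomicU32::into_inner).collect()
}
```

`Ordering::Relaxed` should be enough in this sketch because the counters are only read after the threads have been joined. For the graph 0->1, 0->2, 1->2, 2->0, 3->2, the call `infected_in_degree(&graph, 4)` gives `[1, 1, 2, 0]`: node 3 is never infected, so its edge into node 2 is not counted.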

Why do you want to calculate a count? Are you printing it? Are you trying to get such information on every iteration or just the final state? Can you give a small example graph and its output?