Need help porting a wait free trie from C to Rust

muaddib · November 12, 2022, 12:06pm

For some HPC computing I need to port a wait free trie (but you can think of it as a tree) from C to Rust. The trie is 99% read 1% write, and is used both in parallel-concurrent scenarios , and concurrent only scenarios (single thread with multiple coroutines). The size of the tree is usually between 100kb and 8 mb.

Basically the tree is like:

pub struct WFTrie {
    nodes: Vec<Node>,
    leaves: Vec<Leaf>,
    updates: Vec<Update>,
    update_pos: AtomicUsize,
    update_cap: usize,
}

//.... 
let mut wftrie = WFTrie::new(); 
let wftrie_ptr = AtomicPtr::new(&mut wftrie);

//....

As you can see the trie already uses something very similar to the arena approach that many suggests, by using a vec and storing

When I want to update, I do a fetch_and_add on update_pos. If it's greater than update_cap I return an error (size exhausted), otherwise I'm sure sure my coroutine/thread has exclusive access to updates[update_pos], where I can put my update
Every X updates (if update_pos % 8 == 0 { //update tree} ) a coroutine will clone the tree, apply all pending updates, and then compare_and_swap the wftrie_ptr
When I want to read I do an atomic load on wtftrie_ptr and I can access the tree, taking care of considering also the pending updates.

My questions are:

If I have multiple coroutines having an immutable reference, how can I have one coroutine doing updates? What's the most idiomatic way to translate this to rust?
If a coroutine is still holding a ref to the old tree during an update what happens? Should I replace the AtomicPtr with Arc?
Does this design fit well with the rust borrow checker or am I going to have to use unsafe?

In the case of concurrent only scenario, can I drop atomics without using unsafe?

In the case of concurrent only scenario, can I drop atomics without using unsafe?

Michael-F-Bryan · November 12, 2022, 12:41pm

This doesn't directly answer your question, but you might be interested in watching a stream from Jon Gjengset on porting Java's concurrent hashmap to Rust.

Doing this sort of concurrent programming is tricky to implement correctly, and he goes through some of the unique challenges you'll encounter when it comes to memory management and atomics.

The stream is pretty long (almost 6 hours!), but it shows the process of reasoning about "what is idiomatic?" or getting stuck and figuring things out in all its unedited glory, which I believe invaluable, especially for someone wanting to become an advanced Rustacean.

His Why are my videos so damn long‽ explainer (only 3 minutes) talks about why the streams are structured that way, and he even recommends watching at higher speeds or only watching in smaller chunks until you get bored/lose attention.

muaddib · November 12, 2022, 12:45pm

thank you, I put that in my youtube queue

VorfeedCanal · November 12, 2022, 1:22pm

If you have immutable references, then updates are impossible. Not allowed. Full stop. If you would find a way in safe code — report to developers. It's bug to be fixed.

Pointers and unsafe, obviously.

Anything may happen. But sooner or later you would end up with hard to debug bug.

This may or may not a a solution, but usually not.

It's not even the problem of borrow checker. You are going “under the Rust memory model”. That's impossible in safe Rust. Don't even dream about doing it without unsafe.

It's possible to provide safe API for all that complexity, though.

Possible but unlikely. More precise answer: unsafe is the only way in Rust to temporary break then restore invariants thus you would need some unsafe code.

It doesn't mean that you, personally, need to write such code. You may find it in some crate written by others.

2e71828 · November 12, 2022, 3:20pm

Your write-ahead log needs to be held inside some kind of Cell so that the compiler knows it might change while a piece of code holds a shared reference, and then you use a shared reference to add items to it. You can likely find a crate that implements this sort of append-only shared list efficiently.

You probably want to use something like arc_swap for this, which will let readers keep the old tree alive until they come to a point in the calculation where an update won’t cause problems. Because of the shared update queue, you need to keep in mind that these readers may enqueue updates into the old structure after it’s been discarded by the rest of the program, and you’ll need a strategy to make sure these late updates don’t get lost.

system · February 10, 2023, 3:21pm

This topic was automatically closed 90 days after the last reply. We invite you to open a new topic if you have further questions or comments.

Topic		Replies	Views
Go down a tree without recursion help	3	1146	January 12, 2023
Converting thread-safe lock free queue from C++	4	3830	January 12, 2023
Is it possible to safely build a read-only, thread-safe, bidirectional tree?	15	1745	March 17, 2021
Implement static's trie-tree? help	3	700	July 23, 2022
Graph and multithread help	5	1083	January 12, 2023

Need help porting a wait free trie from C to Rust

Related Topics