Switch from mpsc to broadcast-channel or use arc+rwlock to share data?

I'm unsure which way to go:
I've got this data_sender.loop that owns the receiving end of an mpsc channel and caches some of the received data in a HashMap.

Now I'm adding another thread that handles http requests and should respond with data stored in that HashMap. I'm unsure whether I should wrap that HashMap in an Arc to achieve this, or alternatively switch my mpsc channel to a crossbeam one and have the http_server handler store its own copy of the data.

I guess the second option would be faster because no locking occurs, but will require more memory.

Does the crossbeam channel work the way I think it does, so that every receiver gets every message or does only the first receiver trying to read get it? (the second option would be useless for my situation and force me to go the Arc route instead..)

It's the latter. It is not a broadcast channel.

Note that cloning only creates a new handle to the same sending or receiving side. It does not create a separate stream of messages in any way:

See the example under that paragraph here:
https://docs.rs/crossbeam/latest/crossbeam/channel/index.html

Of course you can send the data twice yourself using two channels.
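A minimal sketch of that "send it twice" idea, using only std::sync::mpsc so it runs without extra crates. The helper names `fan_out` and `collect_all` and the `(String, u32)` message type are made up for illustration; they are not from the original code.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread;

/// Sends a clone of `value` to every sender in `outputs`.
/// Returns how many outputs were still connected.
fn fan_out<T: Clone>(outputs: &[Sender<T>], value: T) -> usize {
    outputs
        .iter()
        .filter(|tx| tx.send(value.clone()).is_ok())
        .count()
}

/// Drains a receiver until every sender has been dropped.
fn collect_all<T>(rx: Receiver<T>) -> Vec<T> {
    rx.iter().collect()
}

fn main() {
    // One channel per consumer (hypothetical: the cache loop and the http thread).
    let (cache_tx, cache_rx) = mpsc::channel::<(String, u32)>();
    let (http_tx, http_rx) = mpsc::channel::<(String, u32)>();

    let producer = thread::spawn(move || {
        let outputs = [cache_tx, http_tx];
        for (key, value) in [("a", 1), ("b", 2)] {
            fan_out(&outputs, (key.to_string(), value));
        }
        // `outputs` is dropped here, which closes both channels.
    });
    producer.join().unwrap();

    let seen_by_cache = collect_all(cache_rx);
    let seen_by_http = collect_all(http_rx);
    assert_eq!(seen_by_cache, seen_by_http);
    println!("both consumers saw {} messages", seen_by_cache.len());
}
```

The cost is one clone per extra consumer, which is the "more memory" trade-off mentioned in the question.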

If you don't send it twice, you'll have to use an Arc<Mutex<...>> not just an Arc. Or an Arc<RwLock<...>>.

And then there is no point in sending a message just to store an entry in a HashMap, if that's the only purpose of the message. You might as well store it directly and avoid the message.
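Roughly what the shared-map route could look like, as a sketch with std only. The `SharedCache` alias, the `store` and `lookup` helpers, and the `String` to `u64` value type are assumptions for the example, not the poster's actual types.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical alias: the real key and value types depend on the application.
type SharedCache = Arc<RwLock<HashMap<String, u64>>>;

/// Writer side: insert directly into the shared map, no message needed.
fn store(cache: &SharedCache, key: &str, value: u64) {
    let mut map = cache.write().unwrap();
    map.insert(key.to_string(), value);
    // The write guard is dropped when `map` goes out of scope here.
}

/// Reader side: copy the value out so the read lock is held only briefly.
fn lookup(cache: &SharedCache, key: &str) -> Option<u64> {
    cache.read().unwrap().get(key).copied()
}

fn main() {
    let cache: SharedCache = Arc::new(RwLock::new(HashMap::new()));

    // Cloning the Arc gives the writer thread its own handle to the same map.
    let writer_cache = Arc::clone(&cache);
    let writer = thread::spawn(move || {
        for i in 0..3 {
            store(&writer_cache, "latest", i);
        }
    });
    writer.join().unwrap();

    println!("latest = {:?}", lookup(&cache, "latest"));
}
```

Both helpers release the lock before returning, so neither side can hold it across slow work such as writing a response to a client.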

The main thing to watch out for with async is lots of contention on the Mutex or RwLock. If that blocks a lot, the async runtime in that thread will block as well, and that should be avoided as you probably know.

Oh wow, that was a fast response. I was reading up on it after creating my post and stumbled upon an alternative that does support multiple senders and receivers where each receiver gets all messages: multiqueue (schets/multiqueue on GitHub), a fast mpmc queue with broadcast capabilities.

I think I prefer this option to the Arc<RwLock> path, because I don't want slow or numerous clients of my http_server component to keep the data locked and prevent the data_sender from doing its thing.

PS: I actually wrote Arc<RwLock> in my first post too, but when you don't escape the < and > characters, they and everything between them don't show up in the resulting post.

Tokio also has a broadcast channel if you're interested. And there is async-broadcast, and its doc has a comparison of different broadcast channels under "Difference with other broadcast crates".

One final note: If the contention is not very high on a shared HashMap, that's usually a good option since it is simple and efficient because normally the HashMap is locked for a very short time. It seems to be commonly done, but the decision to use it does depend on the frequency of the HashMap accesses.

Thanks for all your input! I've decided to go with an Arc<RwLock> in the end and cache the generated http output string on the http_server thread and only regenerate + re-access the RwLock for requests that arrive more than a second later.
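For anyone finding this later, here is a rough sketch of that caching idea with std only. The `CachedResponse` struct, its `get` method, and the "key=value" output format are invented for the example; the real handler and output will look different. The current time is passed in as a parameter so the behaviour can be checked without sleeping.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::time::{Duration, Instant};

/// Hypothetical per-thread cache of the rendered http body.
struct CachedResponse {
    source: Arc<RwLock<HashMap<String, u64>>>,
    max_age: Duration,
    cached: Option<(Instant, String)>,
}

impl CachedResponse {
    fn new(source: Arc<RwLock<HashMap<String, u64>>>, max_age: Duration) -> Self {
        CachedResponse { source, max_age, cached: None }
    }

    /// Returns the cached body, regenerating it only when it is older than `max_age`.
    fn get(&mut self, now: Instant) -> &str {
        let stale = match &self.cached {
            Some((created, _)) => now.duration_since(*created) > self.max_age,
            None => true,
        };
        if stale {
            // The read lock is held only inside this block.
            let body = {
                let map = self.source.read().unwrap();
                let mut entries: Vec<String> =
                    map.iter().map(|(k, v)| format!("{}={}", k, v)).collect();
                entries.sort();
                entries.join("\n")
            };
            self.cached = Some((now, body));
        }
        &self.cached.as_ref().unwrap().1
    }
}

fn main() {
    let source: Arc<RwLock<HashMap<String, u64>>> = Arc::new(RwLock::new(HashMap::new()));
    source.write().unwrap().insert("temp".to_string(), 21);

    let mut response = CachedResponse::new(Arc::clone(&source), Duration::from_secs(1));
    let start = Instant::now();
    println!("{}", response.get(start));

    // An update within the first second is not visible yet: the cached string is reused.
    source.write().unwrap().insert("temp".to_string(), 22);
    println!("{}", response.get(start + Duration::from_millis(500)));

    // After more than a second the body is regenerated from the shared map.
    println!("{}", response.get(start + Duration::from_secs(2)));
}
```

With this shape the RwLock is read at most about once per second per http thread, no matter how many requests arrive, so the writer is rarely kept waiting.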
