DashMap Vs. HashMap

If you were going to store

Arc<RwLock<HashMap<String, Thing>>>

And Things are immutable, would you go for DashMap instead? Why - since a RwLock wrapping a HashMap is easy to deal with?

I'm just looking for opinions. Maybe there is something more ergonomic about DashMap usage but I'm not really seeing it.

(On a related note, I'm not really finding a viable lock-free Map implementation - it seems lockfree is not maintained.)

You would use Arc<DashMap<String, Thing>> since DashMap uses RwLock internally:

DashMap tries to be very simple to use and to be a direct replacement for RwLock<HashMap<K, V>>.

In other words, DashMap has its own interior mutability so you don't need to wrap it with a lock of any kind.

I understand that. (I said, "...since a RwLock wrapping a HashMap is easy to deal with...")

My point is that wrapping in a RwLock is trivial. So trivial I'm not sure I want to add another dependency to the system just to avoid that. My question is, are there other benefits to DashMap (better performance?) I'm not aware of.

The point of DashMap is that it is sharded. There is one RwLock per shard, so lock contention is reduced.

Or are you saying the entire map is immutable? If so, you can just use Arc<HashMap<...>>.

4 Likes

I see. So that is indeed an impl detail that is going to improve performance (in a high-contention environment anyway).

I think I'll use it

(No, the Map is not immutable - the values are.)

1 Like

Take note of all of the "Locking behaviour" warnings in the dashmap docs about deadlocks. Its API is prone to caller errors that can make it very difficult to use in complex situations. If you run into trouble, scc::HashMap is a good alternative. (It still has some deadlock warnings, but much less.)

If you need a deadlock-free concurrent map, contrie and flurry are decent.

Also, here are benchmarks for various concurrent map implementations: xacrimon/conc-map-bench

8 Likes
pub fn insert(&self, key: K, value: V) -> Option<V>
Inserts a key and a value into the map. Returns the old value associated with the key if there was one.

Locking behaviour: May deadlock if called when holding any sort of reference into the map.

!!!

Really?

Who can tolerate a weakness such as this? Not me.

Thank you for that warning.

1 Like

In many cases you only access one element at a time, so this isn't a problem. But yes if you access multiple items at a time (through references, not copied) then it is a blocker.

1 Like

I suppose so.

(My basic reference is good 'ole Java ConcurrentHashMap which is pretty hard to break.)

scc looks very good but does describe itself as "optimized for highly parallel write-heavy workloads." Very much not our use-case.

This is a trading system, and it has to be 100% correct. I think the "stock" HashMap with a RwLock will do just fine.

(As usual, though, this forum is a great resource.)

After the nth time of encountering a deadlock with dashmap I made this repo for finding some deadlocks by forking dashmap and removing the send and sync traits from a few data structures.

There's an open issue discussing some of these issues: Ways to debug deadlocks · Issue #243 · xacrimon/dashmap · GitHub

If you end up using dashmap I recommend reading the problems others have encountered with it: Beware of the DashMap deadlock - DEV Community

Most of the problems can be avoided by just being aware them and of the implication of the comment:

May deadlock if called when holding any sort of reference into the map.

This means any type ending with *Ref in that crate. Never hold any of those structs over an await point, as that will eventually lead to a deadlock.

4 Likes

This is generally good advice with any container ref type! Any wait, not just await, either, thread joins, socket io, spin locks..., so long as isn't provably going to be released by some internal mechanism that doesn't depend on the rest of the world.

That warning should really be clearer, it implies that there's a chance that you'll insta-deadlock if there happened to be any other ref held at the same time, rather than a more general it may lock waiting for some other ref to free, which you can turn into a deadlock.

2 Likes

Ideally the API could be designed to only allow accessing a single map entry at a time. Even a runtime check for that (only one guard allowed per thread at a time) would be better than a deadlock.

1 Like

AFAIK your sound options are to use a handle or to use a callback mechanism. People tend to not like using the latter, so :person_shrugging:

1 Like

You can also use values of Arc<T> in the map and inexpensive clones instead of holding references. I don't consider the design a showstopper. It just doesn't gracefully handle naive uses.

1 Like

This is no different than using a RwLock, just more implicit. You'll also end up in a deadlock if you try to mutably lock the RwLock (e.g. for inserting) while holding any reference into it.

Ultimately DashMap is just a fancy Box<[RwLock<HashMap<...>>; N]> with all its disadvantages.

1 Like

Do you mean from the same thread?

I guess you must.

The classic deadlock situation, of course, is thread 1 holds A and wants B and thread 2 holds B and wants A.

If you're talking about a single thread holding a RwLock reference and then blocking on a mutable lock, I agree that's just simple operator error.

What you may be missing is that the RwLocks are not visible to the user of the API. There is one per shard and there is no way to know whether two keys are in the same shard or not.

So within a single thread, if you get a ref to key X and, while this ref to X is still active, get a ref to key Y, that will self-deadlock if:

  • X and Y happen to be in the same shard
  • either X ref or Y ref is a mutable ref

This is poorly documented. It isn't discussed up front or in any general way that I can see. You have to read the Locking Behavior note for each method that accesses a key, for example get:

Locking behaviour: May deadlock if called when holding a mutable reference into the map.

But to understand it generally you really have to understand that each shard has a separate RwLock. That's not a complex concept, just poorly documented.

2 Likes

No, I was not missing that. I had the same thought with regard to DashMap: the real danger is due to the fact that the implementation is hidden.

I was just speaking about the general RwLock deadlock possibility.

1 Like

It doesn't matter. It will have the same result, both from the same or a different thread, and with both RwLock and DashMap.

It's not clear to me what "it" means in when you say "It will have the same result". There were two different things (deadlocks due to reversed lock order, and self-deadlock) that they mentioned.