Read-optimized HashMap<K, V>

  1. I have a HashMap that I need to refer to from multiple places, so it’s a Rc<HashMap<K, V>>.

  2. Most of the time, it’s for reading. Occasionally, it’s for writing, so now it’s a Rc<RefCell<HashMap<K, V>>>

  3. I can guarantee that this HashMap is only ever accessed from a single thread (entire app is wasm32).

  4. > 99.9% of the accesses will be reads, with < 0.1% being writes.

  5. Given the ‘read-dominant’ nature, is Rc<RefCell<HashMap<K, V>>> the right solution, or is some other structure better?

May be UnsafeCell may little fasfter, but it would be inconvenient to use
and many usafe required.

The much more gain you get via use instead of HashMap,
if you don’t care about DDos and similar issues.

Never ever use UnsafeCell directly, it will almost always run into UB.

1 Like

This general assumption about wasm won’t hold forever, though you can still make that decision for your app and Send and Sync constraints will keep you from doing anything wrong. For an example of upcoming wasm threading, see this parallel raytracer.

As long as it’s not exactly 100% reads, it still has to do synchronization somehow.

If the values are Copy, wrap them in a Cell and you won’t need RefCell to mutate them in place (but you need it if you want to add new keys to the hashmap).

You may amortize RefCell cost by trying to use borrowed &HashMap (or Ref<HashMap>) for as large sections of your program as possible.

If your program can be divided into phases where you read-write and then only read the hashmap, you can try unwrapping it from Rc/RefCell before the read-only phase.

  1. I agree there has to be some cost.

  2. No, I don’t see a way to divide into stages. The XY problem here is I’m implementing a “global env” (which I need to do reads for symbol lookup, and writes when a new var/function is defined).

  3. What I am wondering is if there is some structure that is read-optimized, where the read route is made fast (at the cost of making writes slower).