HashMap key lookup vs. scanning a Vec for a matching member

I want to find a row in a Vec where one of its members matches a string.

I'm not sure which is better for performance:

  1. Build a HashMap index, then simply get the key
  2. Create a Vec, then iterate over its records until a member matches with ==, then break

This is all in the context of string comparison.
What do you think?

I suppose a Vec is better because it does not require any hashing when it is built. But I'm not sure how the == operator works; maybe it does the same thing.

If you're searching once, then building any index will take longer than scanning the Vec.

If you're searching many times, then HashMap will likely be faster.
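To make the two options concrete, here is a minimal sketch. The `Row` type and its field names are made up for illustration; the question doesn't say what the real rows look like.

```rust
use std::collections::HashMap;

/// Hypothetical row type; the field names are assumptions for illustration.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    id: String,
    name: String,
}

/// Option 2 from the question: scan the Vec and stop at the first match.
fn find_by_scan<'a>(rows: &'a [Row], id: &str) -> Option<&'a Row> {
    rows.iter().find(|row| row.id == id)
}

/// Option 1 from the question: build an index once, mapping id -> position in the Vec.
/// If two rows share an id, the later one wins.
fn build_index(rows: &[Row]) -> HashMap<&str, usize> {
    rows.iter()
        .enumerate()
        .map(|(pos, row)| (row.id.as_str(), pos))
        .collect()
}

/// Looks the id up in the prebuilt index, then fetches the row by position.
fn find_by_index<'a>(
    rows: &'a [Row],
    index: &HashMap<&str, usize>,
    id: &str,
) -> Option<&'a Row> {
    index.get(id).map(|&pos| &rows[pos])
}
```

The scan pays nothing up front but walks the Vec on every search; the index hashes every id once and then each lookup hashes only the query.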

String equality is fast in Rust (compares lengths and bytes directly).
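Conceptually, == on strings behaves like the function below. This is only a sketch of the idea (`str_eq_sketch` is a made-up name), not the standard library's actual code:

```rust
/// Roughly what `a == b` does for two `&str` values:
/// a cheap length check first, then a byte-wise comparison.
fn str_eq_sketch(a: &str, b: &str) -> bool {
    a.len() == b.len() && a.as_bytes() == b.as_bytes()
}
```

No hashing is involved, and strings of different lengths are rejected without looking at their bytes.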

If you have lots of strings to compare and the strings are from a finite set, check out string interning.
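For reference, interning means giving each distinct string a small integer once, then comparing the integers instead of the strings. A minimal hand-rolled sketch follows; the `Interner` type is hypothetical, and in practice you would probably pick an existing crate.

```rust
use std::collections::HashMap;

/// Minimal hand-rolled interner (hypothetical, for illustration only).
#[derive(Default)]
struct Interner {
    ids: HashMap<String, u32>,
    strings: Vec<String>,
}

impl Interner {
    /// Returns the existing symbol for `s`, or assigns the next free one.
    fn intern(&mut self, s: &str) -> u32 {
        if let Some(&id) = self.ids.get(s) {
            return id;
        }
        let id = self.strings.len() as u32;
        self.ids.insert(s.to_owned(), id);
        self.strings.push(s.to_owned());
        id
    }

    /// Looks up the original text for a symbol.
    fn resolve(&self, id: u32) -> Option<&str> {
        self.strings.get(id as usize).map(String::as_str)
    }
}
```

Once strings are interned, equality between two of them is a single integer comparison.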

Searching many times on a small set of items can sometimes be slower with a hash. The hash is most likely to lose against a Vec-like structure if it is feasible for OP to put all keys into one long string. But even with an actual Vec there is a good chance the scan is faster if either all keys are allocated at once or all strings have different lengths. In the first case (keys allocated at once) the allocator can put them close together. In the second case (lengths are different) the comparison function will not go any further than comparing lengths, and those are stored in the contiguous region of memory used by the Vec (the length is stored in the String itself), not in the memory the String points to.

Searching many times on a large set of items will almost certainly be faster with a hash, however, even if all string lengths are different (though if you know they will be, it may be wise to use the string lengths as keys, since hashing a usize should be faster).
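The length-as-key idea could look something like this. It is a sketch that only works under the stated assumption that every key has a distinct length; the function names are made up.

```rust
use std::collections::HashMap;

/// Builds an index keyed by string length.
/// Returns None if two keys share a length, because the trick
/// only works when all lengths are distinct.
fn index_by_len<'a>(keys: &[&'a str]) -> Option<HashMap<usize, &'a str>> {
    let mut index = HashMap::with_capacity(keys.len());
    for &key in keys {
        if index.insert(key.len(), key).is_some() {
            return None;
        }
    }
    Some(index)
}

/// Looks up by length, then confirms with a full comparison,
/// since an unrelated query string can have the same length as a key.
fn lookup<'a>(index: &HashMap<usize, &'a str>, query: &str) -> Option<&'a str> {
    index.get(&query.len()).copied().filter(|&key| key == query)
}
```

Note that the final == is still needed: the length only narrows the search down to one candidate.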

It is very hard to say which will be faster in the general case. Also, what counts as a “small” or “large” set of items depends entirely on the system used to run the code.

Thanks, guys, for your opinions.

It's a database cache: I plan to store all DB records in memory, an estimated 1k rows, with the table fields as vector members.

The goal is fast record access by id, or by filtering on any other string field.
So maybe a HashMap really is the better choice for these needs, because I build the index on app launch, then work with it (e.g. update), then save to the DB on app close.
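A rough sketch of such a cache, assuming each record is an id plus a Vec of string fields. The `Record` and `Cache` types and their method names are invented for illustration, and loading from or saving to the DB is left out.

```rust
use std::collections::HashMap;

/// Hypothetical record: an id plus the remaining table fields as strings.
struct Record {
    id: String,
    fields: Vec<String>,
}

/// In-memory cache keyed by record id.
struct Cache {
    by_id: HashMap<String, Record>,
}

impl Cache {
    /// Builds the index once, e.g. at app launch.
    fn load(records: Vec<Record>) -> Self {
        let by_id = records.into_iter().map(|r| (r.id.clone(), r)).collect();
        Cache { by_id }
    }

    /// Direct lookup by id: one hash of the query, no scan.
    fn get(&self, id: &str) -> Option<&Record> {
        self.by_id.get(id)
    }

    /// Overwrites one field of one record; returns false if the id
    /// or the field position does not exist.
    fn update_field(&mut self, id: &str, field: usize, value: &str) -> bool {
        match self.by_id.get_mut(id).and_then(|r| r.fields.get_mut(field)) {
            Some(slot) => {
                *slot = value.to_owned();
                true
            }
            None => false,
        }
    }

    /// Filtering on other fields is still a full scan over all records.
    fn filter_contains(&self, needle: &str) -> Vec<&Record> {
        self.by_id
            .values()
            .filter(|r| r.fields.iter().any(|f| f.contains(needle)))
            .collect()
    }
}
```

With around 1k rows the scan in `filter_contains` is likely cheap enough; only the id lookup benefits from the HashMap here.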