I will probably have fewer than 1000 objects that I'm hashing; I just need them not to collide.
I guess this depends on the hasher, so assuming I'm using fnv, do you think I should be safe? I feel like I should be more than safe, but I just wanted to make sure.
Yeah, I thought of that, but that would require a registration step to make sure they're all unique, and then state to store the ID, which is possible but less convenient in my scenario.
So I was trying to find the best way to fit a hash into a JS number.
There are also From implementations converting all the integer types to a js_sys::BigInt, so it should be pretty easy to convert your hash into a bigint.
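For example, a minimal sketch assuming a u64 hash value and a wasm-bindgen target (the helper name is hypothetical):

```rust
use js_sys::BigInt;

// Hypothetical helper: wrap a 64-bit hash in a JS BigInt.
// js_sys provides From<u64> for BigInt, so this is a direct conversion.
fn hash_to_bigint(hash: u64) -> BigInt {
    BigInt::from(hash)
}
```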
Even with a perfect hashing algorithm, collisions are a matter of probability. Before answering that question, you would need to decide which probability of a collision you are willing to accept.
For the probability to be practically zero (for any practical number of stored objects and any repetition count of the procedure), you should use at least 160 bit hashes (better 256 or 512 bits). Another option is to cope with collisions and handle them properly.
See Birthday_attack#Mathematics on Wikipedia. According to that table, an (ideal) 32-bit hash would collide with a probability of 0.1% once 2900 elements are inserted. Thus in one run out of a thousand you would have a collision.
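As a quick sanity check, here's a minimal sketch of the standard birthday approximation p ≈ 1 − exp(−n(n−1) / (2 · 2^bits)); the function name is mine, not from that table:

```rust
/// Birthday-bound approximation: probability of at least one
/// collision among `n` uniformly random `bits`-bit hashes.
/// p ≈ 1 - exp(-n(n-1) / (2 * 2^bits))
fn collision_probability(n: f64, bits: i32) -> f64 {
    let space = 2f64.powi(bits);
    1.0 - (-n * (n - 1.0) / (2.0 * space)).exp()
}

fn main() {
    // Matches the Wikipedia table: ~0.1% for 2900 items in 32 bits.
    println!("{:.5}", collision_probability(2900.0, 32));
    // 1000 objects in a 64-bit space: vanishingly small.
    println!("{:e}", collision_probability(1000.0, 64));
}
```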
Note that the more often you run the program (with different input), the higher the chance that a collision happens during one of those runs. If your program is used by 1000 people in 1000 scenarios, then the probability that a collision occurs in at least one case will be roughly a million times higher than calculated. Hence I would be very cautious with short hash lengths and would either use longer hashes or handle collisions explicitly.
The hash is derived from the asset filepath, so the value doesn't change between users, which helps cut down on the number of iterations.
I'm a little confused, though, about how a HashMap can get away with a 64-bit hash, and a UUID with only 128 bits, if 160 bits are required to avoid collisions.
Either way, I don't think I need more than 64 bits, because even if I used more bits I'd still be in just as much trouble: Bevy uses 64-bit hashes for its handles.
Unfortunately, it seems that serde_v8 doesn't support deserializing from BigInts, so that might not be an option for me either.
I think for now I'll just use a base64-encoded string to represent the hash in JS; that way I'll be as collision-resistant as Bevy, and hopefully I won't suffer too much from the string-conversion cost. If profiling reveals it to be a problem, I'll revisit it.
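Something like this, assuming the fnv and base64 crates (the helper is just a sketch, not my actual code):

```rust
use std::hash::{Hash, Hasher};

use base64::{engine::general_purpose::URL_SAFE_NO_PAD, Engine as _};
use fnv::FnvHasher;

/// Hash an asset path with FNV and encode the 64-bit result
/// as a short base64 string for use as a JS key.
fn asset_id(path: &str) -> String {
    let mut hasher = FnvHasher::default();
    path.hash(&mut hasher);
    URL_SAFE_NO_PAD.encode(hasher.finish().to_le_bytes())
}
```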
HashMap tolerates collisions. It never assumes two keys are equal based on hash; it always compares with Eq. If entries in the map have the same hash, then they are stored in the same "bucket", and a lookup is a linear search of that bucket. This is why HashMap keys must implement Eq as well as Hash.
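A quick illustration, using a contrived Hash impl that always collides just to make the point:

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Contrived key type: every value hashes identically,
// so all entries land in the same bucket.
#[derive(PartialEq, Eq)]
struct Colliding(u32);

impl Hash for Colliding {
    fn hash<H: Hasher>(&self, state: &mut H) {
        0u64.hash(state); // deliberate universal collision
    }
}

fn main() {
    let mut map = HashMap::new();
    map.insert(Colliding(1), "one");
    map.insert(Colliding(2), "two");
    // Lookups still return the right values, because the map
    // falls back to Eq to distinguish keys within a bucket.
    assert_eq!(map[&Colliding(1)], "one");
    assert_eq!(map[&Colliding(2)], "two");
}
```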
This is what the Hash trait is meant for. If it were meant for collision-free hashing, its output would be bigger than 64 bits.