So performance would increase since hashing is faster than binary-searching. How...

mandarax8 · on April 14, 2024

> Atomic operations are limited to 32-bits.

I'm using 64-bit atomics at work. Are you on an old version of CUDA, or are some operations only supported at 32 bits?

tspeterkim · on April 14, 2024

I misspoke (I got confused with the key limit in my link above).

Atomics work on up to 128 bits (https://docs.nvidia.com/cuda/cuda-c-programming-guide/#atomi...).

Regardless, 128 bits is only 16 bytes, still far short of the 100 bytes a city string can take up.

Sesse__ · on April 14, 2024

If the set of cities is known (as your binary search algorithm assumes), you can insert all of them first, on the CPU. That will resolve all collisions ahead of time, making the structure of the hash table essentially read-only. (Of course, you would still need atomics for the values.)
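A rough host-side sketch of that idea, in plain C++ rather than CUDA, in case it helps. All names here (`Slot`, `CityTable`, `add`, `find`) are made up for illustration. The FNV-1a hash, the linear probing, and the sum/count aggregates are assumptions, not anything from the original code. Keys are written once in the constructor, before any concurrent use. After that, lookups only read them, and only the values are touched with atomics.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical slot layout: the key is written once while building the table,
// only the aggregates are updated concurrently afterwards.
struct Slot {
    std::string city;                      // empty string marks an unused slot
    std::atomic<std::int64_t> sum{0};      // e.g. temperatures in tenths of a degree
    std::atomic<std::uint32_t> count{0};
};

class CityTable {
public:
    // Single-threaded build step: insert every known (non-empty) city name.
    // All collisions are resolved here, so the key layout never changes later.
    explicit CityTable(const std::vector<std::string>& cities)
        : slots_(capacity_for(cities.size())) {
        for (const std::string& c : cities) {
            slots_[probe(c)].city = c;
        }
    }

    // Safe to call from many threads: keys are only read, values use atomics.
    // Returns false if the city was not inserted during the build step.
    bool add(std::string_view city, std::int64_t temp) {
        Slot& s = slots_[probe(city)];
        if (s.city.empty()) return false;
        s.sum.fetch_add(temp, std::memory_order_relaxed);
        s.count.fetch_add(1, std::memory_order_relaxed);
        return true;
    }

    // Returns the slot for a known city, or nullptr for an unknown one.
    const Slot* find(std::string_view city) const {
        const Slot& s = slots_[probe(city)];
        return s.city.empty() ? nullptr : &s;
    }

private:
    // Power-of-two capacity with load factor <= 0.5, so probing always
    // reaches an empty slot and terminates.
    static std::size_t capacity_for(std::size_t n) {
        std::size_t cap = 2;
        while (cap < 2 * n) cap *= 2;
        return cap;
    }

    // 32-bit FNV-1a, an arbitrary choice for this sketch.
    static std::uint32_t hash(std::string_view s) {
        std::uint32_t h = 2166136261u;
        for (unsigned char c : s) {
            h ^= c;
            h *= 16777619u;
        }
        return h;
    }

    // Linear probing: stop at the matching key or at the first empty slot.
    std::size_t probe(std::string_view city) const {
        const std::size_t mask = slots_.size() - 1;
        std::size_t i = hash(city) & mask;
        while (!slots_[i].city.empty() && slots_[i].city != city) {
            i = (i + 1) & mask;
        }
        return i;
    }

    std::vector<Slot> slots_;
};
```

Min and max would need a compare-and-swap loop (or `atomicMin`/`atomicMax` on the GPU) instead of `fetch_add`. On the device the keys would also have to be fixed-size buffers rather than `std::string`.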

convivialdingo · on April 14, 2024

You could hash the city names using something like the djb2a algorithm?
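For reference, djb2a is tiny. Here is a minimal sketch in C++ (the function name and the 32-bit width are my choices for illustration). It starts from 5381 and, for each byte, multiplies by 33 and XORs the byte in. The result fits in a single atomic-width word, but two different names can still collide, so the full name would still need to be compared somewhere.

```cpp
#include <cstdint>
#include <string_view>

// djb2a: h = 5381, then for each byte h = (h * 33) XOR byte.
// Arithmetic wraps modulo 2^32 because h is an unsigned 32-bit integer.
constexpr std::uint32_t djb2a(std::string_view s) {
    std::uint32_t h = 5381u;
    for (unsigned char c : s) {
        h = (h * 33u) ^ c;
    }
    return h;
}
```

For example, `djb2a("")` is 5381 and `djb2a("a")` is 177604.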