[rust-dev] hashmap benchmark
niko at alum.mit.edu
Sun Apr 8 14:21:44 PDT 2012
On 4/8/12 12:15 PM, Stefan Plantikow wrote:
> I was thinking about this, too. One of the state-of-the art algorithms seems to be hopscotch hashing, wikipedia has a quite good introduction to it. Even though it has been developed for concurrent access, it should also be quite good in a single core scenario and has a really low memory footprint (90% full hash table still works reasonably). I was thinking about implementing that for fun once the currently ongoing changes to regions and vectors are complete.
I am totally excited about offering a variety of map abstractions. One
additional thing I would particularly like (which is kind of orthogonal)
is a default map implementation that begins as a straight-up list of
some fixed size and then shifts to another algorithm as the table is
populated. Of course we'd want to test and tune to see where it makes
sense to shift, but I believe that a lot of hash tables are generally
small (but not always) and thus can benefit from changing strategies as
they fill up. In fact I would like to have a similar approach to each
of the basic container types (a default implementation that adjusts and
does the right thing, plus a variety of more specific implementations).
I also think persistent collections would be very useful.
> Hashing algorithms (hopscotch, bloom filters) could greatly benefit from having access to the llvm bit manipulation intrinsics (ctpop, ctlz, cttz, bswap).
I was going to say that we should just build them into the compiler as
intrinsics (after Marijn's work on intrinsics, this should be fairly
straightforward). But Brian's e-mail also looks pretty nice, actually.
In general I'd rather avoid introducing arbitrary LLVM assembly---the
further we can avoid exposing our use of LLVM, the better imo---but
packaging things as "native" functions helps to keep us relatively
portable should we move away from LLVM in the future.
More information about the Rust-dev