Deterministic hash table benchmarks

Jason Orendorff jason.orendorff at gmail.com
Fri Feb 24 15:56:47 PST 2012


On Fri, Feb 24, 2012 at 2:16 PM, David Bruant <bruant.d at gmail.com> wrote:
> "For any large enough number of entries N, a CloseTable with N entries
> allocates either 1.0625x or 2.125x as much memory"
> => Based on your diagram, it seems that it's more often 1.0625 than 2.125.

Yes, it's more often 1.0625.

> However, I'm not sure I understand any of these number. As you point
> out, "CloseTable entries are 50% larger", so I would expect that for a
> large number of entries, CloseTables would be 1.5x bigger than
> OpenTables (maybe more due to the additional pointer table).
> What explains the 1.0625?

An OpenTable always allocates a single power-of-2-sized chunk of memory.

A CloseTable makes two allocations: a power-of-2-sized chunk for the
data, and a smaller one (1/16 the size on 32-bit platforms, 1/8 the
size on 64-bit platforms) for the hash table. So the Close table can
be 1+1/16 as big as an OpenTable, and that's 1.0625 (or 1+1/8=1.125 on
32-bit platforms).

Both tables double in size as the number of entries increase. An
OpenTable doubles each time the load factor reaches 0.75; a CloseTable
doubles when it's completely full. It turns out this causes the
CloseTable to double earlier, and that is why the factor is sometimes
2×1.0625=2.125 (or 2.25 on 64-bit platforms).

I hope this is clear.

-j


More information about the es-discuss mailing list