(no subject)

Lars T Hansen lth at acm.org
Fri Mar 16 02:23:41 PDT 2007


On 3/15/07, Neil Mix <nmix at pandora.com> wrote:
> I was just reading through the hashcodes proposal, I see that the
> proposal allows for the (remote) possibility of collisions between keys.
>
> What jumps out at me is the potential for developers to assume that a
> hashcode is the same as a unique ID (especially if some
> implementations enforce uniqueness while others don't), thus leading
> to subtle bugs.  For that reason, I'd like to weigh in with the
> opinion that requiring uniqueness for hashcodes would be really
> nice.  (Alternatively a separate mechanism for generating unique
> object IDs would be nice.)
>
> (I'm sure there are good reasons why enforcing uniqueness amongst
> keys is a burden on implementers, although I'm unaware of what those
> restrictions are.  Just a layman's perspective.)

The Java spec explicitly states that Object.hashCode does not return
unique values:

"It is not required that if two objects are unequal according to the
equals(java.lang.Object) method, then calling the hashCode method on
each of the two objects must produce distinct integer results."
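A small illustration of that clause, using String rather than a bare Object because its hash function is specified and therefore predictable. The class and method names are made up for this example:

```java
// Illustration only: two strings that are unequal according to equals()
// but for which hashCode() returns the same integer.
class HashCollisionDemo {
    // True when a and b are unequal yet share a hash code.
    static boolean unequalButSameHash(Object a, Object b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        // String.hashCode is defined as s[0]*31^(n-1) + ... + s[n-1],
        // so "Aa" is 65*31 + 97 and "BB" is 66*31 + 66; both are 2112.
        System.out.println(unequalButSameHash("Aa", "BB"));
    }
}
```

Code that treated these hash codes as IDs would conflate the two strings.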

The ES4 draft implies the same behavior, but couches it in more
difficult language (it tries to legislate what implementations ought
to do without having the right to do so).

If an implementation is going to guarantee uniqueness then it probably
needs to maintain a global table of all live objects whose hash codes
have been obtained, to avoid generating duplicate codes when the
global hash code counter wraps around (intrinsic::hashcode returns a
uint).  True object IDs also preclude a popular technique in the
presence of a moving collector: using the address of the object at
the time the hash code is first obtained as the hash code, and
recording that value in the object if it survives a garbage collection.
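To make both costs concrete, here is a rough sketch in Java. Everything in it is hypothetical: UniqueIdAllocator stands in for the global table (with a deliberately tiny ID space so that wraparound is easy to see), and MovableObject simulates address-based hashing with plain integers as addresses. Neither is taken from any real implementation.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch 1: unique IDs drawn from a counter that wraps around.
class UniqueIdAllocator {
    private final int idSpace;                          // distinct IDs available (2^32 for a uint)
    private final Set<Integer> live = new HashSet<>();  // IDs currently held by live objects
    private int next = 0;

    UniqueIdAllocator(int idSpace) { this.idSpace = idSpace; }

    int allocate() {
        if (live.size() == idSpace) {
            throw new IllegalStateException("ID space exhausted");
        }
        // After a wrap, the counter may land on an ID that is still in use,
        // so the table of live IDs has to be consulted.
        while (live.contains(next)) {
            next = (next + 1) % idSpace;
        }
        int id = next;
        live.add(id);
        next = (next + 1) % idSpace;
        return id;
    }

    // A real collector would have to call this when the object dies.
    void release(int id) { live.remove(id); }
}

// Hypothetical sketch 2: address-based hashing under a simulated moving collector.
class MovableObject {
    private int address;
    private boolean hashTaken = false;  // has the hash code ever been requested?
    private boolean hashSaved = false;  // has the original address been recorded?
    private int savedHash;

    MovableObject(int address) { this.address = address; }

    int hashCodeByAddress() {
        if (hashSaved) return savedHash;
        hashTaken = true;
        return address;
    }

    // Called by the simulated collector when it relocates the object.
    void moveTo(int newAddress) {
        if (hashTaken && !hashSaved) {
            savedHash = address;   // keep the hash stable across the move
            hashSaved = true;
        }
        address = newAddress;
    }
}
```

In the second sketch, an object hashed at address 100 and then moved keeps 100 as its hash code, while a new object later allocated at address 100 reports the same value. That is fine for hashing but rules out using the value as an ID.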

IMO we'd be better off following Java here and just spelling out (more)
clearly that you can't use the hash code as an object ID.  The purpose
of intrinsic::hashcode is, after all, to allow fast object-identity
hashing.  The use case for object IDs seems less clear.
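As a sketch of what object-identity hashing needs from the hash code, here is a minimal identity set in Java. IdentitySet is a made-up class for this example; System.identityHashCode plays the role that intrinsic::hashcode would play in ES4, which is an analogy and not something the draft states.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical identity-based set: the hash code only selects a bucket,
// and reference comparison (==) decides membership within the bucket.
class IdentitySet {
    private final List<List<Object>> buckets = new ArrayList<>();

    IdentitySet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    private List<Object> bucketFor(Object o) {
        int h = System.identityHashCode(o);
        return buckets.get(Math.floorMod(h, buckets.size()));
    }

    // Returns true if o was not already present.
    boolean add(Object o) {
        if (contains(o)) return false;
        bucketFor(o).add(o);
        return true;
    }

    boolean contains(Object o) {
        for (Object candidate : bucketFor(o)) {
            if (candidate == o) return true;  // identity, not equals()
        }
        return false;
    }
}
```

Because membership is decided by ==, two distinct objects with the same hash code merely share a bucket. Collisions cost a little speed but never correctness, so the table does not need unique codes.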

--lars



More information about the Es4-discuss mailing list