for-in evaluation order

Dmitry A. Soshnikov dmitry.soshnikov at gmail.com
Mon Dec 27 12:02:47 PST 2010


On 27.12.2010 20:54, Wes Garland wrote:
> On Mon, Dec 27, 2010 at 8:04 AM, Dmitry A. Soshnikov 
> <dmitry.soshnikov at gmail.com <mailto:dmitry.soshnikov at gmail.com>> wrote:
>
>     Can you also show a memory representation (at lower level) of
>     "dense" and "sparse" (?) data. It will help to understand the
>     advantages of choosing the needed order by implementers. It's not
>     required to show it right now, but, if possible -- when you'll
>     have a time to show it.
>
>
> I haven't looked at this code in SpiderMonkey for a very long time, 
> but IIRC, here are the highlights:
>
>     * All ECMAScript values are stored in a type called "jsval", which
>       is basically a tagged variant type
>     * jsvals can represent integers, doubles (firefox 4 - older stores
>       double pointers), string pointers, object pointers, and bools
>     * Firefox 4 uses 64-bit jsvals and a bit more specialization
>       information in the tags; older uses 32-bit jsvals
>     * Key fact: regular arrays are basically just objects, but dense
>       arrays have a highly specialized implementation as of firefox 3
>     * dense arrays are arrays with few holes and numeric indices
>     * dense arrays are represented internally essentially as C arrays
>       of jsval with matching indices. So a lookup is very very cheap.
>       ("matching indices" might be a slight lie, I think there might
>       be an offset for arrays big leading holes etc)
>     * regular object properties are stored in "slots", which are
>       really just elements in a C array of jsvals
>     * there is a map which allows us to look up which slot a property
>       is stored in
>     * this map (and the slots) are populated in first-write
>       (declaration) order
>     * I believe this implementation is the origin of the defacto
>       enumeration order relied on by the web; the enumerator simply
>       traverses the map in storage order
>

Thanks a lot, Wes, indeed. It's very useful.

> So you see, looking up an index in a dense array is really cheap - you 
> take the value of the index, convert it to a C int (usually just a bit 
> shift), multiply it by the size of the jsval, add that to the pointer 
> to the start of the storage, and voila - you have a pointer to the 
> jsval which represents the value stored.

Yep, I supposed a similar optimization with a constant offset (for the 
price of enumeration order).

So (to repeat Andreas' example for the complete understanding):

a = [] // created a pointer to a low-level C-array with constant offset 
for numeric indices

// added values to real C-array indicies
a[2] = 1
a[1] = 1
a[0] = 1

// rebuilding a C-array to a hash-table (slots -- C-array with jsvalues)
a["x"] = 1

for (k in a) // 0, 1, 2, x -- (seems, an effect of rebuilding).

But:

a = [] // C-array

a["x"] = 1 // rebuild to hash-table

// and then the order is already direct
// since anyway there's no optimization in respect of C-array
a[2] = 1
a[1] = 1
a[0] = 1

for (k in a) // x, 2, 1, 0

Right?

> You do not have the overhead of traversing the map to determine the 
> offset, nor the overhead of *storing* the map.
>

Yes, I see.

> Andreas mentioned the possibility of an object representation which 
> has both a map/slots and a dense array component - this is 
> interesting, as it makes representing user-defined array-like objects 
> more efficient

Wait, the last case with first adding a["x"] (and then [2], [1], [0]) 
makes some thirst type of optimized arrays?

> --- I believe that adding a length property in the current 
> implementation forces a non-dense representation of the array.
>

Why? Until there are only index-properties we can change the `length` up 
and down. On the other hand, the `length` then should be stored 
somewhere else.

> I think Mark's suggestion here is very sensible.  My gut tells me that 
> it has a low probability of breaking the web, yet it offers room for 
> implementers to provide a fast path for numeric object indices while 
> offering an opportunity to codify standard practice.
>

Yes, if on the other hand fast arrays with direct constant O(1) offset, 
I see now that the priority of the enumeration order in this respect can 
be decreased.

Thanks again for clarifications,

Dmitry.


> Wes
>
> -- 
> Wesley W. Garland
> Director, Product Development
> PageMail, Inc.
> +1 613 542 2787 x 102

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20101227/d9791a8e/attachment.html>


More information about the es-discuss mailing list