Array length property wrap-around

David-Sarah Hopwood david.hopwood at industrial-designers.co.uk
Wed Nov 19 23:30:25 PST 2008


Kent Hansen wrote:
> Hi,
> What's supposed to happen when one of the built-in methods (e.g.
> Array.prototype.push) tries to assign a value greater than 4294967295 to
> the length property?
> 
> js> a = new Array(4294967295); a.push("foo")
> 0
> 
> i.e. the length becomes 0.

This is a specification bug in the Array.prototype.push algorithm
(section 15.4.4.7), due to the ToUint32 coercion in step 2.

Suppose for the sake of argument that there were no such coercion,
or a coercion to a wider type, for example that step 2 were
something like "Let n be ToNumber(Result(1))." Then:

  Assume first that the 'this' value is a native Array object, i.e.
  the algorithm of section 15.4.5.1 ([[Put]] in ES3, [[ThrowablePut]]
  in the Kona draft) is being used to put array properties.

  In step 13 of the array put algorithm, the result of ToUint32(V),
  where V is the new array length, is compared to the result of
  ToNumber(V), and a RangeError exception is thrown if they are
  different.

  So, the behaviour would be that the Array.prototype.push
  algorithm would run up to the point at which it attempts to set
  index [4294967295] to the next argument. Since this is not a
  valid array index, it would be detected in the above-mentioned
  step 13 (because ToUint32(4294967296) = 0 whereas
  ToNumber(4294967296) = 4294967296), and a RangeError would be
  thrown. This behaviour would be entirely reasonable.

  (A case could be made that Array.prototype.push should detect
  based on the number of arguments and the initial length that
  pushing some argument is *going* to fail, and fail early with
  no effect. Few other algorithms in ES3/3.1 make this strength
  of exception guarantee, and the proposed fixes below do not
  make it. In any case, there are other ways the algorithm can
  throw an exception after side-effects have occurred.)

  Note that step 7 of the Array.prototype.push algorithm is
  redundant in the case of a native Array object, because the
  preceding [[Put]] will already have set the 'length' property
  correctly (if it needs to be set, i.e. if there are more than
  zero arguments). It is only required for the case where the
  algorithm is being used on a generic array-like object.


The ToUint32 coercion in Array.prototype.push, however, causes
a wrap-around, so that the final of n is calculated modulo 2^32
(and so are the intermediate values, which is why no RangeError
occurs). Any array index properties beyond the wrapped value should
then be deleted by the array put algorithm, due to its special case
handling of 'length', which is set by Array.prototype.push in step 7.

> This also happens if you apply push to
> non-Array instances, which don't have the length property constraint
> described in 15.4.

Yes, because the initial ToUint32 coercion also happens in that case.

> Step 5 of 15.4.4.7 says to "increase n by 1", but it does not say that n
> should wrap around to 0 if  the new value can't be represented as a
> 32-bit unsigned integer (the ToUint32 conversion is only done when
> assigning the initial value of n, in step 2).

Hopefully the above explains this. n has already wrapped by that point.

However, this is not the whole story: there is a separate
implementation bug in one or more of the implementations you
are testing with -- the one(s) that have prompt "js>".

> If I interpret the spec correctly, n should become 4294967296, and step
> 7 should then cause a RangeError when trying to assign the new length.
> In the implementations I've tried, this is not the case, and instead you
> get this interesting behavior:
> 
> js > a = new Array(4294967294); a.push("foo", "bar")
> 0
> js> a[4294967294]
> foo
> 
> Oops; the length invariant broke.

Yes, and that can only happen if there is an implementation bug in
the handling of array puts. It is possible that the implementation(s)
optimize 'push' in a way that bypasses the normal array put code;
if so then the bug could be there, otherwise it must be in the
general handling of array [[Put]]/[[ThrowablePut]].

Which implementation(s) did you test with?

> With WebKit:
> 
> jsc > a = new Array(4294967294); a.push("foo", "bar")
> 0
> jsc> a[4294967294]
> undefined
> 
> Oops; all the properties 0,1,2,...,4294967294 were wiped out.

WebKit is "correctly" conforming to the flawed ES3 specification
of Array.prototype.push as described above.

> I also question step 2 of 15.4.4.7. Why is the narrow ToUint32
> conversion necessary; shouldn't it be a variant of ToInteger?

It should either be a wider conversion, or it should be checked
to be in the range 0..2**32 - 1. A ToInteger conversion is not
correct, though. For objects that are not native arrays but are
sufficiently array-like for the semantics of 'push' to make sense,
the initial value of 'length' should represent a non-negative
integer within the contiguous range of integers that can be
represented exactly as a number value. For compatibility, we need
to also accept values that coerce via ToNumber to such an integer.

(Technically, any change in this conversion is a theoretical
incompatibility, because a program could deliberately set an
integer value of 'length' that differs from the intended length
by a multiple of 2^32. I think it's extremely unlikely that any
code would be relying on this -- we have no obligation to maintain
this degree of bug-for-bug compatibility. But the change should go
into Annex E anyway.)

> We know
> that for Array instances, the invariant on the length property holds
> anyway, so the result of the conversion will be the same; but using
> ToUint32 makes the function less generic:
> 
> js> o = { length: 4294967296 }; Array.prototype.push.call(o, "foo")
> 1
> 
> Pretending I hadn't read the spec, I would have expected the above to
> create a property named "4294967296" and increase length by 1; no 32-bit
> conversion imposed.

That's possible, but for simplicity my proposed fix below imposes the
same length limitation on all array-like objects. This matters in the
case of methods like 'map' and 'filter', which create an Array object
that is no longer than their input array-like object. Also, if this
limit were not imposed then there would be another set of checks needed
around 2**53, when integers representable as a number value are no
longer contiguous.

> Looking at the other functions in Array.prototype I see that they do
> ToUint32 on the length too, so this change might become a bit large and
> dangerous.

Yep. This bug occurs in all of the Array methods except toString and concat.

toString avoids it only by delegating to join.

concat avoids it by checking explicitly whether 'this' and each of its
arguments are native Array objects (and so it can assume that their length
is an integer in the correct range); if not then it behaves as if the
object were wrapped in a single-element array.
(The NOTE at the end of section 15.4.4.4 is potentially misleading
because this behaviour is different to any of the other array generics,
and does not allow using array-like objects as if they were native arrays.
It should probably be clarified.)


To limit the size and risk of the change, I suggest defining a
"drop-in replacement" for ToUint32 that can be used in all of the
Array methods. With that approach, it is easy to allow implementations
to increase the maximum length of arrays. (If anyone objects to that
aspect of the changes in particular, it's not necessary --
ArrayLengthLimit below could be replaced with the constant 2**32 - 1.)
Also, no renumbering of steps is necessary.


====
15.4 Array Objects

Array objects give special treatment to a certain class of property names.
Let ArrayLengthLimit be an implementation-defined constant integer
in the range 2**32 - 1 to 2**53 - 1 inclusive. A property name P
(in the form of a string value) is an array index if and only if
ToString(ToNumber(P)) is equal to P and ToNumber(P) is a nonnegative
integer less than ArrayLengthLimit. Every Array object has a 'length'
property whose value is always a nonnegative integer less than or
equal to ArrayLengthLimit. The value of the length property is greater
than the numeric value of the name of every property whose name is an
array index; whenever a property of an Array object is created or changed,
other properties are adjusted as necessary to maintain this invariant.
[existing text retained]
... inherited from its prototype.

Many algorithms in this specification can operate either on Array
objects, or on other objects that have a 'length' property and other
properties whose names are array indices. To allow for this, the
following operator is used to coerce the 'length' property of an
object to an array length:

ToArrayLength(V)

 1. Call ToNumber(V).
 2. If Result(1) is not a nonnegative integer less than or equal to
    ArrayLengthLimit, then throw a RangeError.
 3. Return Result(1).
====

Replace the following paragraph in section 15.4.2.2:

# If the argument len is a Number and ToUint32(len) is equal to len,
# then the length property of the newly constructed object is set to
# ToUint32(len). If the argument len is a Number and ToUint32(len)
# is not equal to len, a RangeError exception is thrown.

with
  If the argument len is a Number, then call ToArrayLength(len).
  If this does not result in a RangeError exception being thrown,
  set the length property of the newly constructed object to the
  result.


Change each of the following:
 - steps 13 and 18 of 15.4.4.4 (Array.prototype.concat)
 - step 5 of 15.4.4.7 (Array.prototype.push)

to "Let the new value of n be ToArrayLength(n+1)."

Change step 7 of 15.4.4.13 to "Call ToString(ToArrayLength(k+Result(3))–1)."
(Alernatively, insert a call to ToArrayLength(k+Result(3)) at step 5,
and use that result in the last two steps for the final length.)

Replace all uses of ToUint32 in sections 15.4.* with ToArrayLength.


I have checked that all of these replacements are appropriate, and
that the resulting algorithms except 'splice', 'indexOf', and
'lastIndexOf' have no remaining overflow bugs.

'splice' is patently too complicated to review when written in the
ECMA-262 algorithm notation; someone should transliterate it to a
real programming language (even ECMAScript counts here ;-) so that
it is easier to check. It looks like it can create an array longer
than its input arrays and so needs a ToArrayLength call somewhere,
but where is anyone's guess. 'indexOf' and 'lastIndexOf' incorrectly
call ToInt32 and generally need more work; I won't attempt to fix
them here.

The resulting 'concat', 'push', and 'unshift' methods can throw a
RangeError because the length of the array they create would be
greater than ArrayLengthLimit; the others (excluding 'splice') can
only throw a RangeError when they operate on an object that is not
a native array and has an invalid length property.


A couple of other changes are also needed outside the main
Array section (15.4):


In section 15.5.4.14 (String.prototype.split), replace

# 3. If limit is undefined, let lim = 23**2 - 1; else
#    let lim = ToUint32(limit).

with
  3. If limit is undefined, let lim = ArrayLengthLimit; else
     let lim = ToArrayLength(limit).


As a minimal change to section 15.3.4.3 (Function.prototype.apply),
replace:

# If argArray is either an array or an arguments object, the function
# is passed the (ToUint32(argArray.length)) arguments argArray[0],
# argArray[1], ..., argArray[ToUint32(argArray.length)–1].

with
  If argArray is either an Array object or an arguments object, then
  let n be ToArrayLength(argArray.length); if this does not throw a
  RangeError, then the function is passed the n arguments argArray[0],
  argArray[1], ..., argArray[n-1].


Alternatively, below is a more ambitious change (desirable in its
own right) that does the following:

 - make apply handle its argArray generically;
 - rewrite apply and call using the algorithm notation;
 - clarify that apply and call return the return value of the called
   function;
 - acknowledge that there may be an implementation-defined limit on
   the number of arguments to a function. (This also needs to be fixed
   in section 11.2.4.)

====
15.3.4.3 Function.prototype.apply (thisArg, argArray)

The apply method takes two arguments, thisArg and argArray, and performs
a function call using the [[Call]] property of the this object, by the
following steps:

1. If IsCallable(this) is false, then throw a TypeError exception.
2. If argArray is null or undefined, then
     a. Call the [[Call]] method of this object, providing thisArg
        as the this value and no arguments.
     b. Return Result(2a).
3. If Type(argArray) is not Object, then throw a TypeError exception.
4. Call the [[Get]] method of argArray with property name "length".
5. Let n be ToArrayLength(Result(4)).
6. If n is greater than an implementation-defined limit on the number
   of arguments to a function, then throw a RangeError exception.
7. For every nonnegative integer k less than n:
     a. Let P_k be ToString(k).
     b. Let arg_k be the result of calling the [[Get]] method of
        argArray with property name P_k.
8. Call the [[Call]] method of this object, providing thisArg as the
   this value and arg_0, arg_1, ... arg_{n-1} as the arguments.
9. Return Result(8).

The length property of the apply method is 2.


15.3.4.4 Function.prototype.call (thisArg [ , arg_0 [ , arg_1, ... ] ] )

The call method takes one or more arguments, thisArg and (optionally)
arg_0, arg_1 etc., and performs a function call using the [[Call]]
property of the object, by the following steps:

1. If IsCallable(this) is false, then throw a TypeError exception.
2. Call the [[Call]] method of this object, providing thisArg as the
   this value, and any remaining arguments to the call method starting
   with arg_0 (if provided) as the arguments.
3. Return Result(2).

The length property of the call method is 1.
====


> Still, the undefined behavior /  data loss irks me.

The behaviour is defined, just wrong (*and* array puts are apparently
misimplemented).

-- 
David-Sarah Hopwood



More information about the Es-discuss mailing list