Wiki updates for String, Number and Math libraries

Norbert Lindenberg ecmascript at norbertlindenberg.com
Thu Nov 17 19:14:08 PST 2011


Hi Luke,

For String.prototype.toArray, I didn't propose different behavior. The part of my message that you omitted continues "The function should be named to clearly indicate that it returns an array of UTF-16 code units. This also allows us to offer a parallel function that returns an array of code points."
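
To make the distinction concrete, here is a minimal sketch of the two parallel functions. The names toCodeUnitArray and toCodePointArray are hypothetical and not part of the proposal, and the code point version assumes ES2015 string iteration, which is not available in ES5:

```javascript
// Hypothetical helpers for illustration only; neither name is proposed anywhere.

// Returns one length-1 string per UTF-16 code unit, splitting surrogate pairs.
function toCodeUnitArray(s) {
  const out = [];
  for (let i = 0; i < s.length; i++) {
    out.push(s.charAt(i));
  }
  return out;
}

// Returns one string per code point. Assumes ES2015 string iteration,
// which keeps well-formed surrogate pairs together.
function toCodePointArray(s) {
  return Array.from(s);
}

// "a", then U+1F600 (a supplementary character), then "b".
const sample = "a\uD83D\uDE00b";
const units = toCodeUnitArray(sample);   // 4 elements: "a", "\uD83D", "\uDE00", "b"
const points = toCodePointArray(sample); // 3 elements: "a", "\uD83D\uDE00", "b"
```

With a name that says "code units", the four-element result is what a caller would expect, and the three-element result has an obvious home in the parallel function.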

How about toCharArray, seeing that we already have String.prototype.charAt, which returns a string containing a single UTF-16 code unit?

As to the element type, "length 1 string with a single 16-bit unsigned value" would be clearer than (and, at least to me, different from) "character".
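
As a rough illustration of why the two wordings differ, the existing charAt and charCodeAt functions already expose both element types for the same position. This is only a sketch using current String functions; it makes no claim about how toArray would be specified:

```javascript
// U+1F600 encoded as a surrogate pair of two UTF-16 code units.
const pair = "\uD83D\uDE00";

// A length-1 string holding a single 16-bit unsigned value.
const asString = pair.charAt(0);

// The 16-bit unsigned value itself, as a Number (0xD83D).
const asNumber = pair.charCodeAt(0);

// Converting between the two representations is explicit.
const roundTrip = String.fromCharCode(asNumber);

const summary = {
  stringType: typeof asString,         // "string"
  stringLength: asString.length,       // 1
  numberType: typeof asNumber,         // "number"
  sameUnit: roundTrip === asString     // true
};
```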

Thanks,
Norbert


On Nov 17, 2011, at 13:28, Luke Hoban wrote:

>>> 1) String.prototype.reverse(), as proposed, corrupts supplementary characters.
> 
> It was agreed at the meeting yesterday that this concern is significant, and that reverse lacks sufficiently compelling use cases, so it should not be included.
> 
>>> 2) String.prototype.toArray(), as proposed, breaks up the surrogate pairs representing supplementary characters and returns each UTF-16 code unit separately.
> 
> This behaviour is consistent with the rest of the Array.prototype functions, and we shouldn't diverge on a case-by-case basis. We may want to consider a set of String APIs that recognize Unicode characters instead of code units, but that would be a separate strawman to pursue.
> 
>>> 3) String.prototype.toArray() ... fills the array with elements defined as "the character at position n in S", which according to clause 6 would mean 16-bit unsigned values. It seems there needs to be a conversion to the intended type of the elements.
> 
> The resulting array will contain those same "characters", which will each be length 1 strings with a single 16-bit unsigned value.
> 
> Luke


