Wiki updates for String, Number and Math libraries

Norbert Lindenberg ecmascript at norbertlindenberg.com
Thu Nov 17 18:50:37 PST 2011


Fortunately, we dropped reverse at the TC39 meeting yesterday; nobody could say who would use it.

I brought up combining character sequences as a concern for the other proposed functions (startsWith etc.). There, the majority opinion was that these functions should follow the model of the existing String functions, which ignore the semantics of Unicode characters. It was suggested that there should eventually be a parallel set of Unicode-aware functions; Mark Miller suggested "WString".
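
To illustrate the concern, here is a minimal sketch of what code-unit semantics mean for a prefix test. It assumes an engine that provides String.prototype.startsWith and String.prototype.normalize (both standardized later, in ES2015); the sample strings and variable names are made up for illustration.

```javascript
// "école" spelled two ways: with a combining acute accent (U+0301) after "e",
// and with the precomposed character U+00E9.
const decomposed = "e\u0301cole";
const precomposed = "\u00E9cole";

// The comparison is done on UTF-16 code units, so the base letter "e" alone
// counts as a prefix of the decomposed form even though a reader sees "é".
const matchesDecomposed = decomposed.startsWith("e");   // true

// The precomposed form starts with a different code unit, so it does not match.
const matchesPrecomposed = precomposed.startsWith("e"); // false

// Normalizing to NFC first composes e + U+0301 into U+00E9, which makes the
// two spellings behave the same way.
const matchesAfterNfc = decomposed.normalize("NFC").startsWith("e"); // false
```

A Unicode-aware variant would presumably have to treat the combining sequence as a unit; the normalization step above is only one possible workaround, not something that was proposed in the meeting.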

Norbert


On Nov 17, 2011, at 18:35 , Gillam, Richard wrote:

>> 1) String.prototype.reverse(), as proposed, corrupts supplementary characters. Clause 6 of Ecma-262 redefines the word "character" as "a 16-bit unsigned value used to represent a single 16-bit unit of text", that is, a UTF-16 code unit. In contrast, the phrase "Unicode character" is used for Unicode code points. For reverse(), this means that the proposed spec will reverse the sequence of the two UTF-16 code units representing a supplementary character, resulting in corruption. If this function is really needed (is it? for what?), it should preserve the order of surrogate pairs, as does java.lang.StringBuilder.reverse:
>> http://download.oracle.com/javase/7/docs/api/java/lang/StringBuilder.html#reverse()
> 
> It's actually worse than this: it'll also reverse the order of combining character sequences, causing any combining characters to attach to a different base character than they did in the original string: a-accent-e, when "accent" is a combining accent, means the accent is on the a; reversing the string would put the accent on the e.
> 
> --Rich Gillam
> 
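
The two failure modes described in the quoted messages can be sketched as follows. This is an illustration only, not the proposed spec text: reverseCodeUnits and reversePreservingSurrogates are hypothetical helper names, and the second one only approximates the surrogate-preserving behavior described for java.lang.StringBuilder.reverse.

```javascript
// Naive reversal by UTF-16 code unit, which is what the objection is about.
function reverseCodeUnits(s) {
  return s.split("").reverse().join("");
}

// Hypothetical helper: reverse by code unit, then swap back every surrogate
// pair that the reversal left in low-high order.
function reversePreservingSurrogates(s) {
  const units = s.split("").reverse();
  for (let i = 0; i < units.length - 1; i++) {
    const first = units[i].charCodeAt(0);
    const second = units[i + 1].charCodeAt(0);
    const firstIsLow = first >= 0xDC00 && first <= 0xDFFF;
    const secondIsHigh = second >= 0xD800 && second <= 0xDBFF;
    if (firstIsLow && secondIsHigh) {
      units[i] = String.fromCharCode(second);
      units[i + 1] = String.fromCharCode(first);
      i++; // skip the pair that was just repaired
    }
  }
  return units.join("");
}

// U+1F600 is a supplementary character, stored as the pair D83D DE00.
const withSupplementary = "a\uD83D\uDE00b";
const corrupted = reverseCodeUnits(withSupplementary);           // "b\uDE00\uD83Da"
const preserved = reversePreservingSurrogates(withSupplementary); // "b\uD83D\uDE00a"

// Combining sequences are still broken: the accent (U+0301) followed "a" in
// the input but follows "e" in the output.
const accentMoved = reversePreservingSurrogates("a\u0301e");      // "e\u0301a"
```

Fixing the surrogate pairs alone does not address the combining-character problem; that would require reversing by combining character sequence rather than by code unit or code point.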


