Questions regarding ES6 Unicode regular expressions

Norbert Lindenberg ecmascript at lindenbergsoftware.com
Tue Aug 26 13:45:28 PDT 2014


On Aug 26, 2014, at 11:15 , Mathias Bynens <mathias at qiwi.be> wrote:

> On 26 Aug 2014, at 19:01, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:

>> I see one remaining issue:
>> In ES5 (and ES6): `/[a-z]/i` does not match U+017F (ſ) or U+212A (K) because the ES canonicalization algorithm excludes mappings from code points > 127 whose toUpperCase result is a code point < 128.
>> However, as currently spec'ed, the ES6 canonicalization algorithm for /u RegExps does not include that >127/<128 exclusion.  It maps U+017F to "S" which matches.
>> This is probably a minor variation from the ES5 behavior, but we should be sure it is a desirable and tolerable change, as we presumably could also apply the >127/<128 filter to /u canonicalization.
> 
> This is a useful feature, and the explicit opt-in makes the small back-compat break acceptable IMHO.

I’d say the explicit opt-in means that there is no backwards compatibility issue.
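
To make the difference concrete, here is a minimal sketch of the two behaviors discussed above. It assumes an engine that implements the "u" flag with Unicode case folding as described in this thread; the constant and variable names are illustrative only and do not come from the spec.

```javascript
// Illustrative names (assumptions, not from the spec or the thread).
const LONG_S = "\u017F"; // LATIN SMALL LETTER LONG S
const KELVIN = "\u212A"; // KELVIN SIGN

// Without "u": canonicalization uses toUpperCase, but refuses to map a
// code point > 127 to a code point < 128, so neither character matches.
const legacy = /[a-z]/i;

// With "u": canonicalization uses Unicode case folding with no such
// exclusion, so both characters fold into the ASCII range and match.
const unicode = /[a-z]/iu;

const results = {
  legacyLongS: legacy.test(LONG_S),   // false
  legacyKelvin: legacy.test(KELVIN),  // false
  unicodeLongS: unicode.test(LONG_S), // true
  unicodeKelvin: unicode.test(KELVIN) // true
};

console.log(results);
```

Existing patterns without the "u" flag keep their ES5 results; only patterns that explicitly add the flag see the new matches.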

I removed the exclusion based on input from Erik Corry on es-discuss:

https://mail.mozilla.org/pipermail/es-discuss/2012-March/021249.html
https://mail.mozilla.org/pipermail/es-discuss/2012-March/021306.html

At the March 2012 TC39 meeting shortly afterwards, Waldemar explained the motivation for the exclusion, but Unicode case folding was nonetheless approved for use with the “u” flag:

https://mail.mozilla.org/pipermail/es-discuss/2012-March/021919.html

Norbert

