Questions regarding ES6 Unicode regular expressions
ecmascript at lindenbergsoftware.com
Tue Aug 26 13:45:28 PDT 2014
On Aug 26, 2014, at 11:15 , Mathias Bynens <mathias at qiwi.be> wrote:
> On 26 Aug 2014, at 19:01, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:
>> I see one remaining issue:
>> In ES5 (and ES6): `/[a-z]/i` does not match U+017F (ſ) or U+212A (K) because the ES canonicalization algorithm declines to map code points > 127 whose toUpperCase result is a code point < 128.
>> However, as currently spec'ed, the ES6 canonicalization algorithm for /u RegExps does not include that >127/<128 exclusion. It maps U+017F to "S" which matches.
>> This is probably a minor variation from the ES5 behavior, but we should be sure it is a desirable and tolerable change, since we presumably could also apply the >127/<128 filter to /u canonicalization.
> This is a useful feature, and the explicit opt-in makes the small back-compat break acceptable IMHO.
I’d say the explicit opt-in means that there is no backwards compatibility issue.
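To make the difference concrete, here is a minimal sketch of the behavior discussed above. It assumes an engine that implements the "u" flag with Unicode case folding and no >127/<128 exclusion; the variable names are mine and purely illustrative.

```javascript
// U+017F LATIN SMALL LETTER LONG S and U+212A KELVIN SIGN,
// written as escapes so the source stays ASCII-only.
const longS = "\u017F";
const kelvin = "\u212A";

// Without the u flag: ES5-style canonicalization, which refuses to map
// a code point above 127 to one below 128.
const legacy = /[a-z]/i;

// With the u flag: canonicalization without that exclusion (assumed here
// to be Unicode simple case folding).
const unicode = /[a-z]/iu;

const results = {
  legacyLongS: legacy.test(longS),
  legacyKelvin: legacy.test(kelvin),
  unicodeLongS: unicode.test(longS),
  unicodeKelvin: unicode.test(kelvin),
};

console.log(results);
```

Since the pattern only changes behavior when the author adds the "u" flag, existing `/[a-z]/i` patterns keep their ES5 results.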
I removed the exclusion based on input from Erik Corry on es-discuss:
At the March 2012 TC39 meeting shortly after, Waldemar explained the motivation for the exclusion, but Unicode case folding was approved with the “u” flag: