Internationalization: Normalization and canonical equivalence in string comparison
Norbert Lindenberg
ecmascript at norbertlindenberg.com
Tue Jun 19 07:45:18 PDT 2012
The test is at
http://norbertlindenberg.com/ecmascript/ESTest.html (and .js).
The strings I used are:
["o\u0308", "ö"],
["ä\u0323", "a\u0323\u0308"], // requires reordering
["a\u0308\u0323", "a\u0323\u0308"], // requires reordering
["ạ\u0308", "a\u0323\u0308"],
["ä\u0306", "a\u0308\u0306"],
["ă\u0308", "a\u0306\u0308"],
["\u1111\u1171\u11b6", "퓛"], // jamo/hangul
["Å", "Å"]
Results:
Safari on Mac, iOS: Fail for comparisons that require reordering nonspacing marks within strings; pass for others.
Firefox, Opera, Explorer on Windows: Fail for jamo/hangul comparison; pass for others.
Firefox, Node on Mac; Chrome on Mac, Windows: Fail for all.
Opera on Mac: Passes for all.
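
As an illustration only (this is not the code in ESTest.js), a check over pairs like the ones listed above might look like the sketch below. The helper names canonicallyEquivalent and checkPairs are hypothetical, and the sketch assumes an engine that provides String.prototype.normalize, which was standardized in ES2015, after this message was written. The NFC comparison supplies the expected answer; localeCompare supplies what the implementation under test actually reports.

```javascript
// Hypothetical helper: treats two strings as canonically equivalent
// when their NFC normal forms are identical.
// Assumes String.prototype.normalize is available (ES2015 and later).
function canonicallyEquivalent(a, b) {
  return a.normalize("NFC") === b.normalize("NFC");
}

// A subset of the pairs from the message, written with escapes so the
// code points are unambiguous regardless of how the file is encoded.
const pairs = [
  ["o\u0308", "\u00f6"],               // combining diaeresis vs. precomposed
  ["a\u0308\u0323", "a\u0323\u0308"],  // requires reordering of marks
  ["\u1111\u1171\u11b6", "\ud4db"],    // jamo sequence vs. Hangul syllable
];

// Hypothetical helper: for each pair, record the expected equivalence
// and the raw value returned by localeCompare in the current engine.
function checkPairs(list) {
  return list.map(([a, b]) => ({
    equivalent: canonicallyEquivalent(a, b),
    localeCompareResult: a.localeCompare(b),
  }));
}

console.log(checkPairs(pairs));
```

In a conforming implementation, localeCompareResult would be 0 wherever equivalent is true; the value actually returned depends on the engine and platform.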
Norbert
On Jun 19, 2012, at 7:30, Rick Waldron wrote:
>
>
> On Tue, Jun 19, 2012 at 12:36 AM, Norbert Lindenberg <ecmascript at norbertlindenberg.com> wrote:
> The ECMAScript Internationalization API Specification currently has normalization as an optional feature in collation. However, it requires that the compare function "return 0 when comparing Strings that are considered canonically equivalent by the Unicode standard". Canonical equivalence, I thought, is usually implemented through normalization. Does it make sense to keep normalization as a separate and optional feature then? Is anybody planning to implement canonical equivalence through other mechanisms, such that the lack of normalization would be visible in the comparison of non-equivalent strings?
>
> BTW, the requirement that canonically equivalent strings compare as equal has been part of the specification of String.prototype.localeCompare since ES3. When testing with a handful of string pairs pulled from chapter 3 of the Unicode Standard and from UTS 10, however, I found that only Opera on the Mac detects their equivalence correctly. Firefox on the Mac and the V8 systems (Chrome, Node) fail to detect any equivalence; Safari, Explorer and the Windows versions of Opera and Firefox detect some and miss others. Obviously people haven't been paying much attention to localeCompare...
>
>
> I don't know enough about the first part of your message to be any use; I am, however, interested in the second part - will you be publishing your tests and findings?
>
> Rick
>
> Norbert