ECMAScript collation question

Norbert Lindenberg ecmascript at norbertlindenberg.com
Thu Aug 30 18:30:04 PDT 2012


I changed the subject because this question also affects the ECMAScript Language Specification.

Section 15.5.4.9, String.prototype.localeCompare (that), has said since ES3: "the function is required ... and to return 0 when comparing two strings that are considered canonically equivalent by the Unicode standard."
http://ecma-international.org/ecma-262/5.1/#sec-15.5.4.9

I assume this requirement goes back to Unicode Technical Standard #10, Unicode Collation Algorithm, whose conformance clause C1 says (and has said since 1999): "Given a well-formed Unicode Collation Element Table, a conformant implementation shall replicate the same comparisons of strings as those produced by Section 4, Main Algorithm. In particular, a conformant implementation must be able to compare any two canonical-equivalent strings as being equal, for all Unicode characters supported by that implementation."
http://unicode.org/reports/tr10/#Conformance
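
To make the requirement concrete, here is a minimal sketch using the pair from the failing test quoted below. It assumes an engine that provides String.prototype.normalize, which is not part of ES5.1, and compareNormalized is a hypothetical helper of mine, not something either specification defines. It only illustrates what "canonically equivalent" means for this pair and one way a caller could force the required result regardless of the collator's defaults.

```javascript
// The pair from the failing test: precomposed "ä" (U+00E4) followed by
// combining dot below (U+0323), versus "a" + U+0323 + combining diaeresis (U+0308).
const left = "\u00E4\u0323";
const right = "a\u0323\u0308";

// The two strings are different code unit sequences...
const sameCodeUnits = left === right;

// ...but they normalize to the same sequence, which is what canonical
// equivalence means. Assumption: String.prototype.normalize is available.
const canonicallyEquivalent = left.normalize("NFD") === right.normalize("NFD");

// Hypothetical helper: normalize both operands before comparing, so the
// result does not depend on whether the underlying collator normalizes.
function compareNormalized(a, b, locales) {
  return a.normalize("NFC").localeCompare(b.normalize("NFC"), locales);
}
```

With this helper, compareNormalized(left, right) compares two identical strings. Whether left.localeCompare(right) also returns 0 is exactly the implementation-dependent behavior in question.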

How can the default behavior of ICU be reconciled with this conformance clause?

I brought up the issue of collation and normalization before, but didn't get much feedback:
https://mail.mozilla.org/pipermail/es-discuss/2012-June/thread.html#23568

Thanks,
Norbert


On Aug 30, 2012, at 15:17 , Nebojša Ćirić wrote:

> Hi,
>  my implementation fails this collation test:
> 
> intl402/ch10/10.3/10.3.2_CE.js
> 
> for this pair (a+umlaut+underdot):
> 
> "ä\u0323", "a\u0323\u0308"
> 
> If I turn normalization on, the test passes.
> 
> Mandatory normalization adds processing cost (up to 30% slower in ICU). The ICU team decided to skip normalization for some locales where they don't expect problematic characters to occur.
> 
> My question is: do we want to normalize all strings by default in the compare() method, or not? I think we said no to default normalization at one of the i18n meetings, but I am not 100% sure.
> 
> -- 
> Nebojša Ćirić


