Unicode normalization

Norbert Lindenberg ecmascript at norbertlindenberg.com
Tue May 29 17:44:49 PDT 2012


This is for the Language Specification, not the Internationalization API Specification.

The assumptions are in the Language Specification, so they have to be addressed there.

A normalization API can live in either the Language Specification or the Internationalization API. If we keep it simple (just this one function), I think it can easily be added to String.prototype. More fine-grained functionality (like in ICU) would have to go into the Internationalization API (v2). The two are not mutually exclusive.
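
As a rough sketch of what such a single function could look like in use: the snippet below assumes it is exposed as String.prototype.normalize(form) taking a form name such as "NFC", which is how current engines expose it. The helper name equalNFC is hypothetical and only there for illustration. It shows that the language itself compares code units without normalizing, and that an application can normalize explicitly where it needs to.

```javascript
// Two canonically equivalent spellings of "é".
const precomposed = "\u00E9"; // U+00E9 LATIN SMALL LETTER E WITH ACUTE
const decomposed = "e\u0301"; // "e" followed by U+0301 COMBINING ACUTE ACCENT

// String comparison works on code units and does not normalize,
// so the two spellings compare as different.
const equalRaw = precomposed === decomposed;

// Hypothetical helper (not from the proposal): compare two strings
// after normalizing both to NFC.
function equalNFC(a, b) {
  return a.normalize("NFC") === b.normalize("NFC");
}

const equalNormalized = equalNFC(precomposed, decomposed);

console.log(equalRaw, equalNormalized); // false true
```

Only code that cares about canonical equivalence pays for the normalization; everything else keeps comparing code units as before.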

Norbert


On May 29, 2012, at 17:36, Mark Davis ☕ wrote:

> This is for v2, right?
> 
> Mark
> 
> — Il meglio è l’inimico del bene —
> 
> 
> 
> On Tue, May 29, 2012 at 5:34 PM, Norbert Lindenberg <ecmascript at norbertlindenberg.com> wrote:
> The ECMAScript Language Specification 5.1 assumes that source text is in Unicode normalization form C (NFC), but says nothing that would actually make it so. Implementations, as far as I can tell, have also chosen to just "assume". The assumption is partially based on the Character Model for the World Wide Web: Normalization, which recommends early normalization to NFC but never became a standard.
> 
> I'm proposing to correct this by
> - removing the invalid assumptions from the specification,
> - adding a normalization function so that applications can normalize text where needed.
> 
> http://wiki.ecmascript.org/doku.php?id=strawman:unicode_normalization
> 
> Comments?
> 
> Regards,
> Norbert
> 


