Full Unicode strings strawman
Mark Davis ☕
mark at macchiato.com
Mon May 16 15:36:01 PDT 2011
*— Il meglio è l’inimico del bene —*
On Mon, May 16, 2011 at 15:27, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:
> See the section of the proposal about String.prototype.charCodeAt
> On May 16, 2011, at 2:20 PM, Mike Samuel wrote:
> > Allen, could you clarify something.
> > When the strawman says without mentioning "codepoint"
> > "The String type is the set of all finite ordered sequences of zero or
> > more 16-bit\b\b\b\b\b\b 21-bit unsigned integer values (“elements”)."
> > does that mean that String.charCodeAt(...) can return any value in the
> > range [0, 1 << 21)?
> > When the strawman says using "codepoint"
> > "SourceCharacter ::
> > any Unicode codepoint"
> > that excludes the blocks reserved for surrogates?
> Does the Unicode spec. refer to those surrogate codes as "codepoints"? My
> understanding is that it does not, but I could be wrong. My intent is that
> the answer is no.
Yes, it does. See my message, with a pointer to the Unicode glossary.
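To make the distinction concrete, here is a minimal sketch. The helper names are my own for illustration and do not come from the strawman. It treats every integer from 0 through 0x10FFFF as a code point, treats the surrogate range U+D800..U+DFFF as a subset of those code points, and calls the remaining code points scalar values, following the Unicode glossary terms.

```javascript
// Hypothetical helpers for illustration; the names are not from the strawman.
const MAX_CODE_POINT = 0x10FFFF;

// Any integer in [0, 0x10FFFF] is a Unicode code point.
function isCodePoint(n) {
  return Number.isInteger(n) && n >= 0 && n <= MAX_CODE_POINT;
}

// Surrogate code points occupy U+D800..U+DFFF.
function isSurrogate(n) {
  return n >= 0xD800 && n <= 0xDFFF;
}

// Unicode scalar values are the code points that are not surrogates.
function isScalarValue(n) {
  return isCodePoint(n) && !isSurrogate(n);
}
```

Under these definitions 0xD800 is a code point but not a scalar value. Note also that a 21-bit element can hold 1 << 21 = 2,097,152 distinct values, while only 0x110000 = 1,114,112 of them are code points, so the range [0, 1 << 21) includes values that are not code points at all.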
> Note that this section is defining the input alphabet of the grammar.
> It has nothing to do with the actual character encodings used for source
> programs. The production essentially says that the input alphabet of
> ECMAScript is all defined Unicode characters.
That would also not be correct. The defined *characters* are only about 109K
(more if you consider private use); nowhere near the number of code points,
because there are over 800K code points that are reserved for the allocation
of *future* characters. For a breakdown, see
Sorry to seem picky, but we have found over time that you have to be very
careful about the use of terms. The term "character" is especially fraught.
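As a rough check on the gap between code points and defined characters, the arithmetic can be sketched as follows. The constant names are mine, and the assigned-character figure is only the round "about 109K" number quoted above, so the final result is approximate.

```javascript
// Illustrative arithmetic; constant names are hypothetical.
const TOTAL_CODE_POINTS = 0x10FFFF + 1;        // 1,114,112 code points in all
const SURROGATES = 0xDFFF - 0xD800 + 1;        // 2,048 surrogate code points

// Noncharacters: U+FDD0..U+FDEF (32) plus the last two
// code points of each of the 17 planes (34).
const NONCHARACTERS = 32 + 2 * 17;             // 66

// Private use: U+E000..U+F8FF in the BMP, plus planes 15 and 16
// minus their two trailing noncharacters each.
const PRIVATE_USE = (0xF8FF - 0xE000 + 1) + 2 * 65534; // 137,468

// Round figure taken from the message above, not an exact count.
const ASSIGNED_APPROX = 109000;

// Code points left over for future character allocation (approximate).
const reservedApprox =
  TOTAL_CODE_POINTS - SURROGATES - NONCHARACTERS - PRIVATE_USE - ASSIGNED_APPROX;

console.log(reservedApprox); // 865530
```

With these assumptions the remainder comes to roughly 865K, consistent with "over 800K" reserved code points.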
> The actual encoding of source programs (both external and internal) is
> up to the implementation and the host environment. (the string input to
> eval is an exception to this).