New full Unicode for ES6 idea
wes at page.ca
Mon Feb 20 04:19:16 PST 2012
On 20 February 2012 00:45, Allen Wirfs-Brock <allen at wirfs-brock.com> wrote:
> 2) Allow invalid unicode characters in strings, and preserve them over
> concatenation – ("\uD800" + "\uDC00").length == 2.
> I think 2) is the only reasonable alternative.
I think so, too -- especially as any sequence of Unicode code points --
including invalid and reserved code points -- constitutes a valid Unicode
string, according to my recollection of the Unicode specification.
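For illustration, here is a minimal sketch of what 2) implies, assuming an engine that stores strings as sequences of 16-bit code units. The variable names are mine and are not from any proposal text:

```javascript
// Two lone surrogates; each is an "invalid" string on its own.
var high = "\uD800"; // lone high (lead) surrogate
var low = "\uDC00";  // lone low (trail) surrogate

// Under option 2), concatenation keeps both code units untouched.
var joined = high + low;

var joinedLength = joined.length;      // two code units
var firstUnit = joined.charCodeAt(0);  // still 0xD800
var secondUnit = joined.charCodeAt(1); // still 0xDC00

// Together the two units happen to form the UTF-16 encoding of U+10000.
var codePoint = (firstUnit - 0xD800) * 0x400 + (secondUnit - 0xDC00) + 0x10000;
```

Nothing is rejected or normalized along the way; the halves only become meaningful once they sit next to each other.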
In addition to the reasons you listed, it should also be noted that:
- 2) is cheaper to implement
- 2) keeps more old code working; ignoring the examples where developers
use String as a uint16 array, there are also the cases where developers scan
strings for 0xD800. 0xD800 is a reserved (surrogate) code point.
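The sort of old code I have in mind looks roughly like the sketch below. `countCodePoints` is a hypothetical helper, not taken from any real library; it assumes strings expose raw 16-bit code units through `charCodeAt`:

```javascript
// Hypothetical legacy helper: counts code points by scanning for
// surrogate pairs in the raw 16-bit code units.
function countCodePoints(str) {
  var count = 0;
  for (var i = 0; i < str.length; i++) {
    var unit = str.charCodeAt(i);
    // High surrogates occupy 0xD800..0xDBFF.
    if (unit >= 0xD800 && unit <= 0xDBFF && i + 1 < str.length) {
      var next = str.charCodeAt(i + 1);
      // A following low surrogate (0xDC00..0xDFFF) completes the pair.
      if (next >= 0xDC00 && next <= 0xDFFF) {
        i++; // skip the low half so the pair counts once
      }
    }
    count++; // lone surrogates are counted as one code point each
  }
  return count;
}
```

Code like this only keeps working if surrogate code units stay visible in strings, which is what 2) preserves.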
> I don't think 1) would be a very good choice, if for no other reason than that the
> set of valid unicode characters is a moving target that you wouldn't want
> to hardwire into either the ES specification or implementations.
To play the devil's advocate, I could point out that the spec language
could say something about reserved code points. Those code points are
reserved because, IIRC, they are not representable in UTF-16; they include
the ranges for the surrogate pairs.
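As a rough sketch of what such spec language would have to check, assuming it only concerned the surrogate ranges (`hasLoneSurrogate` is a name I made up, not anything from a draft):

```javascript
// Hypothetical check: true if the string contains a surrogate code unit
// that is not part of a well-formed high/low pair.
function hasLoneSurrogate(str) {
  for (var i = 0; i < str.length; i++) {
    var unit = str.charCodeAt(i);
    if (unit >= 0xD800 && unit <= 0xDBFF) {
      // High surrogate: must be followed by a low surrogate.
      var next = i + 1 < str.length ? str.charCodeAt(i + 1) : 0;
      if (next >= 0xDC00 && next <= 0xDFFF) {
        i++; // well-formed pair, skip the low half
        continue;
      }
      return true;
    }
    if (unit >= 0xDC00 && unit <= 0xDFFF) {
      // Low surrogate with no preceding high surrogate.
      return true;
    }
  }
  return false;
}
```

The surrogate ranges are fixed, so a check like this would not be a moving target, but it would still have to run on every string-producing operation.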
Wesley W. Garland
Director, Product Development
+1 613 542 2787 x 102