Full Unicode strings strawman

Shawn Steele Shawn.Steele at microsoft.com
Mon May 16 17:28:59 PDT 2011


> I think you'll find that the actual JS engines are currently UCS-2 centric.
> The surrounding browser environments are doing the UTF-16 interpretation.
> That's why you see 𐀀 instead of �� in browser generated display output.

There’s no difference.  I wouldn’t call Windows C++ WCHAR “UCS-2”; however, if wc[0] = 0xD800 and wc[1] = 0xDC00, it’s going to act like there’s a U+10000 character at wc[0], and wc[1] on its own is going to be meaningless.  Which is exactly how JavaScript behaves today.  The JavaScript engines don’t care if it’s UCS-2 or UTF-16 because they aren’t doing anything meaningful with the difference, except not supporting native recognition of code points >= 0x10000 (that is, anything outside the BMP).
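
To make that concrete, here is a rough sketch of what a script sees for the same two code units. The helper name codePointFromPair is made up for illustration and is not an engine or library API; it just applies the standard UTF-16 surrogate arithmetic, and it assumes it is handed a well-formed high/low pair.

```javascript
// Hypothetical helper (not an engine API): combine a UTF-16 surrogate
// pair into the code point it encodes. Assumes hi is in 0xD800..0xDBFF
// and lo is in 0xDC00..0xDFFF; no validation is done here.
function codePointFromPair(hi, lo) {
  return (hi - 0xD800) * 0x400 + (lo - 0xDC00) + 0x10000;
}

// The same two code units as the wc[0] / wc[1] example above.
var s = "\uD800\uDC00";

var unitCount = s.length;      // counts 16-bit code units, not characters
var first = s.charCodeAt(0);   // the high surrogate
var second = s.charCodeAt(1);  // the low surrogate, meaningless on its own
var decoded = codePointFromPair(first, second);

console.log(unitCount, first.toString(16), second.toString(16), decoded.toString(16));
// prints: 2 d800 dc00 10000
```

The engine only ever hands back the individual 16-bit units; getting to U+10000 is left to the script or to whatever renders the string.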

-Shawn