Full Unicode strings strawman

Allen Wirfs-Brock allen at wirfs-brock.com
Mon May 16 16:13:34 PDT 2011


On May 16, 2011, at 3:22 PM, Shawn Steele wrote:

> The problem is that “\UD800\UDC00” === “\U+010000”.  And if the internal representation is UTF-32, then they’d have to continue to be the same.  And it’s really hard for them to have the same length if one’s 2 code points and the other’s 1 code point.

Not in my proposal!  "\ud800\udc00" === "\u+010000" is false in my proposal.  One has length 2 and the other has length 1.  You are confusing the logical interpretation of a UTF-16 encoded character sequence with the actual character encoding.  To get an equality in the context of my proposal you would have to say something like:

   UTF16Decode("\ud800\udc00") === "\u+010000"
or
   "\ud800\udc00" === UTF16Encode("\u+010000")

Allen
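
The distinction above, between a UTF-16 encoded sequence and the code point it stands for, can be sketched in code. This is only an illustration under stated assumptions: arrays of numbers stand in for strings whose elements can hold any code point value, and utf16Decode, utf16Encode, and sameElements are hypothetical helpers modelled on the names used in the message, not existing or proposed APIs.

```javascript
// Assumption: an array of numbers models a string whose elements may hold
// any value from 0 to 0x10FFFF. All helper names here are hypothetical.

// Combine each well-formed surrogate pair into one supplementary code point.
function utf16Decode(units) {
  const out = [];
  for (let i = 0; i < units.length; i++) {
    const hi = units[i];
    const lo = units[i + 1];
    const isPair =
      hi >= 0xD800 && hi <= 0xDBFF && lo >= 0xDC00 && lo <= 0xDFFF;
    if (isPair) {
      out.push(0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00));
      i++; // the low surrogate has been consumed
    } else {
      out.push(hi); // BMP value or lone surrogate, passed through unchanged
    }
  }
  return out;
}

// Split each supplementary code point into a surrogate pair.
function utf16Encode(points) {
  const out = [];
  for (const cp of points) {
    if (cp > 0xFFFF) {
      const v = cp - 0x10000;
      out.push(0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF));
    } else {
      out.push(cp);
    }
  }
  return out;
}

// Element-by-element comparison, standing in for === on the modelled strings.
function sameElements(a, b) {
  return a.length === b.length && a.every((x, i) => x === b[i]);
}

const pair = [0xD800, 0xDC00]; // models "\ud800\udc00", length 2
const single = [0x10000];      // models "\u+010000", length 1

console.log(sameElements(pair, single));              // false
console.log(sameElements(utf16Decode(pair), single)); // true
console.log(sameElements(pair, utf16Encode(single))); // true
```

In this model the two sequences differ in both length and content, and they only compare equal once one side has been explicitly decoded or encoded.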