Full Unicode strings strawman

Mike Samuel mikesamuel at gmail.com
Mon May 16 18:51:18 PDT 2011


2011/5/16 Allen Wirfs-Brock <allen at wirfs-brock.com>:
> If the string is written as "\ud800\udc00\u0061" the 'a' will be at offset
> 2, even in the new proposal.  It would only be at offset 1 if it was written
> as "\u+010000\u+000061"  (using the literal notation from the proposal).

Under this scheme,

     eval('  "\\uD834\\uDD1E"  ')  !== JSON.parse('  "\\uD834\\uDD1E"  ')
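
A rough sketch of why the two would diverge. The first half runs on any
current engine, where both forms agree. The second half is only a model:
evalModel and jsonModel are hypothetical arrays of element values standing in
for strings under the strawman, on the assumption that each \uXXXX escape in
source text yields its own element while JSON.parse follows the RFC 4627
reading quoted below.

```javascript
// Today's semantics: strings are sequences of 16-bit code units, so both
// parsers produce the same two-unit string.
const viaEval = eval('  "\\uD834\\uDD1E"  ');
const viaJson = JSON.parse('  "\\uD834\\uDD1E"  ');
const sameToday = viaEval === viaJson; // true; both have length 2

// Hypothetical model of the strawman (NOT a real API): a string is an array
// of element values, each of which may hold any code point.
const evalModel = [0xD834, 0xDD1E]; // assumed: each \uXXXX escape is one element
const jsonModel = [0x1D11E];        // assumed: JSON decodes the pair as one character

// Element-wise comparison of two modelled strings.
const sameSequence = (a, b) =>
  a.length === b.length && a.every((v, i) => v === b[i]);

const sameUnderProposal = sameSequence(evalModel, jsonModel); // false
```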

From RFC 4627:
"""
   To escape an extended character that is not in the Basic Multilingual
   Plane, the character is represented as a twelve-character sequence,
   encoding the UTF-16 surrogate pair.  So, for example, a string
   containing only the G clef character (U+1D11E) may be represented as
   "\uD834\uDD1E".
"""
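
For reference, the twelve-character form the RFC describes can be derived
with the standard UTF-16 surrogate arithmetic. This is only an illustrative
sketch; toJsonEscape and hex4 are names made up for this example and do not
come from the RFC or the proposal.

```javascript
// Format a 16-bit value as four uppercase hex digits.
function hex4(n) {
  return n.toString(16).toUpperCase().padStart(4, "0");
}

// Return the JSON \u escape text for a single Unicode code point.
function toJsonEscape(codePoint) {
  if (codePoint <= 0xFFFF) {
    // Basic Multilingual Plane: one six-character escape.
    return "\\u" + hex4(codePoint);
  }
  // Supplementary plane: split the 20-bit offset into a surrogate pair.
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);   // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return "\\u" + hex4(high) + "\\u" + hex4(low);
}

const gClef = toJsonEscape(0x1D11E); // the text \uD834\uDD1E, twelve characters
```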

