Re: Question about the “full Unicode in strings” strawman

John Tamplin jat at google.com
Wed Jan 25 09:54:13 PST 2012


On Wed, Jan 25, 2012 at 12:46 PM, Allen Wirfs-Brock
<allen at wirfs-brock.com> wrote:

> Arbitrary 16-bit values can be placed in a String using either
> String.fromCharCode (15.5.3.2) or the \uxxxx notation in string literals.
>  Neither of these enforce a requirement that individual String elements are
> valid Unicode code units.
>
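
[A small illustration of the quoted point, assuming a spec-conforming
engine: both routes yield a one-element string holding a lone high
surrogate, which is not valid Unicode text on its own. The variable names
are made up for this example.]

```javascript
// Build the same lone high surrogate (0xD800) two ways.
var viaFromCharCode = String.fromCharCode(0xD800);
var viaEscape = '\uD800';

// Neither route rejects the value; the string just stores the code unit.
var sameValue = viaFromCharCode === viaEscape;
var elementCount = viaFromCharCode.length;
var codeUnit = viaFromCharCode.charCodeAt(0);
```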

In practice you can't reliably store arbitrary 16-bit values in strings,
because some browsers corrupt them.  Specifically, combining marks and
unpaired surrogates are problematic, and some invalid code points get
replaced with another character.  Even if the content is only text, you
can't rely on the strings not being mangled -- GWT RPC quotes different
ranges of characters on different browsers.

http://code.google.com/p/google-web-toolkit/source/browse/trunk/user/src/com/google/gwt/user/client/rpc/impl/ClientSerializationStreamWriter.java?spec=svn10146&r=10146#86

(The Android bug mentioned there was fixed long ago, but I haven't done any
research into how many of the broken browsers are still in use, so I don't
know whether it is safe to remove the quoting.)
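
To give a rough idea of what that kind of quoting looks like, here is a
minimal sketch. It is not the GWT code linked above: the function name
escapeRiskyCodeUnits is hypothetical, and the single combining-mark range
(U+0300..U+036F) is an assumption chosen for illustration, whereas the real
code uses per-browser ranges. It rewrites unpaired surrogates and those
combining marks as literal \uXXXX text so the payload is plain, safe
characters on the wire.

```javascript
// Hypothetical helper, NOT GWT's actual implementation.
// Replaces code units that some browsers may mangle with a literal
// backslash-u escape, leaving well-formed surrogate pairs untouched.
function escapeRiskyCodeUnits(s) {
  var out = '';
  for (var i = 0; i < s.length; i++) {
    var c = s.charCodeAt(i);
    var risky = false;
    if (c >= 0xD800 && c <= 0xDBFF) {
      // High surrogate: safe only when a low surrogate follows it.
      var next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next >= 0xDC00 && next <= 0xDFFF) {
        out += s.charAt(i) + s.charAt(i + 1);
        i++; // skip the low half of the pair
        continue;
      }
      risky = true;
    } else if (c >= 0xDC00 && c <= 0xDFFF) {
      // Low surrogate with no preceding high surrogate.
      risky = true;
    } else if (c >= 0x0300 && c <= 0x036F) {
      // Combining marks; this range is an assumption for the example.
      risky = true;
    }
    if (risky) {
      var hex = ('0000' + c.toString(16).toUpperCase()).slice(-4);
      out += '\\u' + hex;
    } else {
      out += s.charAt(i);
    }
  }
  return out;
}
```

The receiving side would have to reverse the escaping (and the sender would
also need to escape literal backslashes) for this to round-trip; both are
left out to keep the sketch short.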

-- 
John A. Tamplin
Software Engineer (GWT), Google