Flexible String Representation - full Unicode for ES6?

Erik Arvidsson erik.arvidsson at gmail.com
Fri Dec 21 21:09:47 PST 2012


On Fri, Dec 21, 2012 at 6:45 PM, Chris Angelico <rosuav at gmail.com> wrote:

> There is an alternative. Python (as of version 3.3) has implemented a
> new Flexible String Representation, aka PEP-393; the same has existed
> in Pike for some time. A string is stored in memory with a fixed
> number of bytes per character, based on the highest codepoint in that
> string - if there are any non-BMP characters, 4 bytes; if any
> U+0100-U+FFFF, 2 bytes; otherwise 1 byte. This depends on strings
> being immutable (otherwise there'd be an annoying string-copy
> operation when a too-large character gets put in), which is true of
> ECMAScript. Effectively, all strings are stored in UCS-4/UTF-32, but
> with the leading 0 bytes elided when they're not needed.
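
For readers who want the quoted rule spelled out, here is a minimal sketch in Python. It only models the width selection described above; it is not the actual PEP 393 or Pike implementation, and the names bytes_per_char and payload_size are hypothetical helpers chosen for illustration.

```python
def bytes_per_char(s: str) -> int:
    """Return the fixed slot width (1, 2 or 4 bytes) chosen from the highest code point in s."""
    highest = max(map(ord, s), default=0)  # an empty string counts as all-narrow
    if highest > 0xFFFF:   # at least one non-BMP character
        return 4
    if highest > 0xFF:     # at least one character in U+0100..U+FFFF
        return 2
    return 1               # everything fits in U+0000..U+00FF


def payload_size(s: str) -> int:
    """Bytes needed for the character data alone, ignoring any object header."""
    return len(s) * bytes_per_char(s)


if __name__ == "__main__":
    for sample in ["hello", "he\u0142lo", "hi \U0001F600"]:
        print(repr(sample), bytes_per_char(sample), payload_size(sample))
```

Because the width is fixed per string, indexing stays O(1) whichever width is picked. The cost is that a single wide character widens every slot in that string.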

This is how most VMs already work.

I agree with you that it would be a better world if this were the case,
but I don't hear you suggesting how we might be able to change this
without breaking the web.
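
To make the compatibility concern concrete: ECMAScript string length and indexing count 16-bit code units, so a non-BMP character is observed by scripts as two units, whatever the VM stores internally. The Python sketch below compares the two ways of counting. utf16_length is a hypothetical helper, and it assumes the input contains no lone surrogates.

```python
def utf16_length(s: str) -> int:
    """Count the 16-bit code units s would occupy in UTF-16 (two bytes per unit)."""
    return len(s.encode("utf-16-le")) // 2


if __name__ == "__main__":
    s = "a\U0001F600"           # one BMP character plus one non-BMP character
    print(len(s))               # counted in code points
    print(utf16_length(s))      # counted in UTF-16 code units
```

Any script that relies on the code-unit count, for example when looping by index or slicing at fixed offsets, could see different results if strings started counting by code point.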

--
erik

