Re: Question about the “full Unicode in strings” strawman
Mark S. Miller
erights at google.com
Tue Jan 24 14:11:51 PST 2012
On Tue, Jan 24, 2012 at 12:33 PM, Allen Wirfs-Brock
<allen at wirfs-brock.com>wrote:
> Note that this proposal isn't currently under consideration for inclusion
> in ES.next, but the answer to your question is below.
> Just as the current definition of string specifies that a String is a
> sequence of 16-bit unsigned integer values, the proposal would specify that
> a String is a sequence of 32-bit unsigned integer values. In neither case
> is it required that the individual String elements be valid Unicode
> code points or code units. 8 hex digits are required to express the full
> range of unsigned 32-bit integers.
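[Editor's note: a quick sketch, not part of the quoted proposal, illustrating the current 16-bit model being described. Under ES5, a String element is just a 16-bit unsigned integer; it need not be a valid code point, and characters outside the BMP occupy two elements.]

```javascript
// An unpaired surrogate is a legal String element even though it is
// not a valid Unicode scalar value:
var lone = String.fromCharCode(0xD800);
console.log(lone.length); // 1 element

// U+1F600 (a non-BMP character) requires a surrogate pair under the
// 16-bit model, so it occupies two String elements:
var smiley = String.fromCharCode(0xD83D, 0xDE00);
console.log(smiley.length); // 2 elements, one character
```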
Why 32? Unicode has only 21 bits of significance. Since we don't expect
strings to be stored naively (taking up 4x the space that would otherwise
be allocated), I don't see the payoff from choosing the next power of 2.
The other choices I see are a) 21 bits, b) 53 bits, or c) unbounded.
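[Editor's note: my own illustration of the "21 bits" point above, not from the thread. The highest Unicode code point is U+10FFFF, which fits in 21 bits, so a 32-bit element width leaves 11 bits unused.]

```javascript
// The Unicode code point range is 0 .. 0x10FFFF:
var maxCodePoint = 0x10FFFF;

// 0x10FFFF < 2^21, so 21 bits are sufficient:
console.log(maxCodePoint < Math.pow(2, 21)); // true

// ...and 20 bits are not:
console.log(maxCodePoint < Math.pow(2, 20)); // false
```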