Re: Question about the “full Unicode in strings” strawman

Mark S. Miller erights at
Tue Jan 24 14:11:51 PST 2012

On Tue, Jan 24, 2012 at 12:33 PM, Allen Wirfs-Brock
<allen at> wrote:

> Note that this proposal isn't currently under consideration for inclusion
> in, but the answer to your question is below.

> Just as the current definition of String specifies that a String is a
> sequence of 16-bit unsigned integer values, the proposal would specify that
> a String is a sequence of 32-bit unsigned integer values.  In neither case
> is it required that the individual String elements be valid Unicode
> code points or code units. 8 hex digits are required to express the full
> range of unsigned 32-bit integers.

Why 32? Unicode has only 21 bits of significance. Since we don't expect
strings to be stored naively (taking up 4x the space that would otherwise
be allocated), I don't see the payoff from choosing the next power of 2.
The other choices I see are a) 21 bits, b) 53 bits, or c) unbounded.
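A quick sketch of the arithmetic behind the 21-vs-32-bit point (mine, not from the thread): the highest Unicode code point is U+10FFFF, which fits in 21 bits, while 32-bit values need the full 8 hex digits Allen mentions.

```javascript
// Highest Unicode code point, per the Unicode standard.
const MAX_CODE_POINT = 0x10FFFF;

// It occupies exactly 21 bits...
console.log(MAX_CODE_POINT.toString(2).length);   // 21

// ...so a 21-bit unsigned element covers every code point.
console.log(2 ** 21 - 1 >= MAX_CODE_POINT);       // true

// By contrast, the full unsigned 32-bit range needs 8 hex digits.
console.log((2 ** 32 - 1).toString(16));          // "ffffffff"
```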


More information about the es-discuss mailing list