Full Unicode based on UTF-16 proposal

Glenn Adams glenn at skynav.com
Mon Mar 26 15:10:44 PDT 2012


On Mon, Mar 26, 2012 at 2:02 PM, Gavin Barraclough <barraclough at apple.com> wrote:

> I really like the direction you're going in, but have one minor concern
> relating to regular expressions.
>
> In your proposal, you currently state:
>        "A code unit that is in the range 0xD800 to 0xDFFF, but is not part
> of a surrogate pair, is interpreted as a code point with the same value."
>

Just as a reminder, this would explicitly violate Unicode conformance
clause C1 [1] unless it can be guaranteed that such a code point will
never be interpreted as an abstract character:

C1 A process shall not interpret a high-surrogate code point or a
low-surrogate code point as an abstract character.

[1] http://www.unicode.org/versions/Unicode6.1.0/ch03.pdf

Given that such a guarantee is likely impractical, this presents a problem
for the language proposed above.
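
To make the concern concrete, here is a minimal sketch of the quoted rule, written in TypeScript. It is not taken from the proposal: the names decodeUtf16 and isSurrogateCodePoint are hypothetical, and the decoder only assumes the standard UTF-16 surrogate pair arithmetic plus the pass-through behavior the quoted sentence describes. It shows that an unpaired surrogate code unit ends up in the output as a code point in the range 0xD800 to 0xDFFF, which is exactly the kind of value that C1 says must not be treated as an abstract character.

```typescript
// Hypothetical helper: decode UTF-16 code units into code points using the
// quoted rule. A well-formed surrogate pair becomes one supplementary code
// point; an unpaired surrogate is passed through with the same value.
function decodeUtf16(units: number[]): number[] {
  const codePoints: number[] = [];
  for (let i = 0; i < units.length; i++) {
    const unit = units[i];
    const next = i + 1 < units.length ? units[i + 1] : -1;
    const isHigh = unit >= 0xd800 && unit <= 0xdbff;
    const nextIsLow = next >= 0xdc00 && next <= 0xdfff;
    if (isHigh && nextIsLow) {
      // Standard UTF-16 pair arithmetic.
      codePoints.push(0x10000 + ((unit - 0xd800) << 10) + (next - 0xdc00));
      i++; // consume the low surrogate as well
    } else {
      // Ordinary BMP code unit, or an unpaired surrogate kept as-is.
      codePoints.push(unit);
    }
  }
  return codePoints;
}

// Hypothetical helper: true for code points that C1 forbids interpreting
// as abstract characters.
function isSurrogateCodePoint(cp: number): boolean {
  return cp >= 0xd800 && cp <= 0xdfff;
}

// A lone high surrogate between "A" and "B" survives decoding unchanged.
const decoded = decodeUtf16([0x41, 0xd800, 0x42]);
const hasSurrogate = decoded.some(isSurrogateCodePoint);
```

Any later stage that consumes such output (case mapping, regular expression character classes, rendering) would have to check for these values before treating a code point as a character, which is the guarantee in question.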