JSON parser grammar

Waldemar Horwat waldemar at google.com
Fri Jun 12 15:15:56 PDT 2009


John Cowan wrote:
> Waldemar Horwat scripsit:
> 
>> I don't like the idea of having valid native ES strings that cannot
>> be serialized.  The sensible thing to do is to just escape surrogates,
>> whether they are paired or not.
> 
> Unfortunately, RFC 4627 says plainly in section 3:
> 
> 	JSON text SHALL be encoded in Unicode.
> 
> The cited version is Unicode 4.1.  As of Unicode 4.0, UTF-* documents
> are ill-formed if they contain unpaired surrogates; only the codepoints
> U+0000 to U+D7FF and U+E000 to U+10FFFF are encodable.  The fact that
> the ABNF seems to allow U+D800 to U+DFFF is irrelevant.

The JSON RFC contradicts itself: it also gives a JavaScript regular expression, meant for vetting text before it is passed to eval, and that expression accepts unpaired surrogates inside strings.
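
A minimal sketch of both points, under stated assumptions: the pattern in passesRfcCheck is the RFC 4627 regular expression reproduced from memory, so check it against the RFC before relying on it, and escapeSurrogates is a hypothetical helper name, not something from the RFC or the ES5 draft. The first call shows the RFC-style check accepting a raw unpaired surrogate; the helper shows the "just escape surrogates, paired or not" approach.

```javascript
// Safety check modeled on the regular expression in RFC 4627 (pattern
// reproduced from memory; treat it as an assumption). It strips string
// literals, then rejects any leftover character outside the JSON token set.
function passesRfcCheck(text) {
  var withoutStrings = text.replace(/"(\\.|[^"\\])*"/g, '');
  return !/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(withoutStrings);
}

// Hypothetical helper: escape every surrogate code unit, paired or not,
// as \uXXXX so that no raw surrogate appears in the serialized text.
function escapeSurrogates(str) {
  return str.replace(/[\uD800-\uDFFF]/g, function (unit) {
    return '\\u' + unit.charCodeAt(0).toString(16).toUpperCase();
  });
}

var loneSurrogate = '\uD800';                // unpaired high surrogate
var rawJson = '["' + loneSurrogate + '"]';   // raw surrogate inside a JSON string
var escapedJson = '["' + escapeSurrogates(loneSurrogate) + '"]';

// The RFC-style check does not object to the raw unpaired surrogate.
console.log(passesRfcCheck(rawJson));        // true

// The escaped form is plain ASCII and parses back to the original string.
console.log(escapedJson);                    // ["\uD800"]
console.log(JSON.parse(escapedJson)[0] === loneSurrogate);  // true
```

Escaping only changes the serialized text; the parsed value is still the same native string, unpaired surrogate included.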

    Waldemar


More information about the es5-discuss mailing list