JSON parser grammar
david-sarah at jacaranda.org
Fri Jun 5 16:24:54 PDT 2009
John Cowan wrote:
> David-Sarah Hopwood scripsit:
>> var escapable =
>> Incidentally, the format-control characters \u06FF and \u200B and the
>> noncharacters \uFDD0..FDEF should probably be added to the above
>> set of escapable characters.
> Agree on U+200B and the non-characters. U+06FF is a letter,
> ARABIC LETTER HEH WITH INVERTED V; I assume you don't mean that one.
Typo. I meant U+06DD ARABIC END OF AYAH.
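A minimal sketch of how these additions might look as a character class,
assuming a regex-plus-replace approach to escaping. The names extraEscapable
and escapeExtra are hypothetical, and the class covers only the characters
named in this thread, not the full escapable set:

```javascript
// Hypothetical sketch: only the characters proposed in this thread
// (U+06DD, U+200B, and the noncharacters U+FDD0..U+FDEF).
var extraEscapable = /[\u06dd\u200b\ufdd0-\ufdef]/g;

// Replace each matched code unit with its four-digit \uXXXX escape.
function escapeExtra(s) {
  return s.replace(extraEscapable, function (c) {
    return '\\u' + ('0000' + c.charCodeAt(0).toString(16)).slice(-4);
  });
}
```

U+06FF, being a letter, is deliberately left out of the class and passes
through unchanged.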
>> (The surrogate codes should not be added, since JSON is not tied to UTF-16.)
> Presumably, ES5 strings with unpaired surrogates cannot be JSONized, since
> unpaired surrogates don't represent Unicode characters.
Right, but the main reason not to escape surrogate codes is that it
would be wrong for correctly paired surrogates, which *can* be JSONized.
This has the side-effect that an ES5 string with unpaired surrogates will
be encoded as JSON with unpaired surrogates -- garbage-in, garbage-out.
I'm not sure it's worth requiring an error here.
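If an implementation did want to reject such strings, the check could be
sketched roughly as follows. This is an illustration only, not something
proposed in this thread; the names unpairedSurrogate and
hasUnpairedSurrogate are hypothetical:

```javascript
// Hypothetical sketch: matches a high surrogate not followed by a low
// surrogate, or a low surrogate not preceded by a high surrogate.
// Without the "u" flag the regex works on UTF-16 code units.
var unpairedSurrogate =
  /[\ud800-\udbff](?![\udc00-\udfff])|(?:[^\ud800-\udbff]|^)[\udc00-\udfff]/;

function hasUnpairedSurrogate(s) {
  return unpairedSurrogate.test(s);
}
```

Correctly paired surrogates are not matched, so strings containing
supplementary characters would still pass.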
>> Note that \u0604 is unassigned, and probably doesn't need to be in this set.
Another character I wasn't sure about is U+202F NARROW NO-BREAK SPACE.
It is harmless to escape it, but it isn't in the Jacaranda (or Cajita, IIRC)
some JS implementations, and it isn't a format-control character.
David-Sarah Hopwood ⚥ http://davidsarah.livejournal.com