JSON parser grammar

John Cowan cowan at ccil.org
Fri Jun 5 17:09:53 PDT 2009


David-Sarah Hopwood scripsit:

> This has the side-effect that an ES5 string with unpaired surrogates will
> be encoded as JSON with unpaired surrogates -- garbage-in, garbage-out.
> I'm not sure it's worth requiring an error here.

The trouble is that, as we discussed earlier, ES strings aren't necessarily
Unicode (they can be an arbitrary sequence of uint16s), whereas JSON documents
*are* necessarily Unicode.  Since the JSON encoder has to process surrogates
anyhow, it might as well check for unpaired ones and barf.
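
To make the check concrete, here is a minimal sketch, not anything from the
ES5 draft.  It assumes the string is modelled as a plain sequence of 16-bit
code units; the names find_unpaired_surrogate and require_well_formed are
hypothetical, chosen only for illustration.

```python
from typing import Sequence


def find_unpaired_surrogate(units: Sequence[int]) -> int:
    """Return the index of the first unpaired surrogate, or -1 if none.

    `units` is assumed to be a sequence of 16-bit code units, which is
    how an ES string is modelled in this sketch.
    """
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # A high surrogate is only valid when a low surrogate follows.
            if i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2  # skip the well-formed pair
                continue
            return i
        if 0xDC00 <= u <= 0xDFFF:
            # A low surrogate with no preceding high surrogate.
            return i
        i += 1
    return -1


def require_well_formed(units: Sequence[int]) -> None:
    """Raise ValueError if `units` contains an unpaired surrogate."""
    idx = find_unpaired_surrogate(units)
    if idx != -1:
        raise ValueError(
            "unpaired surrogate 0x%04X at index %d" % (units[idx], idx)
        )
```

An encoder that already walks the code units to handle surrogate pairs could
fold this test into the same loop, which is the sense in which the check
comes almost for free.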

> Another character I wasn't sure about is U+202F NARROW NO-BREAK SPACE.
> It is harmless to escape it, but it isn't in the Jacaranda (or Cajita, IIRC)
> lists of known characters that are not accepted in JavaScript source by
> some JS implementations, and it isn't a format-control character.

Well, whatever.  Extending the list is always harmless.  Hopefully non-JS
JSON decoders have no such glitches, though it might be worthwhile trying
to find out.
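
As a rough illustration of why extending the list is harmless, the sketch
below writes listed characters as \uXXXX escapes and confirms that a
conforming decoder returns the original string.  The set ALWAYS_ESCAPE and
the function escape_listed are hypothetical; the set is not the list under
discussion in this thread, only U+202F is taken from it.

```python
import json

# Hypothetical escape list for illustration only. U+202F comes from the
# thread; U+2028 and U+2029 are added here as an assumption.
ALWAYS_ESCAPE = frozenset({0x2028, 0x2029, 0x202F})


def escape_listed(s: str, listed: frozenset = ALWAYS_ESCAPE) -> str:
    """Return a JSON string literal with listed code points as \\uXXXX.

    Assumes every listed code point is in the BMP and is not a character
    with syntactic meaning in a JSON string (quote or backslash).
    """
    # json.dumps handles quotes, backslashes, and control characters;
    # ensure_ascii=False leaves other characters unescaped.
    literal = json.dumps(s, ensure_ascii=False)
    return "".join(
        "\\u%04x" % ord(ch) if ord(ch) in listed else ch
        for ch in literal
    )
```

Decoding escape_listed("a\u202fb") with json.loads gives back the original
three-character string, so the escaped and unescaped forms are
interchangeable for any decoder that follows the JSON grammar.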

-- 
John Cowan        http://www.ccil.org/~cowan          cowan at ccil.org
Please leave your values                Check your assumptions.  In fact,
   at the front desk.                      check your assumptions at the door.
     --sign in Paris hotel                   --Cordelia Vorkosigan

