JSON parser grammar

Tyler Close tyler.close at gmail.com
Tue Jun 9 11:20:57 PDT 2009


Does it make sense to specify the character escaping rules as a
whitelist of Unicode general categories that don't need to be escaped,
leaving everything else to be escaped? That seems safer than a
blacklist of specific Unicode characters to escape.
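
To make the idea concrete, here is a rough Python sketch of a
whitelist-based escaper. It is an illustration only, not a proposed
list: the names SAFE_CATEGORIES, escape_json_string and round_trips are
made up for this example, and the particular categories whitelisted
are an assumption. Anything outside the whitelist (format-control,
noncharacters/unassigned, controls, line and paragraph separators,
surrogates, private use) is emitted as a \uXXXX escape.

```python
import json
import unicodedata

# Hypothetical whitelist of general categories that are emitted as-is.
# Everything else (Cc, Cf, Cs, Co, Cn, Zl, Zp) gets escaped.
SAFE_CATEGORIES = frozenset({
    "Lu", "Ll", "Lt", "Lm", "Lo",              # letters
    "Mn", "Mc", "Me",                          # marks
    "Nd", "Nl", "No",                          # numbers
    "Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po",  # punctuation
    "Sm", "Sc", "Sk", "So",                    # symbols
    "Zs",                                      # space separators
})


def escape_json_string(text: str) -> str:
    """Return text as a JSON string literal, escaping by whitelist."""
    out = ['"']
    for ch in text:
        if ch in ('"', "\\"):
            # Always escaped, whatever their general category.
            out.append("\\" + ch)
        elif unicodedata.category(ch) in SAFE_CATEGORIES:
            out.append(ch)
        else:
            cp = ord(ch)
            if cp > 0xFFFF:
                # Supplementary code point: escape as a surrogate pair.
                cp -= 0x10000
                out.append("\\u%04x\\u%04x"
                           % (0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)))
            else:
                out.append("\\u%04x" % cp)
    out.append('"')
    return "".join(out)


def round_trips(text: str) -> bool:
    """Check that a standard JSON parser recovers the original text."""
    return json.loads(escape_json_string(text)) == text
```

With this approach, U+06DD and U+200B (category Cf) and U+FDD0..FDEF
(category Cn) are escaped without being named individually, which is
the point of preferring a whitelist. The trade-off is that the result
depends on the version of the Unicode character database in use.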

--Tyler

On Mon, Jun 8, 2009 at 6:56 PM, David-Sarah Hopwood
<david-sarah at jacaranda.org> wrote:
> Douglas Crockford wrote:
>> David-Sarah Hopwood wrote:
>>> U+06DD and U+200B (format-control), and U+FDD0..FDEF (noncharacters)
>>> should be added to the list. The rationale for adding them is the same as for
>>> the other format-control characters and noncharacters.
>>
>> The rationale for the set that json2 escapes is that those characters are
>> mishandled by one or more browsers. That rationale does not hold for
>> your set.
>
> ES3 specified that all format-control characters should be stripped.
> Therefore, format-control characters cannot interoperably be passed to
> 'eval'. If some browsers failed to strip U+06DD and U+200B, then that
> was a bug.
>
> Noncharacters similarly cannot interoperably be passed to 'eval' because
> the Unicode standard specifies that they shall not be treated as characters.
> It is therefore noncompliant to Unicode to accept them in source code.
> (Arguably, this means that they should be explicitly prohibited in ES5.)
>
> As John Cowan pointed out, it is quite harmless for the set of escaped
> code units to be conservative; none of these are commonly used characters.
>
> --
> David-Sarah Hopwood  ⚥  http://davidsarah.livejournal.com
>
> _______________________________________________
> es5-discuss mailing list
> es5-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es5-discuss



-- 
"Waterken News: Capability security on the Web"
http://waterken.sourceforge.net/recent.html

