proposed relationships of Secure EcmaScript, ES3.1, and ES4.
brendan at mozilla.org
Thu Feb 21 00:23:12 PST 2008
On Feb 20, 2008, at 6:10 PM, Mike Samuel wrote:
> > JSON ⊂ ADsafe ⊂ Cajita ⊂ Caja ⊂ ES3 ⊂ ES4
> People who know Unicode are dangerous ;).
Yes, we need more of you ;-).
> There are three problems according to my reading of
> http://www.ietf.org/rfc/rfc4627.txt but only the first is directly
> related to syntax:
> (1) There are JSON programs that are not valid ES programs.
> The JSON program [ "\u2028" ] where the unicode escape is replaced
> with its literal equivalent is valid according to JSON since the
> set of characters that can appear in a string unescaped is
> unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
> but ES does not allow codepoint 0x2028 or 0x2029 to appear
> unescaped in a string since they are newline characters.
I wonder if JSON should not change on this point. Is there a use-case
for unescaped line/paragraph separators in strings?
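To make point (1) concrete, here is a minimal sketch of one workaround on the producer side: escape the two separators before the JSON text is embedded in ES source. The helper name escapeLineSeparators is hypothetical (it is not from this thread), and the sketch assumes a runtime with a built-in JSON object.

```javascript
// Hypothetical helper (not from the thread): escape U+2028 and U+2029 so that
// JSON text can be embedded in ES source without a raw line terminator
// appearing inside a string literal.
function escapeLineSeparators(jsonText) {
  return jsonText
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
}

// Build the JSON text [ "<U+2028>" ] with the separator unescaped,
// which RFC 4627 allows inside a string.
const ls = String.fromCharCode(0x2028);
const rawJson = '["' + ls + '"]';

// After escaping, the text contains the six-character escape sequence instead.
const safeJson = escapeLineSeparators(rawJson);

// Both forms parse as JSON to the same one-character string.
const sameValue = JSON.parse(rawJson)[0] === JSON.parse(safeJson)[0];
```

The escaped form is still valid JSON and parses to the same value, so a consumer that uses a real JSON parser sees no difference; only consumers that feed the text to an ES parser benefit.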
> (2) There are JSON programs that have the same text as ES programs
> but different meaning.
> ES262 says that all format control codepoints, such as 0x200C,
> should be stripped out of the program in a pre-lex phase. This is
> not consistently implemented:
> eval("'\u200c'.length") == 0 on SpiderMonkey, and 1 on most
> other interpreters
Not lately, meaning post-Firefox-2/JS1.7. Fresh js shell, same
results for Firefox 3 any beta:
js> eval("'\u200c'.length") == 0
See https://bugzilla.mozilla.org/show_bug.cgi?id=274152, where
SpiderMonkey yields to IE JScript's flouting of ECMA-262. IE set a
real-world web standard, and for the better according to people in
According to https://bugzilla.mozilla.org/show_bug.cgi?id=368516#c34,
IE does not report illegal character errors correctly, instead
treating misplaced BOMs as identifiers whose references result in
runtime ReferenceErrors (I don't know what it does with other format-
control characters that occur outside of strings and regexps).
See also the follow-on bug to tolerate mislocated BOMs,
https://bugzilla.mozilla.org/show_bug.cgi?id=368516. Ain't the copy/paste
> JSON does not strip these characters out, so they are treated as
As a bug fix to match other browsers, ES4 is specifying that
format-control characters shall not be stripped; to be a real-world
web standard, it must also specify tolerance for mislocated BOMs.
Postel's Law bites back!
So JSON and ES4 will agree on this one.
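A quick way to check whether a given engine agrees with JSON on point (2) is to compare the two paths directly. This is only a sketch: the function name formatControlReport is hypothetical, and it assumes a runtime with a built-in JSON object, which was not generally available when this thread was written.

```javascript
// Hypothetical probe (not from the thread): does the ES lexer keep a
// format-control character (U+200C, ZERO WIDTH NON-JOINER) inside a
// string literal, the way a JSON parser does?
function formatControlReport() {
  const zwnj = String.fromCharCode(0x200c);

  // ES path: evaluate source text containing the raw character in a literal.
  const viaEval = eval("'" + zwnj + "'.length");

  // JSON path: parse JSON text containing the same raw character.
  const viaJson = JSON.parse('"' + zwnj + '"').length;

  return { viaEval: viaEval, viaJson: viaJson, agree: viaEval === viaJson };
}

const report = formatControlReport();
```

An engine that strips format-control characters before lexing would report viaEval as 0; one that keeps them reports 1, matching the JSON path.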
> (3) There are JSON programs that can be parsed to ES but that
> cannot be serialized back to JSON without losing track of where
> info was lost.
> JSON does not put any limits on numbers, but ES does. ES will
> treat 1e1000 as Infinity. Since JSON does not have a value
> Infinity, it is unclear how to implement toJSON(fromJSON("[1e1000]")).
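The round trip in (3) can be sketched as follows, assuming a runtime with a built-in JSON object that behaves as later standardized (non-finite numbers serialize as null). The helper findOverflowedNumbers is hypothetical and only illustrates one way to notice where information was lost.

```javascript
// Hypothetical helper (not from the thread): list the keys whose numeric
// values overflowed to +/-Infinity while parsing JSON text.
function findOverflowedNumbers(jsonText) {
  const overflowed = [];
  JSON.parse(jsonText, function (key, value) {
    if (typeof value === "number" && !Number.isFinite(value)) {
      overflowed.push(key);
    }
    return value; // keep the parsed value unchanged
  });
  return overflowed;
}

// 1e1000 is a valid JSON number but exceeds the IEEE-754 double range.
const parsed = JSON.parse("[1e1000]");

// Serializing back loses the number: Infinity has no JSON representation.
const reserialized = JSON.stringify(parsed);

// Keys (array indices arrive as strings) where the overflow happened.
const lostAt = findOverflowedNumbers("[1e1000]");
```

Here parsed[0] is Infinity and reserialized is the text [null], so the original magnitude cannot be recovered from the output alone; the reviver-based check is one way to flag the loss at parse time.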
JSON's grammar is nice and simple; it facilitates exhaustive testing
(Rob Sayre used Koushik Sen's jCUTE to generate all-paths tests for a
BigInts or BigNums could help in the future, but the installed base
will not have them for a while and their literal syntax, without a
pragma, will have a suffix.
This kind of edge case is unlikely to be a problem in practice,
although such "overflow" conditions recur throughout the security
exploit literature. Could JSON stand to grow support for the IEEE-754