proposed relationships of Secure EcmaScript, ES3.1, and ES4.

Brendan Eich brendan at
Thu Feb 21 00:23:12 PST 2008

On Feb 20, 2008, at 6:10 PM, Mike Samuel wrote:

> >     JSON ⊂ ADsafe ⊂ Cajita ⊂ Caja ⊂ ES3 ⊂ ES4
> People who know Unicode are dangerous ;).

Yes, we need more of you ;-).

> There's three problems according to my reading of http:// 
> but only the first is directly related  
> to syntax:
> (1) There are JSON programs that are not valid ES programs.
> The JSON program [ "\u2028" ] where the unicode escape is replaced  
> with its literal equivalent is valid according to JSON since the  
> set of characters that can appear in a string unescaped is
> unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
> but ES does not allow codepoint 0x2028 or 0x2029 to appear  
> unescaped in a string since they are newline characters.

I wonder if JSON should not change on this point. Is there a use-case  
for unescaped line/paragraph separators in strings?
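
Point (1) can be sketched as follows. This is a minimal illustration, not anything from the JSON or ES specs. It assumes an engine that provides a JSON object (json2.js or a later built-in; ES3 itself has none). The helper name escapeLineSeparators is hypothetical. It rewrites raw U+2028 and U+2029 as \u escapes, so the text stays valid JSON and can also be read as ES source.

```javascript
// Hypothetical helper (not part of any spec or library): replace raw
// U+2028 / U+2029 with their \u escapes. The result is still valid JSON
// and no longer contains characters that ES3 treats as line terminators.
function escapeLineSeparators(jsonText) {
  return jsonText
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
}

// JSON text whose string contains a literal (unescaped) U+2028.
var raw = '["' + "\u2028" + '"]';

// Same JSON value, but with the separator written as a six-character escape.
var escaped = escapeLineSeparators(raw);

// Both texts parse to the same one-character string.
var sameValue = JSON.parse(raw)[0] === JSON.parse(escaped)[0];
console.log(escaped, sameValue);
```

Escaping on output sidesteps the mismatch without requiring either grammar to change.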

> (2) There are JSON programs that have the same text as ES programs  
> but different meaning.
> ES262 says that all format control codepoints, such as 0x200C,  
> should be stripped out of the program in a pre-lex phase.  This is  
> not consistently implemented:
>     eval("'\u200c'.length") == 0 on SpiderMonkey, and 1 on most  
> other interpreters

Not lately, meaning post-Firefox-2/JS1.7. Fresh js shell, same  
results for Firefox 3 any beta:

js> eval("'\u200c'.length") == 0
js> eval("'\u200c'.length")

See, where  
SpiderMonkey yields to IE JScript's flouting of ECMA-262. IE set a  
real-world web standard, and for the better according to people in  
certain locales.

According to,  
IE does not report illegal character errors correctly, instead  
treating misplaced BOMs as identifiers whose references result in  
runtime ReferenceErrors (I don't know what it does with other  
format-control characters that occur outside of strings and regexps).

See also the follow-on bug to tolerate mislocated BOMs, https:// Ain't the copy/paste  
Internet grand?

> JSON does not strip these characters out, so they are treated as  
> significant.

ES4 is specifying, as a bug fix to match other browsers, that  
format-control characters shall not be stripped; to be a real-world  
web standard, it must also specify tolerance for mislocated BOMs.  
Postel's Law bites back!

So JSON and ES4 will agree on this one.
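
A quick way to check that agreement in a given engine is sketched below. It assumes a JSON object is available (json2.js or a later built-in), and the variable names are illustrative only. The eval result depends on the engine: an implementation that strips format-control characters before lexing reports 0, one that keeps them reports 1.

```javascript
// U+200C (ZERO WIDTH NON-JOINER) is a format-control (Cf) character.
var zwnj = "\u200c";

// JSON side: the character is significant, so the parsed string has length 1.
var jsonLength = JSON.parse('"' + zwnj + '"').length;

// ES side: the raw character sits inside the source text handed to eval,
// so the result shows whether this engine strips it before lexing.
var evalLength = eval("'" + zwnj + "'.length");

// True when the engine treats the character the way JSON does.
var agree = (jsonLength === evalLength);
console.log(jsonLength, evalLength, agree);
```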

> (3) There are JSON programs that can be parsed to ES but that  
> cannot be serialized back to JSON without losing track of where  
> info was lost.
> JSON does not put any limits on numbers, but ES does.  ES will  
> treat 1e1000 as Infinity.  Since JSON does not have a value  
> Infinity, it is unclear how to implement toJSON(fromJSON("[1e1000]")).

JSON's grammar is nice and simple, which facilitates exhaustive testing  
(Rob Sayre used Koushik Sen's jCUTE to generate all-paths tests for a  
Java implementation).

BigInts or BigNums could help in the future, but the installed base  
will not have them for a while and their literal syntax, without a  
pragma, will have a suffix.

This kind of edge case is unlikely to be a problem in practice,  
although such "overflow" conditions recur throughout the security  
exploit literature. Could JSON stand to grow support for the IEEE-754  
non-finite values?
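
The round-trip problem in point (3) can be seen in a few lines. This sketch assumes the JSON object as later standardized in ES5 (json2.js behaves the same way), which serializes non-finite numbers as null. That serialization choice is not something this thread specifies.

```javascript
// "1e1000" is a legal JSON number, but it overflows an IEEE-754 double.
var parsed = JSON.parse("[1e1000]");

// The magnitude is lost on the way in: the value is now Infinity.
var overflowed = (parsed[0] === Infinity);

// JSON has no literal for Infinity, so the ES5-style serializer writes null,
// and nothing in the output records that an overflow happened.
var roundTripped = JSON.stringify(parsed);

console.log(overflowed, roundTripped);
```

A serializer that must not lose data silently could instead check isFinite on each number and raise an error before writing.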
