proposed relationships of Secure EcmaScript, ES3.1, and ES4.

Brendan Eich brendan at mozilla.org
Thu Feb 21 00:23:12 PST 2008


On Feb 20, 2008, at 6:10 PM, Mike Samuel wrote:

> >     JSON ⊂ ADsafe ⊂ Cajita ⊂ Caja ⊂ ES3 ⊂ ES4
>
> People who know Unicode are dangerous ;).

Yes, we need more of you ;-).

> There are three problems according to my reading of
> http://www.ietf.org/rfc/rfc4627.txt but only the first is directly
> related to syntax:
>
> (1) There are JSON programs that are not valid ES programs.
> The JSON program [ "\u2028" ] where the unicode escape is replaced  
> with its literal equivalent is valid according to JSON since the  
> set of characters that can appear in a string unescaped is
> unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
> but ES does not allow codepoint 0x2028 or 0x2029 to appear  
> unescaped in a string since they are newline characters.

I wonder if JSON should not change on this point. Is there a use-case  
for unescaped line/paragraph separators in strings?
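
To make point (1) concrete, here is a minimal sketch of how a producer
could sidestep the mismatch by escaping the two separators before JSON
text is pasted into ES source. The helper name escapeLineSeparators is
hypothetical, and a native JSON object (JSON.parse) is an assumption,
since ES3 itself does not provide one.

```javascript
// Hypothetical helper: make JSON text safe to embed in ES source by
// escaping U+2028 and U+2029, which ES3 treats as line terminators.
// In valid JSON these code points can occur only inside strings, so a
// global replace does not change the meaning of the text.
function escapeLineSeparators(jsonText) {
  return jsonText
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
}

// Assumes a native JSON object, which ES3 itself does not provide.
var raw = '["a\u2028b\u2029c"]';        // literal separators inside the string
var safe = escapeLineSeparators(raw);   // same JSON value, escapes only
var sameValue = JSON.parse(raw)[0] === JSON.parse(safe)[0];
```

The escaped text parses to the same value under JSON, and it no longer
contains a raw line terminator that an ES3 lexer would reject inside a
string literal.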

> (2) There are JSON programs that have the same text as ES programs  
> but different meaning.
> ES262 says that all format control codepoints, such as 0x200C,  
> should be stripped out of the program in a pre-lex phase.  This is  
> not consistently implemented:
>     eval("'\u200c'.length") == 0 on SpiderMonkey, and 1 on most  
> other interpreters

Not lately, meaning post-Firefox-2/JS1.7. From a fresh js shell (any
Firefox 3 beta gives the same results):

js> eval("'\u200c'.length") == 0
false
js> eval("'\u200c'.length")
1

See https://bugzilla.mozilla.org/show_bug.cgi?id=274152, where  
SpiderMonkey yields to IE JScript's flouting of ECMA-262. IE set a  
real-world web standard, and for the better according to people in  
certain locales.

According to https://bugzilla.mozilla.org/show_bug.cgi?id=368516#c34,
IE does not report illegal character errors correctly, instead
treating misplaced BOMs as identifiers whose references result in
runtime ReferenceErrors (I don't know what it does with other
format-control characters that occur outside of strings and regexps).

See also the follow-on bug to tolerate mislocated BOMs,
https://bugzilla.mozilla.org/show_bug.cgi?id=368516. Ain't the
copy/paste Internet grand?

> JSON does not strip these characters out, so they are treated as  
> significant.

ES4 is specifying, as a bug fix to match other browsers, that
format-control characters shall not be stripped; it must also, to be a
real-world web standard, specify tolerance for mislocated BOMs.
Postel's Law bites back!

So JSON and ES4 will agree on this one.
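
A quick check of that agreement, as a sketch that assumes an engine with
a native JSON object and the non-stripping behavior described above (the
variable names are only for illustration):

```javascript
// U+200C (ZERO WIDTH NON-JOINER) is a format-control character.
// The "\u200c" escape is resolved in the outer string literal, so both
// eval and JSON.parse receive the raw character between quotes.
var viaEval = eval("'\u200c'.length");             // 1 when the engine does not strip it
var viaJSON = JSON.parse('["\u200c"]')[0].length;  // JSON keeps the character
var agree = viaEval === viaJSON;
```

An engine that still stripped format-control characters in a pre-lex
phase would give 0 for the eval case and the two results would differ.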

> (3) There are JSON programs that can be parsed to ES but that  
> cannot be serialized back to JSON without losing track of where  
> info was lost.
> JSON does not put any limits on numbers, but ES does.  ES will  
> treat 1e1000 as Infinity.  Since JSON does not have a value  
> Infinity, it is unclear how to implement toJSON(fromJSON("[1e1000]")).

JSON's grammar is nice and simple; it facilitates exhaustive testing
(Rob Sayre used Koushik Sen's jCUTE to generate all-paths tests for a
Java implementation).

BigInts or BigNums could help in the future, but the installed base  
will not have them for a while and their literal syntax, without a  
pragma, will have a suffix.

This kind of edge case is unlikely to be a problem in practice,  
although such "overflow" conditions recur throughout the security  
exploit literature. Could JSON stand to grow support for the IEEE-754  
non-finite values?
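
The round trip in point (3) can be sketched as follows, assuming a
native JSON object whose serializer writes null for non-finite numbers
(toJSON and fromJSON in the quoted text are placeholders, so
JSON.stringify and JSON.parse stand in for them here):

```javascript
var parsed = JSON.parse("[1e1000]");   // the number overflows to Infinity
var isInf = parsed[0] === Infinity;
var back = JSON.stringify(parsed);     // Infinity has no JSON spelling
// back is "[null]", so the original magnitude is lost without any error.
```

Under that assumption nothing signals where the information was dropped,
which is the concern raised in the quoted text.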

/be