AST in JSON format

Oliver Hunt oliver at apple.com
Tue Dec 8 22:02:34 PST 2009


On Dec 8, 2009, at 8:51 PM, Mark S. Miller wrote:

> On Tue, Dec 8, 2009 at 7:59 PM, Oliver Hunt <oliver at apple.com> wrote:
>> Providing an AST doesn't get you anything substantial here as
>> the hard part of all this is validation, not parsing.
> 
> Given ES5 as a starting point,
> 1) validation for many interesting purposes, especially security, is
> no longer hard,
> 2) the subset restrictions need no longer be severe, and
> 3) the issue isn't what's hard but what's slow and large. Lexing and
> parsing JS accurately is slow. Accurate JS lexers and parsers are
> large. Even if JS is now fast enough to write a parser competitive
> with the one built into the browsers, this parser would itself need to
> be downloaded per frame. Even if all downloads of the parser code hit
> on the browser's cache, the parser would still need to be parsed per
> frame that needed it (unless browsers cache a frame-independent parsed
> representation of JS scripts).
> 
> I am currently working on just such a validator and safe execution
> environment -- assuming ES5 and a built in parser->AST. Going out on a
> limb, I expect it to have a small download, a simple translation, no
> appreciable code expansion, and no appreciable runtime overhead. Once
> I've posted it, we can reexamine my claims above against it.
> 
> 
>> Realistically you would want the browser to be responsible for
>> validation because it is able to do much more interesting forms of
>> validation,
> 
> What are these more interesting forms of validation?

I think I basically took Breton's concept to be (effectively) whitelisting language constructs (to me a reasonable interpretation of his statements) -- the logical step for an engine to take would be an object capability model of some kind.
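
As a rough illustration of what whitelisting language constructs could look like against a JSON-encoded AST, here is a minimal sketch. The node shape (a `type` string, with child nodes held in ordinary properties) and the names `ALLOWED_TYPES` and `validateAst` are assumptions made for this example; they are not a format anyone in this thread has proposed.

```javascript
// Hypothetical whitelist of node types; a real subset would be chosen with care.
const ALLOWED_TYPES = new Set([
  "Program", "ExpressionStatement", "BinaryExpression", "Literal", "Identifier",
]);

// Walks the tree and returns every node type that is not whitelisted
// (an empty array means the tree passed).
function validateAst(node, rejected = []) {
  if (Array.isArray(node)) {
    for (const child of node) validateAst(child, rejected);
  } else if (node !== null && typeof node === "object") {
    if (typeof node.type === "string" && !ALLOWED_TYPES.has(node.type)) {
      rejected.push(node.type);
    }
    for (const key of Object.keys(node)) validateAst(node[key], rejected);
  }
  return rejected;
}

// Roughly `1 + x; with (o) {}` in the assumed node shape.
const ast = JSON.parse(`{
  "type": "Program",
  "body": [
    { "type": "ExpressionStatement",
      "expression": { "type": "BinaryExpression", "operator": "+",
        "left": { "type": "Literal", "value": 1 },
        "right": { "type": "Identifier", "name": "x" } } },
    { "type": "WithStatement",
      "object": { "type": "Identifier", "name": "o" },
      "body": [] }
  ]
}`);

console.log(validateAst(ast)); // [ 'WithStatement' ]
```

A check like this only says which constructs appear; it says nothing about what the permitted constructs do at runtime, which is where a capability model would have to take over.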

>> there are in fact already multiple concepts being investigated by
>> the whatwg to solve just this problem, without requiring js
>> subsetting.
> 
> What are these other concepts? I am aware of one -- the sandboxed
> iframe. Compared to JS subsetters, this is flawed in many ways. But
> the more important contrast is that whatwg is investigating security
> frameworks to be centrally designed, by them/us, and then implemented
> and deployed by the browser makers. When they screw up, the rest of us
> downstream have no recourse. By contrast, multiple competing projects
> are trying various approaches to JS subsetting -- Caja, FBJS, MS
> WebSandbox, ADsafe, Jacaranda. By one measure
> <http://www.eros-os.org/pipermail/cap-talk/2009-October/013567.html>,
> these already dominate the Same Origin Policy as the primary isolation
> mechanism on the web.
> 
> This victory happened despite the insane difficulty of doing this on
> an ES3 base. Again, starting from ES5, this becomes vastly easier.

The same origin policy is intended to prevent content from one domain from accessing another; it's not meant to prevent the embedded site from doing lamentable things inside its own context. Same origin is not a concept that it makes sense to involve in an ES spec. At a very basic level you could treat it as an object capabilities model that has the rather simple rule:
function actionAllowed(executionContext) {
    // ownOrigin: the origin that owns the object being accessed (assumed to be in scope)
    return executionContext.origin === ownOrigin;
}

The issue I have with the various object capability models that are layered on top of ES is that (as far as I can tell -- correct me if I'm wrong) they attempt to restrict the language to make validation possible, whereas the many and varied sandbox-themed concepts the whatwg is considering/has considered work on the assumption that ES should not be restricted at all, and failure should only occur when you attempt to do something illegal (cross-origin access being the prime example). The sandboxing rules are simply an extension to this concept, allowing content from the same origin to be restricted as if it were from a separate origin.
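
To make that last point concrete, here is a minimal sketch of the origin rule extended with a sandbox flag. Everything in it is hypothetical: `makeContext`, the `opaque-N` origin strings, and the two-argument form of `actionAllowed` are assumptions for illustration, not how any browser actually implements sandboxing.

```javascript
// Hypothetical model: each execution context carries an origin.
let nextOpaqueId = 0;

function makeContext(origin, sandboxed = false) {
  // A sandboxed context is given a unique opaque origin that matches
  // nothing else, even content served from the same domain.
  return { origin: sandboxed ? `opaque-${nextOpaqueId++}` : origin };
}

// The same simple rule as before, with the owner's origin passed explicitly.
function actionAllowed(executionContext, ownOrigin) {
  return executionContext.origin === ownOrigin;
}

const page = makeContext("https://example.com");
const widget = makeContext("https://example.com", true); // same domain, sandboxed
const other = makeContext("https://other.example");

console.log(actionAllowed(page, page.origin));   // true
console.log(actionAllowed(widget, page.origin)); // false
console.log(actionAllowed(other, page.origin));  // false
```

The check itself never changes; only the origin assigned to the sandboxed content does, so no language construct has to be forbidden.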

>>  Especially given in the mashup scenario you don't just have JS,
>> you have the DOM and html -- assuming you can completely
>> separate html from the js, you're still fairly limited as your
>> validation either prohibits any access to the dom or the validation
>> can be circumvented.
> 
> All the JS subsetters mentioned above mediate access to the dom but do
> not prevent it. The virtualized dom provided by Caja is a sufficient
> emulation of the browser DOM that the YUI library from Yahoo! now
> operates fully cajoled (translated by Caja and accessing the dom only
> via Caja's mediation). Please show how the protections provided by
> Caja can be circumvented.

Do you have a site set up with Caja that I can try out? E.g. something where the sole purpose is to allow someone to throw random content at it and see what sticks?

The advantage that in-engine validation has over models like Caja and Jacaranda is that there do not need to be any language restrictions (the security constraints can be enforced by the engine at access time, etc.).
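
A rough sketch of what enforcement at access time means, as opposed to rejecting code up front: the code below is not validated or rewritten at all, and only fails at the moment it touches something it should not. It uses `Proxy` and `Reflect`, which are later additions to the language and stand in here for checks an engine would make internally; `guard`, `getCallerOrigin`, and `currentOrigin` are hypothetical names for this example.

```javascript
// Hypothetical helper: wraps an object so every property read is checked
// against the caller's origin at the time of the access.
function guard(target, ownerOrigin, getCallerOrigin) {
  return new Proxy(target, {
    get(obj, prop, receiver) {
      if (getCallerOrigin() !== ownerOrigin) {
        throw new Error(`cross-origin access to "${String(prop)}" denied`);
      }
      return Reflect.get(obj, prop, receiver);
    },
  });
}

// Stand-in for the engine's notion of "which origin is running right now".
let currentOrigin = "https://example.com";
const doc = guard({ title: "Inbox" }, "https://example.com", () => currentOrigin);

console.log(doc.title); // Inbox

currentOrigin = "https://ads.example";
try {
  doc.title;
} catch (e) {
  console.log(e.message); // cross-origin access to "title" denied
}
```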

That said, if in the future a system such as Caja or Jacaranda (or some other yet-to-be-developed system) turns out to be effective and popular, I'm sure some effort will be made to standardise it.

--Oliver


