ES4 Security

Steven Mascaro subs at voracity.org
Tue May 20 00:34:20 PDT 2008


On Tue, May 20, 2008 at 8:21 AM, Brendan Eich <brendan at mozilla.org> wrote:
> Security is an "issue" (not in the Oprah sense), all right. A bit more
> precisely, it is a set of end-to-end properties that need solid enforcement
> mechanisms at many layers of abstraction. Security is however not a unitary
> good or "product".
>
> So we should talk about specific, well-defined properties that can be
> verified somehow (whether statically or not).

I'm happy to do that, but my argument begins with and is driven by vague
instinct. In a sense, I have a Lakatosian-style hard-core theory (that
I don't know how to put into words), and I'm trying to work out if I
should abandon it or simply make changes to the soft-core of specific,
well-defined properties around it.

> No MITM -- think mashups, user-generated content hosting (myspace, lj,
> etc.). Firefox and other Mozilla-based apps are indeed the ur-mashup (code
> and markup loaded from various origins mixing in a single JS runtime).

Domain-level controls will obviously be ineffective because they all
live on the same domain (and in some cases, the very same page). I
suggested the use of sandboxes here:

https://bugzilla.mozilla.org/show_bug.cgi?id=178993#c55

(which requires random boundaries as Ken mentions in the bug). I'm
roughly thinking that sandboxes == domains from a conceptual point of
view. Without those kinds of controls, though, Javascript will struggle
to plug all the possible problems with just RO/DD/IH. What's more,
it's a lot easier for a developer to understand the concept "cordon
off the risk" than it is for her to understand "make some things
read-only, make other things undeletable, and hide the info you don't want
showing, but don't do that with everything because you'll want to
modify some of your own things under certain conditions that probably
won't be obvious until you need to do it".
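
To make the three controls concrete, here is a minimal sketch of RO
(read-only), DD (don't-delete) and IH (information hiding). ES4's own
syntax is not shown; this uses Object.defineProperty and Reflect from
later ECMAScript editions as a stand-in, and makeProfile and its fields
are hypothetical names.

```javascript
// Hypothetical example: makeProfile, "owner" and secretNote are illustrative only.
function makeProfile(owner, secretNote) {
  const profile = {};

  // RO + DD: writable:false blocks assignment, configurable:false blocks delete.
  Object.defineProperty(profile, "owner", {
    value: owner,
    writable: false,
    configurable: false,
    enumerable: true,
  });

  // IH: secretNote lives only in this closure; it is never a property.
  profile.noteLength = () => secretNote.length;

  return profile;
}

const profile = makeProfile("john", "do not share");

// Reflect.set / Reflect.deleteProperty report failure by returning false
// instead of throwing, so the outcome is the same in strict and sloppy mode.
const wroteOwner = Reflect.set(profile, "owner", "jane");
const deletedOwner = Reflect.deleteProperty(profile, "owner");
```

Note that noteLength itself is still writable and deletable here, which
is the "don't do that with everything" bookkeeping described above.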

> For example, suppose that it were possible to retrieve the text of any
> <script src="..."></script> element using '.textContent' from
> javascript, regardless of origin. You'll agree that this is
> unthinkable today. But I assume you'll also agree that there is no
> security problem in doing this if no cookies (or other private data)
> are sent in the initial request to retrieve the script page?
>
> Absolutely not. Why do you think that would be safe? Shaver's followup shows
> a "Princeton attack" against inside-the-firewall known-intranet-address
> data. That's just one problem. You can blame the firewall or doc-hosting
> server admin, but it's a real-world problem.

I've replied to Shaver's example. If you consider the response
inadequate, I'd be happy to read your rebuttal there.

> In the Netscape 3 data tainting model, with a much simpler DOM, properties
> of equivalent asset value would be tainted with the page's origin. They
> could freely flow into scripts loaded from other origins, but the results
> (*any* results) could not leak back (feasibly, or at least in practically
> cheap-enough flows) to the other origins' servers. But as I've noted in the
> last message and in recent talks about this experiment, the purely dynamic
> information flow system cannot untaint the virtual machine pc, so taint
> mixtures accumulate badly. The system is too pessimistic, for want of static
> analysis -- a hybrid approach I'm experimenting with now.

Data tainting sounds not only extremely complex to implement, but also
extremely difficult for most developers to understand. (Don't they
have to have an understanding of
where the flows go, too?) The new postMessage method sounds like a
great, simple way to punch small holes in cross-site firewalls.
(That's not to say data-tainting shouldn't be looked into.)
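
A minimal sketch of the "small hole" idea, assuming the postMessage API
as it was later standardized (event.origin, event.data). The origin
https://widgets.example.com and the function name acceptMessage are
hypothetical.

```javascript
// Hypothetical trusted origin; a real page would list the origins it expects.
const TRUSTED_ORIGIN = "https://widgets.example.com";

// Returns the message text if it came from the trusted origin, else null.
function acceptMessage(event) {
  if (event.origin !== TRUSTED_ORIGIN) return null; // wrong sender: ignore
  if (typeof event.data !== "string") return null;  // accept plain strings only
  return event.data;
}

// Browser-only wiring; skipped when no window object exists.
if (typeof window !== "undefined") {
  window.addEventListener("message", (event) => {
    const text = acceptMessage(event);
    if (text !== null) console.log("accepted:", text);
  });
}
```

Everything that is not an expected string from the expected origin is
dropped, so the hole stays as small as the check makes it.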

> In the same-origin model under which we all suffer still, there's no taint
> or trust label associated exactly with .textContent's value, so to protect
> this asset, ignoring cookie leaks, we would be relying (as we do for all
> such DOM assets today) on the access-control checks done between trust
> containers (window objects, "frames" in general). This is a bad bet, because
> browsers constantly patch bugs in their leaky inter-frame reference
> monitors. This is the bigger problem I alluded to above ("That's just one
> problem").

I'm not sure what the specific problem is that you're describing, but
I'll try to reason it through. Consider 2 windows (frames, whatever),
A and B. Javascript restrictions will be of little use under the
following conditions:
1) If there is a browser flaw such that Window A contains internal
references to properties in Window B.
2) If there is a browser cross-domain permissions flaw that allows
Window A to get/set some properties in Window B.
3) If there is a mashup-style same domain attack --- for example,
myspace.com/john includes an iframe/popup containing myspace.com/jane.
(Sandboxes prevent Jane attacks on John. They don't prevent John
attacks on Jane --- though neither will Javascript controls; I imagine
form.submit() will always be possible.)

But I feel you have something else in mind.

> Right, and cookies are a hot point of contention in the W3C Web API working
> group, and among Mozillans, right now. We pulled XS-XHR based on the draft
> recommendation from Firefox 3 in order to get it right, and avoid spreading
> a de-facto standard before the de-jure one was finished (this is a chicken
> and egg situation; some on the working group would want us to set precedent
> -- I think that's wrong, but I understand it, and so [evidently] does
> Microsoft).

Mozilla's course of action seems reasonable.

> This is getting far afield from es4-discuss, however. Suggest we take it to
> a w3c public list, if there is one.

Unfortunately, this topic cuts across too many subjects. (Did I just
make an AOP joke?)

> If there are non-cookie examples of XSS, please point me to them
> (setting aside equivalents like DOM storage, but also the :visited
> example from below).
>
> Yikes. Where to start?
> https://bugzilla.mozilla.org/buglist.cgi?query_format=specific&order=relevance+desc&bug_status=__open__&content=XSS

Thanks, I will take a look through and see which category I can force
each bug into (either "not solvable by Javascript controls" or
"solvable through other means") ;)

> Once this problem is solved, ES4 *does* *not* need RO/DD/IH for
> security. (IH=information hiding.)
>
> Now you've changed the subject to Integrity.
>
> No, I am talking about Integrity, Confidentiality and more (like
> Authenticity).
>
> Sorry, but you wrote "Once [a Confidentiality problem is solved], ES4 *does*
> *not* need [Integrity features]". I disagreed -- even excluding the
> programming-in-the-large reasons for those Integrity features -- and gave my
> reasons. Cookies are not the only or even the main pain point.

I said the solution to the problem was to "not automatically transmit
private information". Since I don't think in terms of Hexads, by that phrase I
meant data, credentials and control signals (and anything else that
can be represented by bits). I appreciate why this would have been
confusing, but hopefully my intention is clearer now.

> Who, in that case, are you developing ES4 for? Theoreticians?
>
> This is a cheap shot. C'mon.

Yes, I'll do anything for a laugh :) I was merely suggesting that
sometimes we get so excited by the details of what we're doing, we
forget what's important to those we're trying to help.

> ...snip discussion on typing, duck typing, etc....

I agree we should take this up in another thread (which is why I have
tried to restrict the scope of my comments).

But I will just quickly say that I agree with almost everything you've
said. The only exception is that I would want unwanted mutations
and cheap name hiding to be suppressible (not impossible). Strict mode
should prevent compilation/running. 'Lax' mode should warn you (even
if you're accessing strict mode code) and allow you to suppress the
warning by using something like PHP's @ operator. (And I've used that
operator quite a few times...)

But this won't fly if the reason for preventing mutations is security
(or whatever concept you prefer).

> After reading up on implicit flows, I still have no idea what
> 'tainting the pc' means. In any event,
>
> Consider this code:
>   var A = 0, C = 0;
>   var B = secret;
>   if (B) {
>     A = 1;
>   } else {
>     C = 1;
>   }
>   submit(C);
> In unsound dynamic-only data tainting, if B (secret) is true, then A will be
> tainted with the secret's label, else C will be. But this misses the
> "implicit flow" (Dorothy Denning, SOSP 1976 -- an OS journal, that's where
> the action was then!) through the else or then clause not taken and into the
> other (C or A) variable.
> So in sound dynamic data tainting, one must associate a trust label with the
> state of execution, call it the program counter (pc), and taint the pc when
> control flow forks based on a secret (on any data whose trust label is not
> the pc's label). The tainted pc then taints effects in all paths that could
> be taken -- this handles implicit flows.
> The problem then becomes untainting the pc. Netscape 3's optional security
> model could not untaint, lacking static analysis of all possible effects
> committed by alternative control flow graph paths (liveness analysis can
> help too). Such analyses are hard without more certainty about name binding
> and data types.
> This is an interesting area of research, which we're pursuing at Mozilla in
> partnership with some academic researchers. I'll have more to say about it
> shortly.

(I believe that's the paper I had found originally.) It does sound
extremely interesting. And I do see how static analysis would speed
things up, but I don't see why dynamic analysis wouldn't be possible
(although I can see it being too slow to be practical).
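
The quoted example can be run through a toy tracker to see both the
leak and the pessimism. This is a minimal sketch, not Netscape 3's
design: TaintTracker, untaintAtJoin and the label name "secret" are all
hypothetical, and the "sound" mode simply never untaints the pc because
no static analysis is available to it.

```javascript
// Toy dynamic taint tracker (hypothetical; for illustration only).
class TaintTracker {
  constructor({ untaintAtJoin }) {
    this.untaintAtJoin = untaintAtJoin; // true = unsound, false = sound but pessimistic
    this.pc = new Set();                // labels currently tainting the pc
    this.vars = new Map();              // name -> { value, labels }
  }

  // Every write picks up the pc's labels in addition to its own.
  assign(name, value, labels = []) {
    this.vars.set(name, { value, labels: new Set([...labels, ...this.pc]) });
  }

  // Forking on a labelled value taints the pc for the branch that runs.
  branch(condName, thenFn, elseFn) {
    const cond = this.vars.get(condName);
    const saved = new Set(this.pc);
    for (const label of cond.labels) this.pc.add(label);
    if (cond.value) thenFn(); else elseFn();
    if (this.untaintAtJoin) this.pc = saved; // forgets the branch not taken
  }

  // Sending is an effect too, so it is checked against the pc as well.
  submit(name) {
    const v = this.vars.get(name);
    const labels = [...new Set([...v.labels, ...this.pc])];
    return labels.length === 0
      ? { sent: true, value: v.value }
      : { sent: false, blockedBy: labels };
  }
}

// The example from the quoted message: A, C, B = secret, then submit(C).
function runExample(secret, untaintAtJoin) {
  const t = new TaintTracker({ untaintAtJoin });
  t.assign("A", 0);
  t.assign("C", 0);
  t.assign("B", secret, ["secret"]);
  t.branch("B", () => t.assign("A", 1), () => t.assign("C", 1));
  return t.submit("C");
}
```

With untaintAtJoin set to true, runExample(true, true) sends C (still
0), which tells the receiver that the secret was true: the implicit
flow. With it set to false nothing leaks, but every later submit is
blocked as well, which is the accumulation problem described in the
quote.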

> But if no cookies are sent, there is no problem. For example, suppose
> evilsite.com did the following instead:
> <script
> src="evilsite.com?graburl.cgi?loc=https://mail.victimsite.com/address-book.json"></script>
> evilsite.com can do that today, and will forever be able to do that.
> There is no problem, though, because the 'graburl.cgi' script can't
> send the user's cookies to victimsite.com. I don't understand why
> there is any confusion about this.
>
> There's no confusion. You are simply assuming proper authentication, which
> is over-optimistic (Murphy was an optimist). Joe Walker's blog did not get
> read by everyone using JSON across origins. Reality bites (major sites have
> had firedrills).

I can only comment on the ones I've heard about, which would have been
solved by not sending/accepting cookies.

> Defense in depth requires more than one line of defense. This is why
> cookie-sending in XS-XHR is a hot issue. Will existing sites fail to
> authenticate where they should? Will they naively extend their existing auth
> systems from XHR to XS-XHR, compromising non-public data? Will mashup sites
> over-auth too much thereby training users to re-enter reusable credentials?
> These are hard, ultimately human-factors problems. The solutions are not
> obvious, and the difficulties here are vexing the Web API committee, or at
> least the members of it I know.

I guess I'm an optimist because I believe there are solutions, but I
won't try to reason them out right now for the sake of length.

> To get away from XS-XHR and back to JSON data and JS/ES: Jesse's whole point
> in filing that Mozilla bug was to plead for greater defense in depth,
> specifically greater Integrity for object and array initialisers.

Defense-in-depth can be carried too far (e.g. noscript.net).

> I believe you can't 'prove' much that's interesting, anyway... (again,
> that's why I don't believe in the utility of code verification).
>
> Bill Gates begs to differ (device driver verification, also great work from
> various MSR researchers), and we're seeing wins from static analysis in the
> Mozilla world. Not full program verification, mind you, but formal methods
> should not be dismissed wholesale based on poor results from the past.
> Proving correctness and soundness are hard, but checking programmer
> assumptions and would-be invariants -- that's what ES4 makes easier with its
> gradual types. That kind of "model-checking" or "shape testing" is also what
> a lot of JS code, and a lot of Mozilla code (including C++ code) needs.
> You don't have to "believe in the utility" of anything. Either analysis is
> useful, or it isn't. It's clear that checking latent types in JS, or
> checking other invariants that can't be expressed via the weak notion of
> types in JS and the DOM today, is hard enough that Ajax libraries spend a
> great many lines of code on it, with standardized patterns and helpers. Such
> code patterns are signs of gaps in JS1/ES3, and the Ajax libraries can't
> fill these gaps completely (e.g., because they can't guarantee name bindings
> or seal objects against mutation).

What I meant is that code verification is often more pain than it's
worth (so not just utility, but expected utility). In most cases, the
problem isn't whether your code implements some spec perfectly but
whether the code/spec will work "in the real world". This requires
(statistical) testing, which almost always brings to light spec
implementation flaws anyway.

If code verification doesn't cost any time, resources, etc. then I
have no problem with it. Alternatively, if you need to do it because
it is an essential part of the solution, then I also have no problem
with it (many cases of 'typeof'-style uses fall into this category).
And if you've only got one shot to get things right --- like sending
people into space --- useful again. But often that's not the case.
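
A minimal sketch of the 'typeof'-style check meant here, where the
verification is part of the solution rather than an extra cost.
checkShape and the point shape are hypothetical, not taken from any
particular Ajax library.

```javascript
// Returns a list of problems; an empty list means the value has the shape.
function checkShape(value, shape) {
  if (value === null || typeof value !== "object") {
    return ["value is not an object"];
  }
  const problems = [];
  for (const [key, expected] of Object.entries(shape)) {
    const actual = typeof value[key];
    if (actual !== expected) {
      problems.push(`${key}: expected ${expected}, got ${actual}`);
    }
  }
  return problems;
}

// Hypothetical shape: a point with two numeric coordinates.
const pointShape = { x: "number", y: "number" };
const good = checkShape({ x: 1, y: 2 }, pointShape);
const bad = checkShape({ x: 1, y: "2" }, pointShape);
```

Here good is an empty list and bad holds one entry naming the y field.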

Just as an example, I (vaguely) remember being taught that to make
code analysis possible, it was necessary for functions to contain only
one return statement, right at the end. I've no idea if this is still
true, but the point is that in order to 'buy' code analysis, I had to
'pay' by writing denser and more convoluted code, which I would have
to extensively test anyway.

> In the message I found, you mention logging (and other post-hoc
> instrumentation) as a potential use for AOP. Being from a simulation
> background, I'd find statistics a compelling use. Being a programmer,
> I'd find debugging a compelling use. Being a hacker (not cracker!),
> I'd find modifying other people's code another compelling use.
>
> Not to digress again, but the conclusion Mitch reached, in his summary
> words, was "sound plugin APIs good, AspectJ-like hacks bad".
> Debugging is a whole other can of worms. It must be able to violate
> abstractions. But let's leave it for another day. It should be sufficient to
> note here that debuggers are privileged in ways ordinary ES code, even with
> ES4 reflection APIs being proposed, is not.

Plugin APIs don't work if you don't know which points need to be
plugged into ahead of time. The main advantage of plugin APIs is being
able to separate stable from unstable parts of the code across
'versions'. (But if you really want to access, for example, 'private'
webkit functions, you should have that choice.)
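
A minimal sketch of the statistics use case done without a plugin API,
by rebinding a method after the fact. countCalls and the library object
are hypothetical; the technique only works while the binding is still
mutable, which is exactly what read-only bindings would remove.

```javascript
// Replaces obj[name] with a wrapper that counts calls, then delegates.
function countCalls(obj, name) {
  const original = obj[name];
  const stats = { calls: 0 };
  obj[name] = function (...args) {
    stats.calls += 1;
    return original.apply(this, args); // preserve the receiver and arguments
  };
  return stats;
}

// Hypothetical third-party code that offers no hook for statistics.
const library = {
  scale: 2,
  double(n) { return n * this.scale; },
};

const stats = countCalls(library, "double");
const result = library.double(4) + library.double(5);
```

After the two calls, stats.calls is 2 and the results are unchanged.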


