Submitted for your approval, JSOX

Mike Samuel mikesamuel at gmail.com
Thu Sep 20 18:07:11 UTC 2018


On Thu, Sep 20, 2018 at 12:48 PM J Decker <d3ck0r at gmail.com> wrote:

>
>
> That could occur in a stream.  (Although if it's a stream I would expect
> it to come in on a websocket rather than any sort of request).... But
>
>    someText{a,b,c}[1,2,3][1,2,3]
>
>     [1][2]
>
> Those are valid streams of objects... How would  '][' be used?
>

<!-- alerts "Intercepted [1,2,3]" -->
<script>
// Attacker setup
let someText, a, b, c;
Object.defineProperty(
  Array.prototype, 3,
  {
    get() {
      alert(`Intercepted ${ JSON.stringify(this) }`)
    }
  });
</script>
<script>
// Loaded cross origin from victim
someText
{a,b,c}[1,2,3][1,2,3]
</script>

If I own an origin and load the victim's JSON cross-origin, I can use
getters on Object.prototype and Array.prototype to get any object that is
dereferenced with square brackets.
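To spell out why the payload above reaches the getter: in `[1,2,3][1,2,3]` the second bracket pair is a property access whose index is the comma expression `1,2,3`, which evaluates to 3.  The three-element array has no own property 3, so the lookup falls through to Array.prototype.  A minimal sketch that runs in Node follows; `demoIntercept` is a name made up for illustration, and it records the value instead of calling alert.

```javascript
// Illustration only: demoIntercept is a hypothetical helper, not part of JSOX.
function demoIntercept() {
  let intercepted = null;
  // Attacker setup: a getter for index 3 on the shared array prototype.
  Object.defineProperty(Array.prototype, 3, {
    configurable: true, // so the demo can remove it again
    get() {
      // `this` is the array literal that the victim's payload constructed.
      intercepted = JSON.stringify(this);
      return undefined;
    },
  });
  try {
    // Victim payload: the index expression `1, 2, 3` evaluates to 3.
    [1, 2, 3][1, 2, 3];
  } finally {
    delete Array.prototype[3];
  }
  return intercepted;
}
```

Calling demoIntercept() returns the string '[1,2,3]'.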

You might notice that my setup declares variables to avoid "Undefined
reference to someText" errors.  This could be mitigated by adding an
unpredictable field name to the first type definition for every response so
that the attacker always gets an "Undefined reference error".

someText
{a,b,c,R4nD0m}[1,2,3][1,2,3]

is not vulnerable since, although it is well-formed JS, evaluation fails
before the intercepted array is constructed.
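A rough sketch of that mitigation, assuming Node and its built-in crypto module.  The helper names (typeDefinitionWithNonce, attackerSees) are hypothetical and not part of JSOX.  `new Function` stands in for the cross-origin script tag, and its parameters play the role of the variables an attacker can guess and pre-declare.

```javascript
const { randomBytes } = require('crypto');

// Hypothetical helper: append an unpredictable identifier to a type's field list.
function typeDefinitionWithNonce(typeName, fields) {
  // The leading letter keeps the name a valid identifier.
  const nonce = 'r' + randomBytes(16).toString('hex');
  return `${typeName}\n{${[...fields, nonce].join(',')}}`;
}

// Hypothetical helper: evaluate a response the way a hostile page would,
// and report what the Array.prototype getter managed to capture.
function attackerSees(payload) {
  let intercepted = null;
  Object.defineProperty(Array.prototype, 3, {
    configurable: true,
    get() {
      intercepted = JSON.stringify(this);
      return undefined;
    },
  });
  try {
    // The attacker can pre-declare the names it can guess, but not the nonce.
    new Function('someText', 'a', 'b', 'c', payload)();
  } catch (e) {
    // An undeclared nonce surfaces as a ReferenceError; rethrow anything else.
    if (!(e instanceof ReferenceError)) throw e;
  } finally {
    delete Array.prototype[3];
  }
  return intercepted;
}

const plain = 'someText\n{a,b,c}[1,2,3][1,2,3]';
const guarded =
  typeDefinitionWithNonce('someText', ['a', 'b', 'c']) + '[1,2,3][1,2,3]';
```

Here attackerSees(plain) captures the array, while attackerSees(guarded) captures nothing because the ReferenceError is thrown before the array literal is evaluated.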


> I converted the non-identifier character test to a bit lookup and applied
> it in the JS parser. ( fixed like =,+,-,!,~,(,),<,>,... in unquoted
> contexts) but speaking of quotes a variant allowed is back-tick quoting...
> ` ` ; without the template/code aspects that implies with ES6.  what about
> content that's like
>
> { asdf : "hello
> world" }
> (\n literal is allowed to be collected, and/or \r) but JS would fault on
> multiline non-escaped-continuation...
>
> But I've been reflecting on something you said 'custom types'.
> I'm thinking of implementing basically typed-strings.  <identifier> " ...
> "  (or like "abc""reconstructiondata" ); and registering fromJSOX handlers
> on the parser.  Which would be like parser.registerFromJSOX( "someType",
> function (string) { /* use string to create a thing */ } ).
> Types like 'color' might want to just emit as '#RRGGBBAA' with a toJSOX...
> but really be separate color channels internally.
>

I don't see any immediate security consequences to custom literals.  People
tend not to put side effects in the constructors of literal-ish types.
I would hope that developers know to treat values of literal-ish types like
github.com/WICG/trusted-types with suspicion when they travel across a
security boundary, or to implement some signature-checking scheme.  Unwisely
registering a type to deserialize via JSOX carries no novel risk compared
with doing the same via JSON revivers.

If you're already against global registries, please ignore the rest of this
comment.

Since JavaScript is now used for large systems with many modules from
different authors, it helps to be able to scope things to a module.

It's much harder to build secure systems when we can't reason about
security properties of modules in isolation.
When I, as a security reviewer, encounter a module that uses JSOX, I might
enumerate the types it deserializes and check that it vets those before
making auth decisions based on their content.
But if an application loads another module alongside the first which
registers a global JSOX handler, that reasoning may no longer be valid
since an input pipe to the first module may now include objects of types
its authors didn't foresee.
That means I have to treat any use of registerFromJSOX that affects
parsers globally as a system-level hazard, not just a module-level hazard.
TLDR: many interesting security properties depend on human judgement;
humans can't do whole program analysis for most programs, so global
registries are effectively open sets; open sets complicate conservative
analyses which are often necessary for sound security reasoning.

https://github.com/mikesamuel/unduck/blob/HEAD/API.md (explainer
<https://github.com/mikesamuel/unduck/blob/master/README.md>) is addressing
the flip side of some of JSOX and I managed to do without global registries.
There I used a composition pattern that brings a base object with an empty
registry into scope, and a registration method that returns a copy with a
larger registry.
{
    let ud = require('unduck')
    ud = ud.withTypes({ /* type description */ })
    // more of the same

    // Alternatively
    // let ud = require('unduck')
    //    .withTypes(...)
    //    .withTypes(...);

    // Apply
    ud(/* untrusted input */)
}
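The same shape could be applied to a JSOX-style parser.  What follows is only a sketch under assumptions: makeReviver, withType, and the handler signature are invented for illustration, and are neither the unduck API nor anything JSOX currently provides.  Each withType call returns a new reviver with a copy of the registry, so the set of types a module accepts stays closed unless someone hands it a different reviver.

```javascript
// Hypothetical sketch of a registry that is scoped by value, not global.
function makeReviver(registry = new Map()) {
  // Look up the handler for a type name and apply it to the raw text.
  function revive(typeName, text) {
    const handler = registry.get(typeName);
    if (handler === undefined) {
      throw new TypeError(`no handler registered for type "${typeName}"`);
    }
    return handler(text);
  }
  // Registration never mutates: it copies the registry and returns a new reviver.
  revive.withType = (typeName, handler) =>
    makeReviver(new Map(registry).set(typeName, handler));
  return Object.freeze(revive);
}

const base = makeReviver();
// Two modules extend the same base without seeing each other's types.
const colorReviver = base.withType('color', (s) => ({ hex: s }));
const pointReviver = base.withType('point', (s) => s.split(',').map(Number));
```

With this, colorReviver('color', '#11223344') yields an object, while pointReviver('color', '#11223344') and base('color', '#11223344') both throw a TypeError, so a reviewer only needs to look at the registrations that flow into the reviver a module actually uses.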






>>>>> This allowed piggybacking on HTTP credentials if an attacker could get
>>>>> a victim to visit their page.
>>>>>
>>>>> The problem was that the meaning of [...] and {...} were specified in
>>>>> terms of global.Array and global.Object
>>>>> which could be replaced
>>>>>
>>>>> That's been fixed, but JSOX should probably be careful about any
>>>>> ambiguity with BlockStatement.
>>>>> IIUC,
>>>>>   { keyword: [] }
>>>>> is valid as a statement so there is some ambiguity there.
>>>>>
>>>>> Then I see examples like
>>>>>
>>>>> //-- the following...
>>>>> a { firstField, secondField }
>>>>> [ a { 1, 2 }, a{5,6}, a{"val1","val2"} ]
>>>>>
>>>>> Ya, it's tempting to type parentheses after an identifier (fixed above)
>>>>
>>>> [ {firstField:1, secondField:2 }, {firstField:5,secondField:6}, {firstField:"val1",secondField:"val2"} ]
>>>>
>>>> But that doesn't really generate any more data; (similar strings get
>>>> collapsed?)... and  on parsing, there's only one reference to 'firstField'
>>>> and 'secondField'... I was trying to ponder scenarios where the data grows
>>>> unbounded... but even in a case where there's a reference like
>>>>
>>>> [ {a:1, b:2 }, [ref[0],ref[0]], [ref[1],ref[1]], [ref[2],ref[2]] ]
>>>>
>>>> [ {a:1,b:2}, [ {a:1, b:2}, {a:1,b:2} ], [ [{a:1, b:2}, {a:1,b:2}],[{a:1, b:2}, {a:1,b:2}] ], [ [ [{a:1, b:2}, {a:1,b:2}],[{a:1, b:2}, {a:1,b:2}] ], [ [{a:1, b:2}, {a:1,b:2}],[{a:1, b:2}, {a:1,b:2}] ] ] ]
>>>>
>>>> But it's not really replicated data, in the meta data between parsing
>>>> and object assembly, it's an array of strings/numbers; and resolves to
>>>> pointers to existing data.
>>>>
>>>>
>>>>> I haven't worked through your grammar, but I wonder whether a naive
>>>>> JSOX encoder might produce output like
>>>>>     { looksLikeAStatementLabel: a{"val1", "val2"} }
>>>>>
>>>>
>>>> (yes, but not parentheses.  Because parens are not control characters,
>>>> they end up being gatherable into identifiers)
>>>>
>>>>> or
>>>>>     a
>>>>>     { onlyField }
>>>>>
>>>>
>>>> The current parsing will drop 'onlyField' and result with {}.
>>>> It only 'pushes' the value into the container if there is a value.
>>>>
>>>> It was previously a parsing error, no value for field... 'expected
>>>> ':' and a value' sort of thing; but I ran into '{}' which is a similar
>>>> parsing state...
>>>>
>>>>
>>>>
>>>>>     [ a(5), a("val1") ]
>>>>> allowing an attacker to do
>>>>>     <script>
>>>>>     let onlyField = null;
>>>>>     function a(...data) {
>>>>>       alert(`Got ${ data }`);
>>>>>     }
>>>>>     </script>
>>>>>     <script src="http://other-origin/jsox-web-service"></script>
>>>>>
>>>>> There's a lot of "ifs" in this scenario,
>>>>> AND CORS solves a lot of these problems for origins that use it
>>>>> AND browsers are less trusting of script srcs with
>>>>> Content-Type: text/x-jsox than they were in 2008
>>>>> BUT
>>>>>     // attacker setup
>>>>>     let onlyField = null;
>>>>>     function a(...data) {
>>>>>       alert(`Got ${ data }`);
>>>>>     }
>>>>>     // victim responds
>>>>>     a
>>>>>     { onlyField }
>>>>>     [ a(5), a("val1") ]
>>>>> does alert twice in Chrome and JSON hijacking was exploited in the
>>>>> wild, serializers have been known to
>>>>> line wrap in attacker-controllable ways, and there may still be many
>>>>> JSON webservices that respect ambient
>>>>> credentials on cross-origin requests.
>>>>>
>>>>
>>>> In the first case a(5) turns out to be a valid identifier, which is
>>>> also sort of a string, and the second one would fault finding a " in the
>>>> middle of an identifier... string-string is never allowed... "a""b"; but I
>>>> see... it does depend on how parsing is implemented; grabbing the values
>>>> with a regexp could do that.
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>> This does stick to JSON's spirit of only transporting data.  The
>>>>>> parser is very similar to a JSON parser, except many places that would
>>>>>> previously throw are accepted....
>>>>>> And references can only link to other objects/arrays within the
>>>>>> current outermost object/array.
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>> This already happens with plain JSON
>>>>>>> <https://medium.com/@mikesamuel/protecting-against-object-forgery-2d0fd930a7a9>,
>>>>>>> so anything that allows external inputs to specify which internal types to
>>>>>>> construct would have to include a "Security Considerations" section that
>>>>>>> explains how this could be safely used by code that assumes that `if (x
>>>>>>> instanceof InternalType)` then x came from internal code that made a
>>>>>>> good-faith effort to only pass appropriate inputs to `new
>>>>>>> InternalType(...)`.
>>>>>>>
>>>>>>> On Tue, Sep 18, 2018 at 5:22 PM J Decker <d3ck0r at gmail.com> wrote:
>>>>>>>
>>>>>>>> (Thank you Rod Sterling)
>>>>>>>>
>>>>>>>> But seriously, I'd like to submit, for serious consideration, JSOX
>>>>>>>> - JavaScript Object eXchange format.  It inherits all JSON syntax such that
>>>>>>>> it is able to process any existing JSON.
>>>>>>>>
>>>>>>>> I'm, at this point, open to changing anything (or even omitting
>>>>>>>> things), including the name.
>>>>>>>>
>>>>>>>> JSON is great.  JSON has some limits, and criticisms... JS/ES grew,
>>>>>>>> but JSON has to stay the same, similarly with whatever comes next I'd
>>>>>>>> imagine.
>>>>>>>>
>>>>>>>> So a primary goal is to encode and decode ES6 objects for transport
>>>>>>>> with a simple API such as JSOX.parse( jsoxString ), and JSOX.stringify(
>>>>>>>> object ).  But also keep with the simplicity of JSON,
>>>>>>>> so it can be used in human readable circumstances.
>>>>>>>>
>>>>>>>> Types that are now (or soon) native to ES such as TypedArrays
>>>>>>>> (binary data), BigInt types, and even the existing Date type, do not
>>>>>>>> transport with JSON very well.  They become a non-identifiable string, that
>>>>>>>> requires extra code involving knowledge of the structure of the data being
>>>>>>>> transferred to be able to restore the values to Date(), BigInt(), et al.
>>>>>>>>
>>>>>>>> So a few weeks ago I started considering what else, beyond these
>>>>>>>> simple modifications might also be useful, or address criticisms of JSON.
>>>>>>>> Handling the above types is really a trivial modification to most JSON
>>>>>>>> parsers.  Each of the following modifications is really only a very slight
>>>>>>>> change to behavior; although implementing typed-objects does initially
>>>>>>>> involve changing error handling into identifier-fallback handling.
>>>>>>>>
>>>>>>>> I initially argued for defining an object prototype
>>>>>>>> 'card(name,address,zipcode,created)', which removes the redundant data for
>>>>>>>> every following reference (and is good just for data reduction, against
>>>>>>>> which 'gzip' was argued).  A JSON representation might be
>>>>>>>> `{"name":"bob","address":"123
>>>>>>>> street","zipcode":"55555","created":1537304820}`, where if you have a large
>>>>>>>> number of the same record, the same 'name':,'address':, etc. is repeated in
>>>>>>>> every record.  A typed-object's value in JSOX could instead be
>>>>>>>> `card{:"bob","123 street","55555",2018-09-18T21:07:00Z}`.  All objects that
>>>>>>>> are revived as typed-objects share the same prototype, and before parsing,
>>>>>>>> the prototypes to be used may be specified.  The amount of data to process
>>>>>>>> is reduced, perhaps to a significant degree.
>>>>>>>>
>>>>>>>> So <Identifier> '{' is about typed-objects.  This construct is not
>>>>>>>> allowed in JSON.  But that then leads to <Identifier> '['  - typed arrays,
>>>>>>>> arrays don't really have redundant data potential like objects, but there
>>>>>>>> are TypedArrays in ES.  There is no way to define a type of an array, but
>>>>>>>> hardcoded types like 'ab', 'u8', 'ref' are used to revive binary data.  The
>>>>>>>> bytes of the backing ArrayBuffer are encoded to base64, and included within
>>>>>>>> '[' and ']' without quotes; using the brackets as quotes.
>>>>>>>>
>>>>>>>> A JSOX typed array is the 'ref' type.  A reference to another
>>>>>>>> location in the current object can be specified, which allows encoding
>>>>>>>> cyclic structures.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> https://github.com/d3x0r/jsox
>>>>>>>> https://npmjs.com/package/jsox
>>>>>>>>
>>>>>>>> (Initial public reaction was not very helpful, but probably that's
>>>>>>>> the fault of how it was introduced?)
>>>>>>>>
>>>>>>>> https://www.reddit.com/r/javascript/comments/9f8wml/jsox_javascript_object_exchange_format_preview/
>>>>>>>>
>>>>>>>> There was plenty of 'why not [YAML/BSON/protobufs/(I don't think
>>>>>>>> anyone said XML)/...]'  and the answer is simply, because none of those
>>>>>>>> read JSON, or have as simple an API (amongst other reasons, which JSON
>>>>>>>> already addresses better than those mentioned).
>>>>>>>> _______________________________________________
>>>>>>>> es-discuss mailing list
>>>>>>>> es-discuss at mozilla.org
>>>>>>>> https://mail.mozilla.org/listinfo/es-discuss
>>>>>>>>
>>>>>>>