Submitted for your approval, JSOX

Mike Samuel mikesamuel at gmail.com
Wed Sep 19 20:46:32 UTC 2018


On Wed, Sep 19, 2018, 4:41 PM Mike Samuel <mikesamuel at gmail.com> wrote:

>
>
> On Wed, Sep 19, 2018, 4:07 PM J Decker <d3ck0r at gmail.com> wrote:
>
>> (trimmed)
>>
>> On Wed, Sep 19, 2018 at 12:08 PM Mike Samuel <mikesamuel at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Wed, Sep 19, 2018 at 12:01 PM J Decker <d3ck0r at gmail.com> wrote:
>>>
>>>>
>>>> I know of no exploits; all resulting strings should be shorter than the
>>>> input (because of escapes \\ ).  The C version allocates an output buffer
>>>> that is the same size as the input, and moves decoded strings into it.
>>>> Structure characters [ { } ] , " ' `  don't transfer either.
>>>>
>>>
>>> Not a vulnerability in your JSOX implementation per se, but have you
>>> looked into whether there's exploitable ambiguity between JSOX and runs of
>>> ES BlockStatements and ExpressionStatements?
>>>
>>> JSON used to be vulnerable
>>> <https://haacked.com/archive/2008/11/20/anatomy-of-a-subtle-json-vulnerability.aspx/>
>>> to cross-site snooping.
>>>
>>> <script>// In attacker page
>>> Array = function () { alert('Got ' + arguments[0]) };
>>> </script>
>>> <script src="
>>> http://other-origin/some-web-service-that-responds-with-a-json-array
>>> "></script>
>>>
>>
>> Interesting; that applies to JSOX for Number, BigInt, Date, ....
>>
>> Parentheses (in the C version) fault while collecting an identifier,
>> because they are non-identifier characters as defined by the Unicode
>> standard (per the rules for an identifier in ES6).
>> That lookup was omitted in the JS implementation (it is a per-character
>> search through several thousand values).
>>
>> Parentheses are reserved for code, expressions, and parameter
>> specifications, and are (or should be) forbidden except in strings.
>>
>
> My apologies.  I thought there were parentheses in the docs on npmjs but
> seeing what I pasted from there on my phone it's obvious that it's all
> curly brackets.
>
> As long as your syntax doesn't include parentheses, dots, or backticks
> you're probably fine.
>

Though I could probably make hay with an output that includes the token
pair  ] [
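To illustrate the ] [ concern: when two bracketed lists are adjacent, an ES
parser does not see two array literals. It sees one array literal followed by
a computed member access, and the contents of the second bracket become a
comma expression. A minimal sketch, assuming a made-up payload string (it is
not taken from any real JSOX output):

```javascript
// Hypothetical payload: two bracketed lists back to back, separated by a newline.
const payload = '[ "a", "b", "c" ]\n[ 0, 2 ]';

// Evaluated as script, automatic semicolon insertion does not apply here,
// so this is `["a","b","c"][(0, 2)]`: a member access indexed by a comma expression.
const asScript = (0, eval)(payload);

console.log(typeof asScript);        // a string element, not an array
console.log(Array.isArray(asScript));
```

Whether that is exploitable depends on what an attacker can place in the
first list, but it shows the second list is not treated as data.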


>
>>
>>>
>>> This allowed piggybacking on HTTP credentials if an attacker could get a
>>> victim to visit their page.
>>>
>>> The problem was that the meaning of [...] and {...} were specified in
>>> terms of global.Array and global.Object
>>> which could be replaced
>>>
>>> That's been fixed, but JSOX should probably be careful about any
>>> ambiguity with BlockStatement.
>>> IIUC,
>>>   { keyword: [] }
>>> is valid as a statement so there is some ambiguity there.
>>>
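A small sketch of that ambiguity, using indirect eval only to show how the
same text parses in statement position versus expression position (the
variable names are mine, for illustration):

```javascript
// The same source text, parsed two different ways.
const text = '{ keyword: [] }';

// In statement position: a BlockStatement containing the label `keyword`
// followed by the ExpressionStatement `[]`. Indirect eval parses a Script,
// so the completion value is the empty array, not an object.
const asStatement = (0, eval)(text);

// In expression position (forced by parentheses): an object literal.
const asExpression = (0, eval)('(' + text + ')');

console.log(Array.isArray(asStatement));   // true
console.log(Array.isArray(asExpression));  // false
console.log(Object.keys(asExpression));    // [ 'keyword' ]
```

A script src response is parsed in statement position, which is the case
that matters for cross-origin inclusion.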
>>> Then I see examples like
>>>
>>> //-- the following...
>>> a { firstField, secondField }
>>> [ a { 1, 2 }, a{5,6}, a{"val1","val2"} ]
>>>
>>> Ya, it's tempting to type parentheses after an identifier (fixed above)
>>
>> [ {firstField:1, secondField:2 }, {firstField:5,secondField:6}, {firstField:"val1",secondField:"val2"} ]
>>
>> But that doesn't really generate any more data; (similar strings get
>> collapsed?)... and  on parsing, there's only one reference to 'firstField'
>> and 'secondField'... I was trying to ponder scenarios where the data grows
>> unbounded... but even in a case where there's a reference like
>>
>> [ {a:1, b:2 }, [ref[0],ref[0]], [ref[1],ref[1]], [ref[2],ref[2]] ]
>>
>> [ {a:1,b:2}, [ {a:1, b:2}, {a:1,b:2} ], [ [{a:1, b:2}, {a:1,b:2}],[{a:1, b:2}, {a:1,b:2}] ], [ [ [{a:1, b:2}, {a:1,b:2}],[{a:1, b:2}, {a:1,b:2}] ], [ [{a:1, b:2}, {a:1,b:2}],[{a:1, b:2}, {a:1,b:2}] ] ] ]
>>
>> But it's not really replicated data; in the metadata between parsing and
>> object assembly, it's an array of strings/numbers, and resolves to pointers
>> to existing data.
>>
>>
>>> I haven't worked through your grammar, but I wonder whether a naive JSOX
>>> encoder might produce output like
>>>     { looksLikeAStatementLabel: a{"val1", "val2"} }
>>>
>>
>> (yes, but not parentheses.  Because parens are not control characters,
>> they end up being gatherable into identifiers)
>>
>> or
>>>     a
>>>     { onlyField }
>>>
>>
>> The current parsing will drop 'onlyField' and result with {}.
>> It only 'pushes' the value into the container if there is a value.
>>
>> It was previously a parsing error, no value for field... an "expected ':'
>> and a value" sort of thing; but I ran into '{}' which is a similar parsing
>> state...
>>
>>
>>
>>>     [ a(5), a("val1") ]
>>> allowing an attacker to do
>>>     <script>
>>>     let onlyField = null;
>>>     function a(...data) {
>>>       alert(`Got ${ data }`);
>>>     }
>>>     </script>
>>>     <script src="http://other-origin/jsox-web-service"></script>
>>>
>>> There's a lot of "ifs" in this scenario,
>>> AND CORS solves a lot of these problems for origins that use it
>>> AND browsers are less trusting of script srcs with
>>> Content-types:text/x-jsox than they were in 2008
>>> BUT
>>>     // attacker setup
>>>     let onlyField = null;
>>>     function a(...data) {
>>>       alert(`Got ${ data }`);
>>>     }
>>>     // victim responds
>>>     a
>>>     { onlyField }
>>>     [ a(5), a("val1") ]
>>> does alert twice in Chrome and JSON hijacking was exploited in the wild,
>>> serializers have been known to
>>> line wrap in attacker-controllable ways, and there may still be many
>>> JSON webservices that respect ambient
>>> credentials on cross-origin requests.
>>>
>>
>> In the first case a(5) turns out to be a valid identifier, which is also
>> sort of a string, and the second one would fault finding a " in the middle
>> of an identifier... string-string is never allowed... "a""b"; but I see...
>> it does depend on how parsing is implemented; grabbing the values with a
>> regexp could do that.
>>
>>
>>>
>>>
>>>
>>>> This does stick to JSON's spirit of only transporting data.  The parser
>>>> is very similar to a JSON parser, except many places that would previously
>>>> throw are accepted....
>>>> And references can only link to other objects/arrays within the current
>>>> outermost object/array.
>>>>
>>>
>>>
>>>
>>>>
>>>>> This already happens with plain JSON
>>>>> <https://medium.com/@mikesamuel/protecting-against-object-forgery-2d0fd930a7a9>,
>>>>> so anything that allows external inputs to specify which internal types to
>>>>> construct would have to include a "Security Considerations" section that
>>>>> explains how this could be safely used by code that assumes that `if (x
>>>>> instanceof InternalType)` then x came from internal code that made a
>>>>> good-faith effort to only pass appropriate inputs to `new
>>>>> InternalType(...)`.
>>>>>
>>>>> On Tue, Sep 18, 2018 at 5:22 PM J Decker <d3ck0r at gmail.com> wrote:
>>>>>
>>>>>> (Thank you Rod Sterling)
>>>>>>
>>>>>> But seriously, I'd like to submit, for serious consideration, JSOX -
>>>>>> JavaScript Object eXchange format.  It inherits all JSON syntax such that
>>>>>> it is able to process any existing JSON.
>>>>>>
>>>>>> I'm, at this point, open to changing anything (or even omitting
>>>>>> things), including the name.
>>>>>>
>>>>>> JSON is great.  JSON has some limits, and criticisms... JS/ES grew,
>>>>>> but JSON has to stay the same, similarly with whatever comes next I'd
>>>>>> imagine.
>>>>>>
>>>>>> So a primary goal is to encode and decode ES6 objects for transport
>>>>>> with a simple API such as JSOX.parse( jsoxString ), and JSOX.stringify(
>>>>>> object ).  But also keep with the simplicity of JSON,
>>>>>> so it can be used in human readable circumstances.
>>>>>>
>>>>>> Types that are now (or soon) native to ES such as TypedArrays (binary
>>>>>> data), BigInt types, and even the existing Date type, do not transport with
>>>>>> JSON very well.  They become a non-identifiable string that requires extra
>>>>>> code involving knowledge of the structure of the data being transferred to
>>>>>> be able to restore the values to Date(), BigInt(), et al.
>>>>>>
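The limitation described above can be seen with nothing but the built-in
JSON object. In this sketch the field name `created` and the reviver that
keys on it are my own illustration of the "extra code involving knowledge of
the structure" that the receiver needs:

```javascript
// A Date survives JSON only as an ISO string; nothing marks it as a date.
const original = { created: new Date(Date.UTC(2018, 8, 18, 21, 7, 0)) };
const roundTripped = JSON.parse(JSON.stringify(original));

console.log(typeof roundTripped.created);          // "string"
console.log(roundTripped.created instanceof Date); // false

// The receiver has to know which fields to revive (here, hard-coded by name).
const revived = JSON.parse(JSON.stringify(original), (key, value) =>
  key === 'created' ? new Date(value) : value);
console.log(revived.created.getTime() === original.created.getTime()); // true

// BigInt has no JSON representation at all: stringify throws a TypeError.
let bigIntError = null;
try {
  JSON.stringify({ n: BigInt(1) });
} catch (err) {
  bigIntError = err;
}
console.log(bigIntError instanceof TypeError); // true
```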
>>>>>> So a few weeks ago I started considering what else, beyond these
>>>>>> simple modifications might also be useful, or address criticisms of JSON.
>>>>>> Handling the above types is really a trivial modification to most JSON
>>>>>> parsers.  Each of the following modifications is really only a very slight
>>>>>> change to behavior; although implementing typed-objects does initially
>>>>>> involve changing error handling into identifier-fallback handling.
>>>>>>
>>>>>> I initially argued for defining an object prototype,
>>>>>> 'card(name,address,zipcode,created)', which removes the redundant data from
>>>>>> every following reference (this is good just for data reduction, to which
>>>>>> the counter-argument was 'gzip').  A JSON representation might be
>>>>>> `{"name":"bob","address":"123 street","zipcode":"55555","created":1537304820}`,
>>>>>> where if you have a large number of the same record, the same 'name':,
>>>>>> 'address':, etc. is repeated in every record.  A typed-object's value in
>>>>>> JSOX could instead be
>>>>>> `card{:"bob","123 street","55555",2018-09-18T21:07:00Z}`.  All objects that
>>>>>> are revived as typed-objects share the same prototype, and before parsing,
>>>>>> the prototypes to be used may be specified.  The amount of data to process
>>>>>> is reduced, perhaps to a significant degree.
>>>>>>
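The idea of declaring the field names once and then sending only positional
values can be sketched as follows. This is not the jsox package's API or its
parser; `makeTypedObjectFactory` is a hypothetical helper, and the second
record is made-up sample data:

```javascript
// Hypothetical helper, not part of the jsox package: given a type's field
// list, build objects that all share one prototype from positional values.
function makeTypedObjectFactory(typeName, fieldNames) {
  const proto = { typeName };             // shared by every revived object
  return function revive(values) {
    const obj = Object.create(proto);
    fieldNames.forEach((name, i) => { obj[name] = values[i]; });
    return obj;
  };
}

// Field names are stated once...
const card = makeTypedObjectFactory('card', ['name', 'address', 'zipcode', 'created']);

// ...and each record carries only its values.
const rows = [
  ['bob', '123 street', '55555', new Date('2018-09-18T21:07:00Z')],
  ['alice', '456 avenue', '55556', new Date('2018-09-18T21:08:00Z')],
];
const cards = rows.map(card);

console.log(cards[0].name, cards[1].name);
console.log(Object.getPrototypeOf(cards[0]) === Object.getPrototypeOf(cards[1]));
```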
>>>>>> So <Identifier> '{' is about typed-objects.  This construct is not
>>>>>> allowed in JSON.  But that then leads to <Identifier> '[' - typed arrays.
>>>>>> Arrays don't really have redundant data potential like objects, but there
>>>>>> are TypedArrays in ES.  There is no way to define a type of an array, but
>>>>>> hardcoded types like 'ab', 'u8', 'ref' are used to revive binary data.  The
>>>>>> bytes of the backing ArrayBuffer are encoded to base64, and included within
>>>>>> '[' and ']' without quotes; using the brackets as quotes.
>>>>>>
>>>>>> One of the JSOX typed arrays is the 'ref' type.  A reference to another
>>>>>> location in the current object can be specified, which allows encoding
>>>>>> cyclic structures.
>>>>>>
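To show why a reference type is needed for cycles, and roughly how path-based
references can be resolved after parsing, here is a sketch. The `$ref`
placeholder and `resolveRefs` are hypothetical and are not JSOX's wire format
or implementation; they only illustrate resolving a path from the outermost
value to existing data:

```javascript
// Plain JSON cannot express a cycle: stringify throws a TypeError.
const node = { name: 'root', children: [] };
node.children.push({ name: 'child', parent: node });

let cycleError = null;
try {
  JSON.stringify(node);
} catch (err) {
  cycleError = err;
}
console.log(cycleError instanceof TypeError); // true

// Hypothetical reference scheme (not JSOX's wire format): a placeholder
// holding a path from the outermost value, resolved after parsing.
function resolveRefs(root) {
  const follow = (path) => path.reduce((value, key) => value[key], root);
  const visit = (value) => {
    if (value === null || typeof value !== 'object') return;
    for (const key of Object.keys(value)) {
      const child = value[key];
      if (child && typeof child === 'object' && Array.isArray(child.$ref)) {
        value[key] = follow(child.$ref);   // point at existing data, no copy
      } else {
        visit(child);
      }
    }
  };
  visit(root);
  return root;
}

const decoded = resolveRefs(
  JSON.parse('{"name":"root","children":[{"name":"child","parent":{"$ref":[]}}]}'));
console.log(decoded.children[0].parent === decoded); // true
```

An empty path stands for the outermost value itself, so the child's parent
ends up as a pointer to the root rather than a copy of it.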
>>>>>>
>>>>>>
>>>>>> https://github.com/d3x0r/jsox
>>>>>> https://npmjs.com/package/jsox
>>>>>>
>>>>>> (Initial public reaction was not very helpful, but probably that's
>>>>>> the fault of how it was introduced?)
>>>>>>
>>>>>> https://www.reddit.com/r/javascript/comments/9f8wml/jsox_javascript_object_exchange_format_preview/
>>>>>>
>>>>>> There was plenty of 'why not [YAML/BSON/protobufs/(I don't think
>>>>>> anyone said XML)/...]' and the answer is simply that none of those
>>>>>> read JSON or have as simple an API (among other things that JSON
>>>>>> already solves compared to those mentioned).
>>>>>> _______________________________________________
>>>>>> es-discuss mailing list
>>>>>> es-discuss at mozilla.org
>>>>>> https://mail.mozilla.org/listinfo/es-discuss
>>>>>>
>>>>>