Submitted for your approval, JSOX

J Decker d3ck0r at gmail.com
Thu Sep 20 16:47:56 UTC 2018


On Wed, Sep 19, 2018 at 1:46 PM Mike Samuel <mikesamuel at gmail.com> wrote:

>
>
> On Wed, Sep 19, 2018, 4:41 PM Mike Samuel <mikesamuel at gmail.com> wrote:
>
>>
>>
>> On Wed, Sep 19, 2018, 4:07 PM J Decker <d3ck0r at gmail.com> wrote:
>>
>>> (trimmed)
>>>
>>> On Wed, Sep 19, 2018 at 12:08 PM Mike Samuel <mikesamuel at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Sep 19, 2018 at 12:01 PM J Decker <d3ck0r at gmail.com> wrote:
>>>>
>>>>>
>>>>> I know of no exploits; all resulting strings should be shorter than
>>>>> the input (because of escapes \\ ).  The C version allocates an output
>>>>> buffer that is the same size as the input, and moves decoded strings into
>>>>> it.  Structure characters [ { } ] , " ' `  don't transfer either.
>>>>>
>>>>
>>>> Not a vulnerability in your JSOX implementation per se, but have you
>>>> looked into whether there's exploitable ambiguity between JSOX and runs of
>>>> ES BlockStatements and ExpressionStatements?
>>>>
>>>> JSON used to be vulnerable
>>>> <https://haacked.com/archive/2008/11/20/anatomy-of-a-subtle-json-vulnerability.aspx/>
>>>> to cross-site snooping.
>>>>
>>>> <script>// In attacker page
>>>> Array = function () { alert('Got ' + arguments[0]) };
>>>> </script>
>>>> <script src="
>>>> http://other-origin/some-web-service-that-responds-with-a-json-array
>>>> "></script>
>>>>
>>>
>>> Interesting; that applies to JSOX for Number, BigInt, Date, ....
>>>
>>> Parentheses (in the C version) fault while collecting an identifier,
>>> as being non-identifier characters as defined by the Unicode Standard
>>> (per the rules for an identifier in ES6).
>>> That lookup was omitted in the JS implementation (it was a per-character
>>> search through several thousand values).
>>>
>>> Parentheses are reserved for code, expressions, parameter specifications,
>>> and are (should be) forbidden except in strings.
>>>
>>
>> My apologies.  I thought there were parentheses in the docs on npmjs but
>> seeing what I pasted from there on my phone it's obvious that it's all
>> curly brackets.
>>
>> As long as your syntax doesn't include parentheses, dots, or backticks
>> you're probably fine.
>>
>
> Though I could probably make hay with an output that includes the token
> pair  ] [
>

That could occur in a stream.  (Although if it's a stream I would expect it
to come in on a websocket rather than any sort of request).... But

   someText{a,b,c}[1,2,3][1,2,3]

    [1][2]

Those are valid streams of objects... How would  '][' be used?
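
One observation that may bear on the question, offered as a sketch and not
as the attack Mike has in mind: if such a stream were evaluated as script
rather than parsed as JSOX, the second bracket pair would read as a member
access on the first array, not as a new value.  The helper name
evalAsScript is made up for this illustration.

```javascript
// Evaluate source text as a JS expression, roughly what a script engine
// would do with the same characters. (Hypothetical helper, illustration only.)
const evalAsScript = (src) => new Function('return (' + src + ');')();

// '[1][2]' is "index 2 of the array [1]", not two arrays.
const first = evalAsScript('[1][2]');

// Inside the second brackets the commas form a comma expression,
// so '[1,2,3][0,1]' is "index 1 of [1,2,3]".
const second = evalAsScript('[1,2,3][0,1]');

console.log(first, second);
```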

I converted the non-identifier character test to a bit lookup and applied
it in the JS parser, so characters like = + - ! ~ ( ) < > ... now fault in
unquoted contexts.  Speaking of quotes, one allowed variant is back-tick
quoting ( ` ` ), without the template/code aspects that implies in ES6.
What about content like

{ asdf : "hello
world" }
(A literal \n and/or \r is allowed to be collected inside the string,) but
JS would fault on a multi-line string without an escaped continuation...
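
A small check of that last claim, as a sketch: the helper name accepts is
made up here, and new Function is used only to ask the engine whether the
text is syntactically valid.  It compares JSON.parse, a quoted JS string,
and a back-tick string on the same multi-line content.

```javascript
// The \n escape below puts a real line feed inside the quoted value.
const text = '{ "asdf" : "hello\nworld" }';

// Hypothetical helper: true if fn runs, false if it throws a SyntaxError.
function accepts(fn) {
  try {
    fn();
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// JSON forbids raw control characters inside strings.
const jsonOk = accepts(() => JSON.parse(text));

// A quoted JS string literal may not contain a raw line terminator.
const jsOk = accepts(() => new Function('return (' + text + ');'));

// A back-tick (template) literal may span lines in JS.
const templateOk = accepts(() => new Function('return `hello\nworld`;'));

console.log({ jsonOk, jsOk, templateOk });
```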

But I've been reflecting on something you said: 'custom types'.
I'm thinking of implementing what are basically typed strings:
<identifier> " ... "  (or something like "abc""reconstructiondata" ), with
fromJSOX handlers registered on the parser.  That would look like
parser.registerFromJSOX( "someType",
function (string) { /* use string to create a thing */ } ).
A type like 'color' might just emit as '#RRGGBBAA' from its toJSOX...
but really be separate color channels internally.
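
A minimal sketch of how that could look, assuming nothing about the
existing jsox package: createTypedStringRegistry, revive, and the Color
class are invented for illustration, and registerFromJSOX / toJSOX are only
the names proposed above, not an implemented API.

```javascript
// Hypothetical registry mapping a type name to a "string -> value" handler.
function createTypedStringRegistry() {
  const handlers = new Map();
  return {
    registerFromJSOX(typeName, fromString) {
      handlers.set(typeName, fromString);
    },
    // What a parser might call after reading  <identifier> "..."
    revive(typeName, text) {
      const handler = handlers.get(typeName);
      // Assumption: unknown types fall back to the plain string.
      return handler ? handler(text) : text;
    },
  };
}

// Separate color channels internally, one compact string on the wire.
class Color {
  constructor(r, g, b, a) {
    this.r = r; this.g = g; this.b = b; this.a = a;
  }
  toJSOX() {
    const hex = (n) => n.toString(16).padStart(2, '0').toUpperCase();
    return '#' + hex(this.r) + hex(this.g) + hex(this.b) + hex(this.a);
  }
}

const registry = createTypedStringRegistry();
registry.registerFromJSOX('color', (s) => {
  const m = /^#([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(s);
  if (!m) throw new SyntaxError('bad color string: ' + s);
  return new Color(...m.slice(1).map((h) => parseInt(h, 16)));
});

const revived = registry.revive('color', '#FF800040');
const roundTrip = revived.toJSOX();
const unknown = registry.revive('noSuchType', 'abc');
console.log(revived, roundTrip, unknown);
```

Whether an unregistered type should fall back to a plain string or be a
parse error is an open choice; the fallback here is only one option.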


>
>
>>
>>>
>>>>
>>>> This allowed piggybacking on HTTP credentials if an attacker could get
>>>> a victim to visit their page.
>>>>
>>>> The problem was that the meaning of [...] and {...} were specified in
>>>> terms of global.Array and global.Object
>>>> which could be replaced
>>>>
>>>> That's been fixed, but JSOX should probably be careful about any
>>>> ambiguity with BlockStatement.
>>>> IIUC,
>>>>   { keyword: [] }
>>>> is valid as a statement so there is some ambiguity there.
>>>>
>>>> Then I see examples like
>>>>
>>>> //-- the following...
>>>> a { firstField, secondField }
>>>> [ a { 1, 2 }, a{5,6}, a{"val1","val2"} ]
>>>>
>>> Ya, it's tempting to type parentheses after an identifier (fixed above)
>>>
>>> [ {firstField:1, secondField:2 }, {firstField:5,secondField:6}, {firstField:"val1",secondField:"val2"} ]
>>>
>>> But that doesn't really generate any more data; (similar strings get
>>> collapsed?)... and  on parsing, there's only one reference to 'firstField'
>>> and 'secondField'... I was trying to ponder scenarios where the data grows
>>> unbounded... but even in a case where there's a reference like
>>>
>>> [ {a:1, b:2 }, [ref[0],ref[0]], [ref[1],ref[1]], [ref[2],ref[2]] ]
>>>
>>> [ {a:1,b:2}, [ {a:1, b:2}, {a:1,b:2} ], [ [{a:1, b:2}, {a:1,b:2}],[{a:1, b:2}, {a:1,b:2}] ], [ [ [{a:1, b:2}, {a:1,b:2}],[{a:1, b:2}, {a:1,b:2}] ], [ [{a:1, b:2}, {a:1,b:2}],[{a:1, b:2}, {a:1,b:2}] ] ] ]
>>>
>>> But it's not really replicated data, in the meta data between parsing
>>> and object assembly, it's an array of strings/numbers; and resolves to
>>> pointers to existing data.
>>>
>>>
>>>> I haven't worked through your grammar, but I wonder whether a naive JSOX
>>>> encoder might produce output like
>>>>     { looksLikeAStatementLabel: a{"val1", "val2"} }
>>>>
>>>
>>> (yes, but not parentheses.  Because parens are not control characters,
>>> they end up being gatherable into identifiers)
>>>
>>>> or
>>>>     a
>>>>     { onlyField }
>>>>
>>>
>>> The current parsing will drop 'onlyField' and result with {}.
>>> It only 'pushes' the value into the container if there is a value.
>>>
>>> It was previously a parsing error, no value for field... 'expected
>>> ':' and a value' sort of thing; But I ran into '{}' which is a similar
>>> parsing state...
>>>
>>>
>>>
>>>>     [ a(5), a("val1") ]
>>>> allowing an attacker to do
>>>>     <script>
>>>>     let onlyField = null;
>>>>     function a(...data) {
>>>>       alert(`Got ${ data }`);
>>>>     }
>>>>     </script>
>>>>     <script src="http://other-origin/jsox-web-service"></script>
>>>>
>>>> There's a lot of "ifs" in this scenario,
>>>> AND CORS solves a lot of these problems for origins that use it
>>>> AND browsers are less trusting of script srcs with
>>>> Content-types:text/x-jsox than they were in 2008
>>>> BUT
>>>>     // attacker setup
>>>>     let onlyField = null;
>>>>     function a(...data) {
>>>>       alert(`Got ${ data }`);
>>>>     }
>>>>     // victim responds
>>>>     a
>>>>     { onlyField }
>>>>     [ a(5), a("val1") ]
>>>> does alert twice in Chrome and JSON hijacking was exploited in the
>>>> wild, serializers have been known to
>>>> line wrap in attacker-controllable ways, and there may still be many
>>>> JSON webservices that respect ambient
>>>> credentials on cross-origin requests.
>>>>
>>>
>>> In the first case a(5) turns out to be a valid identifier, which is also
>>> sort of a string, and the second one would fault finding a " in the middle
>>> of an identifier... string-string is never allowed... "a""b"; but I see...
>>> it does depend on how parsing is implemented; grabbing the values with a
>>> regexp could do that.
>>>
>>>
>>>>
>>>>
>>>>
>>>>> This does stick to JSON's spirit of only transporting data.  The
>>>>> parser is very similar to a JSON parser, except many places that would
>>>>> previously throw are accepted....
>>>>> And references can only link to other objects/arrays within the
>>>>> current outermost object/array.
>>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>>> This already happens with plain JSON
>>>>>> <https://medium.com/@mikesamuel/protecting-against-object-forgery-2d0fd930a7a9>,
>>>>>> so anything that allows external inputs to specify which internal types to
>>>>>> construct would have to include a "Security Considerations" section that
>>>>>> explains how this could be safely used by code that assumes that `if (x
>>>>>> instanceof InternalType)` then x came from internal code that made a
>>>>>> good-faith effort to only pass appropriate inputs to `new
>>>>>> InternalType(...)`.
>>>>>>
>>>>>> On Tue, Sep 18, 2018 at 5:22 PM J Decker <d3ck0r at gmail.com> wrote:
>>>>>>
>>>>>>> (Thank you Rod Sterling)
>>>>>>>
>>>>>>> But seriously, I'd like to submit, for serious consideration, JSOX -
>>>>>>> JavaScript Object eXchange format.  It inherits all JSON syntax such that
>>>>>>> it is able to process any existing JSON.
>>>>>>>
>>>>>>> I'm, at this point, open to changing anything (or even omitting
>>>>>>> things), including the name.
>>>>>>>
>>>>>>> JSON is great.  JSON has some limits, and criticisms... JS/ES grew,
>>>>>>> but JSON has to stay the same, similarly with whatever comes next I'd
>>>>>>> imagine.
>>>>>>>
>>>>>>> So a primary goal is to encode and decode ES6 objects for transport
>>>>>>> with a simple API such as JSOX.parse( jsoxString ), and JSOX.stringify(
>>>>>>> object ).  But also keep with the simplicity of JSON,
>>>>>>> so it can be used in human readable circumstances.
>>>>>>>
>>>>>>> Types that are now (or soon) native to ES such as TypedArrays
>>>>>>> (binary data), BigInt types, and even the existing Date type, do not
>>>>>>> transport with JSON very well.  They become a non-identifiable string, that
>>>>>>> requires extra code involving knowledge of the structure of the data being
>>>>>>> transferred to be able to restore the values to Date(), BigInt(), et al.
>>>>>>>
>>>>>>> So a few weeks ago I started considering what else, beyond these
>>>>>>> simple modifications might also be useful, or address criticisms of JSON.
>>>>>>> Handling the above types is really a trivial modification to most JSON
>>>>>>> parsers.  Each of the following modifications is really only a very slight
>>>>>>> change to behavior; although implementing typed-objects does initially
>>>>>>> involve changing error handling into identifier-fallback handling.
>>>>>>>
>>>>>>> I initially argued for defining an object prototype,
>>>>>>> 'card(name,address,zipcode,created)', which removes the redundant data
>>>>>>> from every following reference (that is good just for data reduction,
>>>>>>> against which 'gzip' was argued).  A JSON representation might be
>>>>>>> `{"name":"bob","address":"123 street","zipcode":"55555","created":1537304820}`,
>>>>>>> where if you have a large number of the same record, the same 'name':,
>>>>>>> 'address':, etc. is repeated in every record.  A typed-object's value
>>>>>>> in JSOX could instead be
>>>>>>> `card{:"bob","123 street","55555",2018-09-18T21:07:00Z}`.  All objects that
>>>>>>> are revived as typed-objects share the same prototype, and before parsing,
>>>>>>> the prototypes to be used may be specified.  The amount of data to process
>>>>>>> is reduced, perhaps to a significant degree.
>>>>>>>
>>>>>>> So <Identifier> '{' is about typed-objects.  This construct is not
>>>>>>> allowed in JSON.  But that then leads to <Identifier> '['  - typed arrays,
>>>>>>> arrays don't really have redundant data potential like objects, but there
>>>>>>> are TypedArrays in ES.  There is no way to define a type of an array, but
>>>>>>> hardcoded types like 'ab', 'u8', 'ref' are used to revive binary data.  The
>>>>>>> bytes of the backing ArrayBuffer are encoded to base64, and included within
>>>>>>> '[' and ']' without quotes; using the brackets as quotes.
>>>>>>>
>>>>>>> One JSOX typed array is the 'ref' type.  A reference to another
>>>>>>> location in the current object can be specified, which allows encoding
>>>>>>> cyclic structures.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> https://github.com/d3x0r/jsox
>>>>>>> https://npmjs.com/package/jsox
>>>>>>>
>>>>>>> (Initial public reaction was not very helpful, but probably that's
>>>>>>> the fault of how it was introduced?)
>>>>>>>
>>>>>>> https://www.reddit.com/r/javascript/comments/9f8wml/jsox_javascript_object_exchange_format_preview/
>>>>>>>
>>>>>>> There was plenty of 'why not [YAML/BSON/protobufs/(I don't think
>>>>>>> anyone said XML)/...]'  and the answer is simply, because none of those
>>>>>>> read JSON, or have as simple an API (among other reasons for which JSON
>>>>>>> is already a better solution than those mentioned).
>>>>>>> _______________________________________________
>>>>>>> es-discuss mailing list
>>>>>>> es-discuss at mozilla.org
>>>>>>> https://mail.mozilla.org/listinfo/es-discuss
>>>>>>>
>>>>>>