Feedback on Binary Data updates

Luke Hoban lukeh at microsoft.com
Wed Jul 20 08:17:09 PDT 2011


I saw that there have been some updates to the Binary Data proposal on the wiki.  This is great, I think Binary Data is one of the really important enabling capabilities being added to ES.next, so I'm excited to see progress here.

Below is some feedback on the current draft proposal.  The overarching theme of the feedback is the alignment with Typed Arrays.  I would still really like to see Binary Data in ES.next subsume (through being a sufficiently compatible superset) Typed Arrays as currently defined and implemented in several browsers and Web APIs, to provide a single consistent and complete binary array model for the web platform.  This seems to be within reach.  I see that dherman is now an editor on the Typed Arrays spec as well, so I'm hoping that I'll hear that progress is already being made on this :-).  I'd also be happy to help with this.


NOTES:

* Array types of fixed length
The current design fixes the length of an ArrayType instance as part of the ArrayType definition, instead of as a parameter to the resulting constructor.  I'm not sure I understand the motivation for that.  It seems more flexible, simpler for users in the common case, and more interoperable with Typed Arrays to make the length a property of each array instance, passed to the array constructor on allocation.

That is, I assumed the intended usage was:

var MyStruct = new StructType({x : uint8, y: uint8});
var MyArray = new ArrayType(MyStruct);
var myArray1 = new MyArray (16);
var myArray2 = new MyArray (32);

But in the current design it seems I have to create another constructor object to accomplish this:

var MyStruct = new StructType({x : uint8, y: uint8});
var MyArray1 = new ArrayType(MyStruct, 16);
var MyArray2 = new ArrayType(MyStruct, 32);
var myArray1 = new MyArray1();
var myArray2 = new MyArray2();

This doesn't feel like the right split between the meta levels, and makes the code more complex.  Notably, I would have also expected that the "Uint8Array" constructor object defined in Typed Arrays could be defined by Binary Data directly, as:

var Uint8Array = new ArrayType(uint8);

This is less natural with the current model for array length.
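
For comparison, here is a minimal sketch of how Typed Arrays, as implemented in browsers, take the length per instance.  It uses only the existing Uint8Array constructor and not the Binary Data API; the variable names are illustrative.

```javascript
// Existing Typed Arrays: one constructor, length chosen per instance.
const a16 = new Uint8Array(16);
const a32 = new Uint8Array(32);

console.log(a16.length, a32.length); // 16 32

// Both instances share one constructor, so no second type object is needed.
console.log(a16.constructor === a32.constructor); // true
```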

* Compatibility with Typed Arrays array objects
There are a few divergences between Binary Data arrays and Typed Arrays arrays that look like they could be addressed:
 - The constructor difference mentioned above, including support for copy constructors.
 - Lack of buffer, byteLength, byteOffset, BYTES_PER_ELEMENT.   I see these are noted in TODO.
 - array.set(otherArr, offset) support on the Binary Data arrays
 - Conversions, see below
 - Different prototype chains, additional members like elementType on binary data arrays.  

The last item is one of the reasons why it would be nice to pull the Typed Arrays objects into Binary Data, so that they could be augmented to be fully consistent - for example, to expose the elementType.
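
For reference, the Typed Arrays members listed above behave as follows in current implementations.  This is only a sketch of the existing Typed Arrays surface that Binary Data arrays would need to match, not of the Binary Data proposal itself; the variable names are illustrative.

```javascript
// A 16-byte buffer, viewed from byte 4 onward as three Uint16 elements.
const buffer = new ArrayBuffer(16);
const view = new Uint16Array(buffer, 4, 3);

console.log(view.buffer === buffer);        // true
console.log(view.byteOffset);               // 4
console.log(view.byteLength);               // 6 (3 elements * 2 bytes)
console.log(Uint16Array.BYTES_PER_ELEMENT); // 2

// set(otherArr, offset): copy elements into the view starting at index 1.
view.set([7, 8], 1);

// Copy constructor: a new array with its own buffer and the same contents.
const copy = new Uint16Array(view);
copy[0] = 99;

console.log(Array.from(view)); // elements 0, 7, 8 (unchanged by the write to copy)
console.log(Array.from(copy)); // elements 99, 7, 8
```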


* Conversions
The rules for conversions of argument values into the primitive value types seem to be different than typical ES conversions and those used by Typed Arrays via WebIDL.  Why not use ToInt32 and friends for conversion?  The current rules appear to be both quite strict, throwing on most type mismatches, and more permissive in some unexpected cases, such as "0x"-prefixed strings.
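
As a point of comparison, this sketch shows the ToInt32-style conversions that existing Typed Arrays apply on element assignment: values are coerced and wrapped rather than rejected.  It illustrates current Typed Arrays behavior only, and the variable names are illustrative.

```javascript
const i32 = new Int32Array(1);
const u8 = new Uint8Array(1);

i32[0] = "12";           // strings go through ToNumber first
console.log(i32[0]);     // 12

i32[0] = 4294967301;     // 2^32 + 5 wraps modulo 2^32
console.log(i32[0]);     // 5

i32[0] = "not a number"; // NaN converts to 0 instead of throwing
console.log(i32[0]);     // 0

u8[0] = -1;              // wraps modulo 2^8
console.log(u8[0]);      // 255

u8[0] = 3.9;             // fractional part is truncated
console.log(u8[0]);      // 3
```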


* DataView integration with structs
DataView is an important piece of Typed Arrays for reading from heterogeneous binary data sources like files and network protocols, and for controlling endianness of data reads.  DataView would seem to benefit from structs, and structs would benefit from DataView.  This is another reason to want to spec DataView itself in ES.next.  I imagine an additional pair of functions on DataView akin to the following would allow nice interop between DataView and Binary Data "Types"/"Data":

    Data getData(Type type, unsigned long byteOffset, optional boolean littleEndian);
    void setData(Type type, unsigned long byteOffset, Data value, optional boolean littleEndian);
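
To make the idea concrete, here is a rough sketch of these two functions written as free-standing helpers over today's DataView.  Everything beyond DataView itself is an assumption for illustration: the `Point` field list is a hypothetical stand-in for a Binary Data struct Type, and the helpers take the view as their first argument instead of being DataView methods.

```javascript
// Hypothetical stand-in for a struct "Type": ordered fields, each with a
// DataView accessor suffix and a byte size. This is not the Binary Data API.
const Point = [
  { name: "x", kind: "Uint16", size: 2 },
  { name: "y", kind: "Uint16", size: 2 },
];

// Read one struct from `view`, starting at `byteOffset`.
function getData(view, type, byteOffset, littleEndian) {
  const data = {};
  let offset = byteOffset;
  for (const field of type) {
    data[field.name] = view["get" + field.kind](offset, littleEndian);
    offset += field.size;
  }
  return data;
}

// Write one struct into `view`, starting at `byteOffset`.
function setData(view, type, byteOffset, value, littleEndian) {
  let offset = byteOffset;
  for (const field of type) {
    view["set" + field.kind](offset, value[field.name], littleEndian);
    offset += field.size;
  }
}

const dv = new DataView(new ArrayBuffer(8));
setData(dv, Point, 2, { x: 0x0102, y: 0x0304 }, true);

const p = getData(dv, Point, 2, true);
console.log(p.x, p.y); // 258 772

// Reading the same bytes as big-endian swaps the bytes within each field.
console.log(getData(dv, Point, 2, false).x); // 513
```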


* Explicit inclusion of Uint32Array and similar objects
The Uint32Array and similar objects defined in Typed Arrays are likely to be the most commonly used in most use cases, but these are missing from the ES.next proposal.  Including them in the ES.next proposal explicitly, as supersets of the Typed Arrays objects, would avoid users having to manually create them, and help ensure full API consistency.


* A lot of meta- objects
The spec defines 14 objects, without yet defining any of the 10 typed arrays objects.  Several of the objects only serve as scaffolding for the meta-hierarchy, and don't appear to be objects which users are expected to frequently (or ever) work with.  Are the named "Type" and "Data" objects needed in the proposal? 


* Naming
The term "Type" feels somewhat too generic for referring to struct shapes.  The previous "block" terminology actually sounded more natural, or at least more scoped.


Luke

