Binary Data - possible topic for joint session

Maciej Stachowiak mjs at apple.com
Sat Nov 7 19:17:48 PST 2009


On Nov 7, 2009, at 6:53 PM, Ash Berlin wrote:

>
> On 8 Nov 2009, at 02:21, Maciej Stachowiak wrote:
>
>>
>> On Nov 7, 2009, at 5:39 AM, Ash Berlin wrote:
>>
>>> On 6 Nov 2009, at 19:24, Brendan Eich wrote:
>>>
>>>> On Nov 6, 2009, at 10:44 AM, Dean Landolt wrote:
>>>>
>>>>> http://wiki.commonjs.org/wiki/Binary
>>>>
>>>> [snip]
>>>
>>> [snip]
>>> As a community (CommonJS) we'd be more than happy to go forward  
>>> with a binary spec that came from (or at least has the blessing  
>>> of) the ES groups
>>
>> Binary/B is the closest of the three proposals to mine, in that it  
>> has both mutable and immutable binary data containers. Here are a  
>> few key differences:
>>
>> (1) Binary/B does not have a cheap way to convert from the  
>> immutable representation (ByteString) to the mutable representation  
>> (ByteArray)
>> (2) In Binary/B, Array-like index access to ByteString gives back  
>> one-byte ByteStrings instead of bytes, likely an over-literal  
>> copying of String
>> (3) There are some seemingly needless differences in the interfaces  
>> to ByteString and ByteArray that follow from modeling on String and  
>> Array
>> (4) Binary/B has many more operations available in the base  
>> proposal (including charset transcoding and a generous selection of  
>> String and Array methods)
>> (5) Different names - Data/DataBuilder vs. ByteString/ByteArray
>
>
>> On (1): cheap conversion from mutable to immutable  
>> (DataBuilder.prototype.release() in my proposal) lets binary data  
>> objects be built up with a convenient mutation-based idiom, but  
>> then passed around as immutable objects thereafter.
>
> Mutable to immutable or immutable to mutable? Assuming the former,  
> how do you handle the differences in API/behaviour? Does each  
> function check whether it is now immutable?

Mutable to immutable. Immutable to mutable has to copy (or at least  
use copy-on-write).

My proposal does it like this (where DataBuilder is the mutable  
variant and Data is the immutable):

DataBuilder.prototype.release()

     Return a new Data with the same length and the same byte values  
as the DataBuilder passed as the this value. At the same time, the  
DataBuilder is reset to length 0.

Because the DataBuilder is reset to empty, the implementation can  
"steal" its underlying buffer for the new Data object, thus converting  
to immutable without a full copy. This matches the common pattern of  
assembling a new piece of binary data with mutation, then handing it  
out to possibly multiple other pieces of code as immutable.
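
The buffer-stealing idea above can be sketched roughly as follows. This is
only an illustration under stated assumptions, not the proposal's actual
implementation: the names Data, DataBuilder and release() come from the
proposal, while append(), byteAt(), the length getters and the use of
Uint8Array as backing store are hypothetical choices made for the example.

```javascript
// Sketch only. Apart from Data, DataBuilder and release(), every name here
// (append, byteAt, the Uint8Array backing store) is an assumption.
class Data {
  #bytes;
  constructor(bytes) { this.#bytes = bytes; }  // takes ownership, no copy
  get length() { return this.#bytes.length; }
  byteAt(i) { return this.#bytes[i]; }         // numeric byte value
}

class DataBuilder {
  #buffer = new Uint8Array(16);
  #length = 0;

  get length() { return this.#length; }

  // Hypothetical mutation method: append one byte, growing when full.
  append(byte) {
    if (this.#length === this.#buffer.length) {
      const grown = new Uint8Array(this.#buffer.length * 2);
      grown.set(this.#buffer);
      this.#buffer = grown;
    }
    this.#buffer[this.#length++] = byte;
  }

  // Hand the filled part of the buffer to a new Data and reset to length 0.
  // subarray() creates a view on the same memory, so no bytes are copied.
  release() {
    const stolen = this.#buffer.subarray(0, this.#length);
    this.#buffer = new Uint8Array(16);  // builder starts over with fresh storage
    this.#length = 0;
    return new Data(stolen);
  }
}

const builder = new DataBuilder();
for (const b of [0xde, 0xad, 0xbe, 0xef]) builder.append(b);
const data = builder.release();
builder.append(0x01);            // touches only the builder's new buffer
console.log(data.length);        // 4
console.log(builder.length);     // 1
```

Because the builder drops its reference to the old buffer inside release(),
nothing can mutate the bytes that the returned Data now owns, which is what
makes the conversion safe without a copy.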


>
>> On (2): I don't think a one-byte ByteString is ever useful,  
>> indexing to get the byte value would be much more helpful.
>
> Couldn't agree more with you here - if for whatever reason you do  
> want a one-byte ByteString, there is always substr/substring. This  
> is something that came up recently in IRC and prompted me to start  
> looking at making changes to the proposal - I was going to do that  
> next week, so this coming up now is very good timing.
>
>> On (3), I think it's good for the mutable interface to be a strict  
>> superset of the immutable interface.
>
> Seems like a reasonable thing to do.

I'm glad we agree on these two points.
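
A rough sketch of what those two points could look like together, purely
for illustration: the class and method names below (ImmutableBytes,
MutableBytes, byteAt, setByteAt, slice) are hypothetical and come from
neither proposal, and subclassing is just one possible way to make the
mutable interface a superset of the immutable one.

```javascript
// Hypothetical names throughout; neither proposal defines these classes.
class ImmutableBytes {
  constructor(bytes) { this._bytes = Uint8Array.from(bytes); }  // private copy
  get length() { return this._bytes.length; }

  // Index access yields the numeric byte value, not a one-byte container.
  byteAt(i) { return this._bytes[i]; }

  // A one-byte container is still available through an explicit range call.
  slice(start, end) {
    return new ImmutableBytes(this._bytes.subarray(start, end));
  }
}

// The mutable variant offers everything above plus mutation.
class MutableBytes extends ImmutableBytes {
  setByteAt(i, value) { this._bytes[i] = value; }
}

const m = new MutableBytes([1, 2, 3]);
m.setByteAt(0, 255);
console.log(m.byteAt(0));           // 255
console.log(m.slice(0, 1).length);  // 1
```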

>
>>
>> (4) and (5) are both points where perhaps neither proposal is at the  
>> optimum yet. On (4), I suspect the sweet spot is somewhere between  
>> my spartan set of built-in operations and the very generous set in  
>> Binary/B.
>
> Agreed - this was the other thing I noticed - e.g. sorting a  
> ByteArray isn't really an operation that makes a whole lot of sense  
> to my mind.

Yep. I'm not even sure things like map(), filter() or reduce() are  
likely to work well. My own preference is to start the API very small,  
and add incrementally based on demonstrated need and clearly  
articulated use cases.

>
>
>> On (5), I'm not sure either set of names is the best possible, and  
>> I'm certainly not stuck on my own proposed names.
>
> I'm not really bothered either way on this front, although 'Data' is  
> much more likely to clash with existing code.

Yes, Brendan made this point and presented some good evidence in that  
direction. I think 'Data' doesn't work but 'Binary' or 'BinData' might.

>
> Something worth bearing in mind is that Binary/B is implemented in 2  
> or 3 CommonJS platforms already, but I don't think anyone is  
> particularly attached to the behaviour so long as what comes out  
> isn't wildly different.

What kind of differences do you think they would tolerate? Renaming  
the classes? Dropping/changing some methods?

Regards,
Maciej

