First crack at a Streams proposal

Andrea Giammarchi andrea.giammarchi at gmail.com
Mon Apr 15 17:37:22 PDT 2013


how about .pull() then, to read, and .push() to write :D


On Mon, Apr 15, 2013 at 5:34 PM, Jake Verbaten <raynos2 at gmail.com> wrote:

> So one thing that is confusing is that `next()` has an API that looks like
> the stream is a "pull stream". A "pull stream" means that you call next()
> and it will give you the next value when it has it. This type of stream
> generally buffers values internally and only gives you them when you pull
> them out by calling next(). Node's streams2 API has a similar thing with a
> read() API.
>
> The semantics of the stream are actually push, which means the stream will
> just push values out whenever it has them and it's the consumers
> responsibility to add listeners using `listen()`. The usage of `next()` is
> confusing as it's easy to mistake the stream for being pull based.
>
>
> On Mon, Apr 15, 2013 at 5:26 PM, Kevin Gadd <kevin.gadd at gmail.com> wrote:
>
>> OK, it sounds like this is sort of a 'lossy' stream concept, then, in
>> that if you fail to call next() early enough, you can miss data from the
>> stream? I can see how that would be acceptable for certain use cases. To me
>> though, this limits use of the primitive.
>>
>> Here's a hypothetical. I'm building an instant messaging webapp - think
>> google talk - and I want a 'stream' that represents the status of my online
>> buddies. So each time someone goes online/offline, that would come through
>> the stream.
>>
>> It sounds like with this design, were I to use next() to pull data out of
>> the stream, any online status changes that have occurred before I called
>> next() are lost forever, because at any given time all calls to next()
>> return a future for the 'next update'. This means that if the portion of my
>> service responsible from the contact list starts later than the actual
>> network backend, it can miss status changes and end up with a partial
>> picture of the state. Maybe that's ok.
>>
>> Similarly, it sounds like given this model if I have 3 consumers, and
>> they all call next() once to get values, they have to be absolutely certain
>> to call next() again *as soon as* the Future from the previous next() gets
>> data. If they do the 'right thing' that you normally do in Future-oriented
>> scheduling, where a future being activated results in a task being
>> scheduled to do work, it's possible to miss data, given a flow like this:
>>
>> 1. The consumers call next() and get futures (we'll call these A)
>> 2. Data A is pushed into the stream.
>> 3. Futures A are all fulfilled with data A. The handlers responsible for
>> responding to it push work items onto the message queue (or event pump, or
>> setTimeout, or whatever).
>> 4. Data B is pushed into the stream. There are no next listeners at this
>> point.
>> 5. The work items on the message queue run, respond to Data A, and then
>> call next() to get futures (we'll call these futures B)
>> 6. Data C is pushed into the stream.
>> 7. Futures B are fulfilled with data C.
>>
>> In this scenario, I don't think data B would ever be observed using
>> next(), which feels like a real trap for the unwary. You would normally
>> solve this with some sort of buffering, like you see in sockets and many
>> threading setups, in order to handle cases where a consumer is not able to
>> respond instantaneously to the flow of data.
>>
>> Is this just an incorrect use of 'next' - are you supposed to
>> instantaneously respond to your Future from next() completing by calling
>> next() again? In many Future models this is inadvisable because you can end
>> up with recursive completion of futures when the stream is full - each call
>> to next() returns a future that is already complete, so as soon as you
>> attach a callback to the future, that callback completes and you climb down
>> the stack, recursively processing data until you run out of stack space. I
>> can see how perhaps the implicit assumption here is that the stack overflow
>> scenario is prevented by utilizing the event loop/message pump to fulfill
>> futures, but honestly I think any design which depends on that is a rather
>> questionable design.
>>
>> Another question: Given this sort of 'synchronized' update model, what
>> happens if two consumers both cause data to be pushed into the stream?
>> There are two values to send out, but next() only has room for one value.
>> Does the second value get thrown away? Does an exception get raised by the
>> second push? I don't know how you can ensure that all the consumers will
>> see the second value.
>>
>> I think I will have to echo others' thoughts here that this really
>> doesn't seem like a 'Stream' API. It does not match the semantics of any
>> Stream primitive/concept I have ever encountered in an API.
>>
>>
>>
>> On Mon, Apr 15, 2013 at 5:14 PM, Tab Atkins Jr. <jackalmage at gmail.com>wrote:
>>
>>> On Mon, Apr 15, 2013 at 5:06 PM, Kevin Gadd <kevin.gadd at gmail.com>
>>> wrote:
>>> > If this is really about multiple-consumer streams, the semantics of
>>> this
>>> > proposed API are incredibly murky to me. What happens if each consumer
>>> calls
>>> > next()? Do they all get the same value out of their Future when it's
>>> > completed? Do they each randomly get one of the values pushed into the
>>> > stream? Is the stream implicitly required to buffer data in order to
>>> be able
>>> > to offer it at a slower rate to one consumer than it is offered to the
>>> other
>>> > consumers? Does each consumer have a different 'view' of the state of
>>> the
>>> > stream (i.e. its cancelled/ended state, when those concepts apply)?
>>>
>>> Huh, I'm not sure what's unclear about it.  (Though, obviously it must
>>> be.)
>>>
>>> The future returned by Stream#next() resolves at the next update (or
>>> when the stream completes/rejects).  If multiple consumers call next()
>>> repeatedly in the same tick (or in different ticks, but before an
>>> update gets pushed), all of the futures resolve at the same time,
>>> because they're all listening for the same "next update".
>>>
>>> I'm not sure I understand how it could be required to buffer.  Can you
>>> describe the kind of situation you think would cause that need?
>>>
>>> Same with view - perhaps you're thinking that the state can be updated
>>> syncly, inbetween listen callbacks?
>>>
>>> ~TJ
>>>
>>
>>
>>
>> --
>> -kg
>>
>
>
> _______________________________________________
> es-discuss mailing list
> es-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es-discuss
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20130415/d85ec657/attachment.html>


More information about the es-discuss mailing list