First crack at a Streams proposal

Jake Verbaten raynos2 at gmail.com
Mon Apr 15 17:34:58 PDT 2013


So one thing that is confusing is that `next()` has an API that looks like
the stream is a "pull stream". With a pull stream, you call next() and it
gives you the next value once it has one. This type of stream generally
buffers values internally and only hands them over when you pull them out
by calling next(). Node's streams2 API exposes a similar pull model through
its read() method.
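
For contrast, pull-style consumption with Node's streams2 looks roughly
like this (readable and handle are just placeholder names):

  // Pull style: the stream buffers internally and you drain it when ready.
  readable.on('readable', function () {
    var chunk;
    while ((chunk = readable.read()) !== null) {
      handle(chunk); // nothing is lost while we weren't reading
    }
  });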

The semantics of the stream are actually push, which means the stream just
pushes values out whenever it has them, and it's the consumer's
responsibility to add listeners using `listen()`. The `next()` name is
confusing because it makes it easy to mistake the stream for being pull
based.
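
Push-style consumption under this proposal would look something like the
following (a sketch of how I read the proposed API, not a quote from it):

  // Push style: values are delivered whenever the producer has them.
  // If nothing is subscribed at that moment, the value is simply gone.
  stream.listen(function (value) {
    handle(value);
  });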


On Mon, Apr 15, 2013 at 5:26 PM, Kevin Gadd <kevin.gadd at gmail.com> wrote:

> OK, it sounds like this is sort of a 'lossy' stream concept, then, in that
> if you fail to call next() early enough, you can miss data from the stream?
> I can see how that would be acceptable for certain use cases. To me,
> though, this limits the usefulness of the primitive.
>
> Here's a hypothetical. I'm building an instant messaging webapp - think
> google talk - and I want a 'stream' that represents the status of my online
> buddies. So each time someone goes online/offline, that would come through
> the stream.
>
> It sounds like with this design, were I to use next() to pull data out of
> the stream, any online status changes that have occurred before I called
> next() are lost forever, because at any given time all calls to next()
> return a future for the 'next update'. This means that if the portion of my
> service responsible for the contact list starts later than the actual
> network backend, it can miss status changes and end up with a partial
> picture of the state. Maybe that's ok.
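>
> To make that concrete, the contact-list code would presumably look
> something like this (buddyStatus and renderStatus are hypothetical names):
>
>   // The contact list starts up some time after the network backend.
>   // Any status change pushed before this first call is unobservable:
>   buddyStatus.next().then(function (update) {
>     renderStatus(update); // only sees changes from this point on
>   });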
>
> Similarly, it sounds like, given this model, if I have 3 consumers and they
> all call next() once to get values, they have to be absolutely certain to
> call next() again *as soon as* the Future from the previous next() gets
> data. If they do the 'right thing' that you normally do in Future-oriented
> scheduling, where a future being activated results in a task being
> scheduled to do work, it's possible to miss data, given a flow like this:
>
> 1. The consumers call next() and get futures (we'll call these A)
> 2. Data A is pushed into the stream.
> 3. Futures A are all fulfilled with data A. The handlers responsible for
> responding to it push work items onto the message queue (or event pump, or
> setTimeout, or whatever).
> 4. Data B is pushed into the stream. There are no next listeners at this
> point.
> 5. The work items on the message queue run, respond to Data A, and then
> call next() to get futures (we'll call these futures B)
> 6. Data C is pushed into the stream.
> 7. Futures B are fulfilled with data C.
>
> In this scenario, I don't think data B would ever be observed using
> next(), which feels like a real trap for the unwary. You would normally
> solve this with some sort of buffering, like you see in sockets and many
> threading setups, in order to handle cases where a consumer is not able to
> respond instantaneously to the flow of data.
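>
> In code, the gap looks something like this (assuming the semantics
> described above; processData is a placeholder):
>
>   stream.next().then(function (dataA) {
>     // Schedule the real work for a later turn, as you normally would:
>     setTimeout(function () {
>       processData(dataA);
>       // Data B was pushed before this re-subscription, so this future
>       // only ever sees data C:
>       stream.next().then(processData);
>     }, 0);
>   });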
>
> Is this just an incorrect use of 'next' - are you supposed to
> instantaneously respond to your Future from next() completing by calling
> next() again? In many Future models this is inadvisable because you can end
> up with recursive completion of futures when the stream is full - each call
> to next() returns a future that is already complete, so as soon as you
> attach a callback to the future, that callback runs and you recurse deeper
> into the stack, processing data until you run out of stack space. I can see
> how perhaps the implicit assumption here is that the stack overflow
> scenario is prevented by utilizing the event loop/message pump to fulfill
> futures, but honestly I think any design which depends on that is rather
> questionable.
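>
> The 'respond instantaneously' pattern I mean is the obvious tight loop
> (just a sketch):
>
>   function consume(stream) {
>     stream.next().then(function (value) {
>       process(value);
>       consume(stream); // if next() can resolve synchronously when data is
>                        // already available, this recursion grows the
>                        // stack instead of returning to the event loop
>     });
>   }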
>
> Another question: Given this sort of 'synchronized' update model, what
> happens if two consumers both cause data to be pushed into the stream?
> There are two values to send out, but next() only has room for one value.
> Does the second value get thrown away? Does an exception get raised by the
> second push? I don't know how you can ensure that all the consumers will
> see the second value.
>
> I think I will have to echo others' thoughts here that this really doesn't
> seem like a 'Stream' API. It does not match the semantics of any Stream
> primitive/concept I have ever encountered in an API.
>
>
>
> On Mon, Apr 15, 2013 at 5:14 PM, Tab Atkins Jr. <jackalmage at gmail.com> wrote:
>
>> On Mon, Apr 15, 2013 at 5:06 PM, Kevin Gadd <kevin.gadd at gmail.com> wrote:
>> > If this is really about multiple-consumer streams, the semantics of this
>> > proposed API are incredibly murky to me. What happens if each consumer
>> > calls next()? Do they all get the same value out of their Future when
>> > it's completed? Do they each randomly get one of the values pushed into
>> > the stream? Is the stream implicitly required to buffer data in order to
>> > be able to offer it at a slower rate to one consumer than it is offered
>> > to the other consumers? Does each consumer have a different 'view' of
>> > the state of the stream (i.e. its cancelled/ended state, when those
>> > concepts apply)?
>>
>> Huh, I'm not sure what's unclear about it.  (Though, obviously it must
>> be.)
>>
>> The future returned by Stream#next() resolves at the next update (or
>> when the stream completes/rejects).  If multiple consumers call next()
>> repeatedly in the same tick (or in different ticks, but before an
>> update gets pushed), all of the futures resolve at the same time,
>> because they're all listening for the same "next update".
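>>
>> That is, with two consumers (sketching the intended behavior):
>>
>>   // Both futures are waiting on the same "next update":
>>   var a = stream.next();
>>   var b = stream.next();
>>   // When the producer pushes a value, a and b resolve together,
>>   // each with that same value.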
>>
>> I'm not sure I understand how it could be required to buffer.  Can you
>> describe the kind of situation you think would cause that need?
>>
>> Same with view - perhaps you're thinking that the state can be updated
>> synchronously, in between listen callbacks?
>>
>> ~TJ
>>
>
>
>
> --
> -kg
>