First crack at a Streams proposal

Kevin Gadd kevin.gadd at gmail.com
Mon Apr 15 17:26:35 PDT 2013


OK, it sounds like this is sort of a 'lossy' stream concept, then, in that
if you fail to call next() early enough, you can miss data from the stream?
I can see how that would be acceptable for certain use cases. To me though,
this limits use of the primitive.

Here's a hypothetical. I'm building an instant messaging webapp - think
google talk - and I want a 'stream' that represents the status of my online
buddies. So each time someone goes online/offline, that would come through
the stream.

It sounds like with this design, were I to use next() to pull data out of
the stream, any online status changes that have occurred before I called
next() are lost forever, because at any given time all calls to next()
return a future for the 'next update'. This means that if the portion of my
service responsible from the contact list starts later than the actual
network backend, it can miss status changes and end up with a partial
picture of the state. Maybe that's ok.

Similarly, it sounds like given this model if I have 3 consumers, and they
all call next() once to get values, they have to be absolutely certain to
call next() again *as soon as* the Future from the previous next() gets
data. If they do the 'right thing' that you normally do in Future-oriented
scheduling, where a future being activated results in a task being
scheduled to do work, it's possible to miss data, given a flow like this:

1. The consumers call next() and get futures (we'll call these A)
2. Data A is pushed into the stream.
3. Futures A are all fulfilled with data A. The handlers responsible for
responding to it push work items onto the message queue (or event pump, or
setTimeout, or whatever).
4. Data B is pushed into the stream. There are no next listeners at this
point.
5. The work items on the message queue run, respond to Data A, and then
call next() to get futures (we'll call these futures B)
6. Data C is pushed into the stream.
7. Futures B are fulfilled with data C.

In this scenario, I don't think data B would ever be observed using next(),
which feels like a real trap for the unwary. You would normally solve this
with some sort of buffering, like you see in sockets and many threading
setups, in order to handle cases where a consumer is not able to respond
instantaneously to the flow of data.

Is this just an incorrect use of 'next' - are you supposed to
instantaneously respond to your Future from next() completing by calling
next() again? In many Future models this is inadvisable because you can end
up with recursive completion of futures when the stream is full - each call
to next() returns a future that is already complete, so as soon as you
attach a callback to the future, that callback completes and you climb down
the stack, recursively processing data until you run out of stack space. I
can see how perhaps the implicit assumption here is that the stack overflow
scenario is prevented by utilizing the event loop/message pump to fulfill
futures, but honestly I think any design which depends on that is a rather
questionable design.

Another question: Given this sort of 'synchronized' update model, what
happens if two consumers both cause data to be pushed into the stream?
There are two values to send out, but next() only has room for one value.
Does the second value get thrown away? Does an exception get raised by the
second push? I don't know how you can ensure that all the consumers will
see the second value.

I think I will have to echo others' thoughts here that this really doesn't
seem like a 'Stream' API. It does not match the semantics of any Stream
primitive/concept I have ever encountered in an API.



On Mon, Apr 15, 2013 at 5:14 PM, Tab Atkins Jr. <jackalmage at gmail.com>wrote:

> On Mon, Apr 15, 2013 at 5:06 PM, Kevin Gadd <kevin.gadd at gmail.com> wrote:
> > If this is really about multiple-consumer streams, the semantics of this
> > proposed API are incredibly murky to me. What happens if each consumer
> calls
> > next()? Do they all get the same value out of their Future when it's
> > completed? Do they each randomly get one of the values pushed into the
> > stream? Is the stream implicitly required to buffer data in order to be
> able
> > to offer it at a slower rate to one consumer than it is offered to the
> other
> > consumers? Does each consumer have a different 'view' of the state of the
> > stream (i.e. its cancelled/ended state, when those concepts apply)?
>
> Huh, I'm not sure what's unclear about it.  (Though, obviously it must be.)
>
> The future returned by Stream#next() resolves at the next update (or
> when the stream completes/rejects).  If multiple consumers call next()
> repeatedly in the same tick (or in different ticks, but before an
> update gets pushed), all of the futures resolve at the same time,
> because they're all listening for the same "next update".
>
> I'm not sure I understand how it could be required to buffer.  Can you
> describe the kind of situation you think would cause that need?
>
> Same with view - perhaps you're thinking that the state can be updated
> syncly, inbetween listen callbacks?
>
> ~TJ
>



-- 
-kg
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/es-discuss/attachments/20130415/847bf18f/attachment.html>


More information about the es-discuss mailing list