First crack at a Streams proposal

Tab Atkins Jr. jackalmage at gmail.com
Mon Apr 15 19:19:32 PDT 2013


On Mon, Apr 15, 2013 at 5:26 PM, Kevin Gadd <kevin.gadd at gmail.com> wrote:
> OK, it sounds like this is sort of a 'lossy' stream concept, then, in that
> if you fail to call next() early enough, you can miss data from the stream?
> I can see how that would be acceptable for certain use cases. To me though,
> this limits use of the primitive.
>
> Here's a hypothetical. I'm building an instant messaging webapp - think
> google talk - and I want a 'stream' that represents the status of my online
> buddies. So each time someone goes online/offline, that would come through
> the stream.
>
> It sounds like with this design, were I to use next() to pull data out of
> the stream, any online status changes that have occurred before I called
> next() are lost forever, because at any given time all calls to next()
> return a future for the 'next update'. This means that if the portion of my
> service responsible for the contact list starts later than the actual
> network backend, it can miss status changes and end up with a partial
> picture of the state. Maybe that's ok.

Yes, my current model doesn't buffer changes before the first
listener.  This is the correct behavior for most DOM cases, but I can
see how it's a poor fit for a lot of other cases.  By default, you
might want that buffer, so that at least the first consumer gets to
see all the history.  (If you want multiple consumers to all get the
history, you should opt into that explicitly via some slightly less
convenient API, as it means unbounded buffering.)
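
To make the "buffer until the first listener" idea concrete, here is a
rough sketch.  Everything in it is an assumption for illustration: the
class name `BufferingStream`, the `push` method, and the synchronous
replay are hypothetical and not part of the proposal.

```typescript
// Hypothetical sketch: buffers updates only until the first listener attaches.
class BufferingStream<T> {
  private listeners: Array<(value: T) => void> = [];
  private backlog: T[] = [];

  // Producer side: hold the value if nobody is listening yet,
  // otherwise deliver it to every listener.
  push(value: T): void {
    if (this.listeners.length === 0) {
      this.backlog.push(value);
      return;
    }
    for (const cb of this.listeners) cb(value);
  }

  // Consumer side: the first listener is handed the backlog, which is then
  // dropped, so later listeners only see updates pushed after they attach.
  listen(cb: (value: T) => void): void {
    this.listeners.push(cb);
    const history = this.backlog;
    this.backlog = [];
    for (const value of history) cb(value);
  }
}

const buddyStatus = new BufferingStream<string>();
buddyStatus.push("alice online"); // network backend starts first

const contactList: string[] = [];
buddyStatus.listen((s) => contactList.push(s)); // gets the backlog
buddyStatus.push("bob offline");

const lateWidget: string[] = [];
buddyStatus.listen((s) => lateWidget.push(s)); // no history for this one
buddyStatus.push("carol online");
```

In this sketch the contact list ends up with all three status changes,
while the late listener only sees the last one.  Giving every listener
the full history would mean never discarding `backlog`, which is where
the unbounded buffering comes from.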

> Similarly, it sounds like given this model if I have 3 consumers, and they
> all call next() once to get values, they have to be absolutely certain to
> call next() again *as soon as* the Future from the previous next() gets
> data. If they do the 'right thing' that you normally do in Future-oriented
> scheduling, where a future being activated results in a task being scheduled
> to do work, it's possible to miss data, given a flow like this:

`next` is for when you just care about the very next time something
happens.  If you care about all the events that come out, use `listen`
instead.
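
A rough sketch of the difference, under stated assumptions: the class
name `UpdateStream` and the `push` method are hypothetical, and a
standard Promise stands in for the Future discussed in this thread.
Only the `next`/`listen` split comes from the discussion itself.

```typescript
// Hypothetical sketch of a push-based stream with no buffering.
type Listener<T> = (value: T) => void;

class UpdateStream<T> {
  private listeners: Listener<T>[] = [];
  private waiters: Listener<T>[] = [];

  // Called for every update pushed after registration.
  listen(cb: Listener<T>): void {
    this.listeners.push(cb);
  }

  // Resolves with the very next update only; later updates are not queued.
  next(): Promise<T> {
    return new Promise<T>((resolve) => {
      this.waiters.push(resolve);
    });
  }

  // Producer side (assumed): settle everyone waiting on next(),
  // then notify all listeners.
  push(value: T): void {
    const waiters = this.waiters;
    this.waiters = [];
    for (const resolve of waiters) resolve(value);
    for (const cb of this.listeners) cb(value);
  }
}

const statuses = new UpdateStream<string>();

const seenByListen: string[] = [];
statuses.listen((s) => seenByListen.push(s));

const first = statuses.next();
statuses.push("alice online");
statuses.push("bob offline");
// seenByListen now holds both updates; `first` resolves to "alice online",
// and nothing delivers "bob offline" to the code that called next().
```

A consumer that re-calls `next` from a scheduled task can miss whatever
is pushed in between, which is why `listen` is the better fit when
every update matters.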

It looks like many existing APIs use `next` for more pull-based
purposes, where the stream will buffer data until someone pulls it out
with `next`.  My current API is explicitly *not* pull-based, but it's
easy to imagine a subclass that is.
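
For contrast, a pull-based variant might look roughly like this.  It is
a sketch only: `PullStream` and `push` are hypothetical names, a
Promise again stands in for a Future, and it is written as a
standalone class rather than a subclass to keep the example
self-contained.

```typescript
// Hypothetical sketch of a pull-based stream: values wait in a queue
// until a consumer asks for them with next().
class PullStream<T> {
  private buffered: T[] = [];
  private pending: Array<(value: T) => void> = [];

  // Producer side: hand the value to the oldest waiting consumer,
  // or queue it if nobody is waiting.
  push(value: T): void {
    const resolve = this.pending.shift();
    if (resolve) {
      resolve(value);
    } else {
      this.buffered.push(value);
    }
  }

  // Consumer side: take the oldest queued value if there is one,
  // otherwise wait for the next push.
  next(): Promise<T> {
    if (this.buffered.length > 0) {
      return Promise.resolve(this.buffered.shift() as T);
    }
    return new Promise<T>((resolve) => {
      this.pending.push(resolve);
    });
  }
}

const queue = new PullStream<string>();
queue.push("alice online");
queue.push("bob offline");

// Both values were pushed before anyone asked, yet neither is lost.
const pulledA = queue.next();
const pulledB = queue.next();

// This request arrives before its value does and is settled by the push.
const pulledC = queue.next();
queue.push("carol online");
```

Note that each value here goes to exactly one caller of `next`, so
this shape suits a single consumer draining a queue rather than
several consumers that all need to see every update.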

> Another question: Given this sort of 'synchronized' update model, what
> happens if two consumers both cause data to be pushed into the stream? There
> are two values to send out, but next() only has room for one value. Does the
> second value get thrown away? Does an exception get raised by the second
> push? I don't know how you can ensure that all the consumers will see the
> second value.

Same thing - consumers should be using `listen` if they want all the
data.  That's the "default" way to interact with a stream in my
proposal.

> I think I will have to echo others' thoughts here that this really doesn't
> seem like a 'Stream' API. It does not match the semantics of any Stream
> primitive/concept I have ever encountered in an API.

Yes, the name seems to be confusing people.  On the other hand, I find
a lot of the other existing names hopelessly opaque or awkward -
Signal doesn't mean much to me, and Observable is too abstract.
Stream is just the right kind of name to convey "a stream of updates",
but unfortunately the name is co-opted by binary IO streams. :/

~TJ

