Suggestion: Add Standard IO Streams
Isiah Meadows
isiahmeadows at gmail.com
Mon Apr 30 04:23:21 UTC 2018
I'll make one big recommendation: before you try to formulate a
solution yourself, please do check this out:
https://github.com/kriskowal/gtor
You really have to understand what the status quo is (I presume you
do) and why the status quo is the way it is (this is much less common)
before you can realistically come up with a decent primitive.
----
To hit on the other points:
1. Although it might not look like it, the proposed `Observable` [1] is
synchronous in nature. It's not the most obvious or intuitive design,
but there are reasons why it needs not only sync `next`, but also sync
`error` and `complete`. So you can make do with a purely synchronous
stream - users can always defer work via `Promise.resolve().then(() =>
{ ... })`.
2. If you wanted to introduce a backpressure mechanism, you'd also have
to create a means of automatically batching data, or it'd be
worthless on its own (you could just use an array and push/shift from
it instead). Node has this built in to its streams (and it emits
asynchronously), but RxJS expects the consumer to do its own
backpressure management, and does *not* provide a means of pushing
multiple values in a single batch. This is hard to do right, but I
think it could be done as an `Observable` subclass that operates on
batched binary data, sending all previously written bytes after the
next microtask (or on a method call). In this case, a native
implementation would be *far* better than what a user-level
implementation could provide, since it can do a lot to avoid
allocation and complex memory access that a user-level implementation
really can't.
3. Streams typically come in one of two forms: lazy (which async
iterators can already handle) and eager (which are well-suited to
observables). Stream APIs typically conflate these two, but async
iterators and observables separate them. Note that separating them
makes each simpler and more efficient: with async iterators, you
don't need to allocate a buffer except when adapting observables,
and with observables, you don't need to wait for a request before
scheduling a read. (With both conflated, you need both a buffer and
a boolean to know whether to eagerly read or not.)
4. I would consider operating on binary data a *higher* priority
than operating on string data. About the only thing needed here for
string data is a means of translating between strings and binary
encodings (like `ArrayBuffer.from('some string', encoding?)` +
`buf.decode(encoding?, start = 0, end = buf.byteLength)` + streaming
equivalents of both, requiring support for ASCII, UTF-8, UTF-16BE, and
UTF-16LE, but offering hooks for other encodings like Base64). It's
*possible* to do this from userland, but engines wouldn't need to go
through *nearly* as much overhead to convert them, especially for
UTF-8, UTF-16 (JS's native encoding), and ASCII (optimized for in some
engines).
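
To make point 4 a bit more concrete: the closest thing hosts ship today
is `TextEncoder`/`TextDecoder` from the WHATWG Encoding Standard. These
are host APIs (browsers, Node.js), not part of ECMAScript, and they are
not the `ArrayBuffer.from`/`buf.decode` shape suggested above, so treat
this only as a rough sketch of the one-shot and streaming conversions I
have in mind. It assumes a host that exposes both globals.

```javascript
// TextEncoder and TextDecoder are host-provided globals (WHATWG Encoding
// Standard); this sketch assumes the host exposes them.
const encoder = new TextEncoder(); // always encodes to UTF-8
const bytes = encoder.encode("héllo"); // Uint8Array; "é" takes two bytes

// One-shot decode of the whole buffer.
const text = new TextDecoder("utf-8").decode(bytes);

// Streaming decode: split the input in the middle of "é".
// With { stream: true } the decoder holds the incomplete sequence until
// the next chunk arrives instead of emitting a replacement character.
const streaming = new TextDecoder("utf-8");
const first = streaming.decode(bytes.subarray(0, 2), { stream: true });
const rest = streaming.decode(bytes.subarray(2)); // final call flushes

// UTF-16LE can be decoded as well. TextEncoder cannot produce it, so the
// bytes are written out by hand here.
const utf16 = new Uint8Array([0x68, 0x00, 0x69, 0x00]); // "hi" in UTF-16LE
const hi = new TextDecoder("utf-16le").decode(utf16);

console.log(bytes.length, text, JSON.stringify(first), rest, hi);
```

Note the asymmetry: decoding covers several encodings, but encoding is
UTF-8 only, which is part of why a single built-in primitive covering
both directions would be useful.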
[1]: https://github.com/tc39/proposal-observable
[2]: https://docs.google.com/document/d/1Zo-ls61hNviKl-hfLSoR3NeGoIA-Z0cBvQRof5YEzVw/edit
-----
Isiah Meadows
me at isiahmeadows.com
Looking for web consulting? Or a new website?
Send me an email and we can get started.
www.isiahmeadows.com
On Sun, Apr 29, 2018 at 8:22 PM, <chetmichals at gmail.com> wrote:
> Yeah, thinking of it a bit more carefully, we would need to define a
> Standard Stream Interface in order to have the 3 Standard IO Streams. I can
> start putting together some research and a sample interface to present my
> ideas in more depth, but it'll take some time to flesh everything out. But
> we can still start hashing out some ideas in the meantime, and making sure
> concerns are addressed.
>
>
>
> I believe we should define both a synchronous and an asynchronous API from
> day one, where the "normal" reads would be blocking and return the value,
> whether the "normal" writes are blocking would be host environment
> specific, and the asynchronous versions of both would return promises. So
> if there is a read/write, there would also be a readAsync/writeAsync.
> Another option, instead of doubling up all the functions on a single
> Stream Object, is to define both a Stream and an Async Stream Object, each
> with a method to get the other kind. So a normal synchronous Stream could
> have an Async() method to get the Async Stream version, and the Async
> version could have a Sync() method to get the normal Stream version.
>
>
>
> The easy out for how to deal with queueing and back pressure is that it is
> an implementation detail of the host environment. Ideally a user would be
> able to use both the synchronous and asynchronous functions on a Stream,
> even if the Host Environment only actually supports one kind. But I don't
> think it’s our place to solve for a user trying to write 100GB of data
> asynchronously or read 100GB of data synchronously. I believe what we can
> do is define a Standard Exception that can be thrown if the buffers
> overflow, let the host environment define how large its buffers are, and
> maybe give it the option of throwing out data if desired. (Maybe for some
> hypothetical host environment in the future, the Standard Error's buffer
> overflowing and losing data is acceptable behavior and does not need to
> raise an exception.)
>
>
>
> One thing to note: just because the language defines standard Streams does
> not mean we necessarily have to define how a user can create their own
> Streams at this time. The day one version could make Streams something
> only set up by the host environment. Host Environments that don't want to
> support a standard in could have an object that just throws an exception
> if a user tries to read any data, and those that don't want to support a
> standard out/error could have the functions stubbed out.
>
>
>
> Some other questions that come to mind: should these Streams only support
> text data, or should they be usable for binary data too? And should there
> be a single Base Stream class, or should the Input Stream and Output
> Stream be different things? (If they are a single base class, a function
> could be added to allow code to check the Stream Type.)
>
>
>
> I think for the first passes, generators and async iteration don’t need to
> be considered, and can be taken into account once the design starts taking
> shape.
>
>
>
> From: Michael J. Ryan <tracker1 at gmail.com>
> Sent: Sunday, April 29, 2018 3:47 PM
> To: Chet Michals <chetmichals at gmail.com>
> Cc: es-discuss <es-discuss at mozilla.org>
> Subject: Re: Suggestion: Add Standard IO Streams
>
>
>
> Why not create an npm module that represents what you'd like to see as an
> interface wrapping node's implementation and propose your new interface?
>
>
>
> This way you can try building something with it. You'll first need to
> implement how you see a synchronous stream in the first place.
>
>
>
> Streams can be a complex beast though. Do you want a synchronous or
> asynchronous implementation or both? Will they consume like generators,
> async iteration (for await of)? What about queueing and back pressure?
>
>
>
> On Sat, Apr 28, 2018, 22:28 Chet Michals <chetmichals at gmail.com> wrote:
>
> Working between a number of different ECMAScript environments over the
> years, such as Rhino and Nashorn in Java, Node.js, and various web
> browsers, one thing I have noticed is that there is no standard
> input/output/error stream like most other languages support, and each
> ecosystem tends to define its own host objects to deal with these, like
> the console object in most Web Browsers (which is at least defined in a
> WHATWG Living Standard), the process object in Node.js, and the print object
> in Nashorn.
>
>
>
> I feel that, for long term portability, the 3 standard IO streams should
> be added to the spec in some way.
>
>
>
> Is there a reason I am not seeing as to why this wouldn't be desired?
>
> _______________________________________________
> es-discuss mailing list
> es-discuss at mozilla.org
> https://mail.mozilla.org/listinfo/es-discuss