Events vs Streams vs Observables vs Async Iterators
Currently, the only stable way to process a series of async results in JavaScript is using the event system. However, three alternatives are being developed:

Streams: https://streams.spec.whatwg.org
Observables: https://tc39.github.io/proposal-observable
Async Iterators: https://tc39.github.io/proposal-async-iteration

What are the differences and benefits of each over events and the others?

Do any of these intend to replace events?

Bedridden answered 11/9, 2016 at 19:6 Comment(2)
Btw, take a closer look at this article: A General Theory of Reactivity. — Goldeneye
One can hardly imagine a better example of a fascinating, useful question, which nevertheless according to SO's ridiculous, tight-sphinctered rules should be closed as "too broad" or "matter of opinion". — Undirected
There are roughly two categories of APIs here: pull and push.

Pull

Async pull APIs are a good fit for cases where data is pulled from a source. This source might be a file, or a network socket, or a directory listing, or anything else. The key is that work is done to pull or generate data from the source when asked.

Async iterators are the base primitive here, meant to be a generic manifestation of the concept of a pull-based async source. In such a source, you:

  • Pull from an async iterator by doing const promise = ai.next()
  • Wait for the result using const result = await promise (or using .then())
  • Inspect the result to find out if it's an exception (thrown), an intermediate value ({ value, done: false }), or a done signal ({ value: undefined, done: true }).
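The pull loop above can be sketched with a plain async generator as a stand-in source (countTo and drain are made-up names for illustration; any object implementing the async iterator protocol behaves the same way):

```javascript
// Hypothetical source: an async generator, which implements the
// async iterator protocol out of the box.
async function* countTo(limit) {
  for (let i = 1; i <= limit; i++) {
    yield i; // work happens only when a consumer pulls
  }
}

// Consume it with the three steps described above.
async function drain() {
  const ai = countTo(3)[Symbol.asyncIterator]();
  const results = [];
  while (true) {
    const promise = ai.next();    // 1. pull
    const result = await promise; // 2. wait for the result
    if (result.done) break;       // 3. inspect: done signal...
    results.push(result.value);   //    ...or intermediate value
  }
  return results;
}
```

In practice you would write `for await (const value of countTo(3))`, which performs exactly these steps for you.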

This is similar to how sync iterators are a generic manifestation of the concept of a pull-based sync value source. The steps for a sync iterator are exactly the same as the above, omitting the "wait for the result" step.

Readable streams are a special case of async iterators, meant to specifically encapsulate I/O sources like sockets/files/etc. They have specialized APIs for piping them to writable streams (representing the other half of the I/O ecosystem, sinks) and handling the resulting backpressure. They also can be specialized to handle bytes in an efficient "bring your own buffer" manner. This is all somewhat reminiscent of how arrays are a special case of sync iterators, optimized for O(1) indexed access.
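A minimal sketch of reading a readable stream by hand, assuming a runtime that exposes the WHATWG ReadableStream as a global (modern browsers, Node.js 18+); streamOf is a made-up helper, not part of the spec:

```javascript
// Made-up helper: wrap an array of chunks in a ReadableStream.
function streamOf(chunks) {
  return new ReadableStream({
    start(controller) {
      for (const chunk of chunks) controller.enqueue(chunk);
      controller.close();
    },
  });
}

// Reading mirrors the async iterator steps: pull with read(),
// await, then inspect { value, done }.
async function readAll(stream) {
  const reader = stream.getReader(); // locks the stream: single consumer
  const out = [];
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    out.push(value);
  }
  return out;
}
```

Note how getReader() makes the single-consumer nature explicit: while the lock is held, no one else can read from the stream.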

Another feature of pull APIs is that they are generally single-consumer. Whoever pulls the value, now has it, and it doesn't exist in the source async iterator/stream/etc. anymore. It's been pulled away by the consumer.

In general, pull APIs provide an interface for communicating with some underlying source of data, allowing the consumer to express interest in it. This is in contrast to...

Push

Push APIs are a good fit for when something is generating data, and that generation does not care whether anyone wants the data or not. For example, no matter whether someone is interested, it's still true that your mouse moved, and then you clicked somewhere. You'd want to manifest those facts with a push API. Then, consumers---possibly multiple of them---may subscribe, to be pushed notifications about such things happening.

The API itself doesn't care whether zero, one, or many consumers subscribe. It's just manifesting a fact about things that happened into the universe.

Events are a simple manifestation of this. You can subscribe to an EventTarget in the browser, or EventEmitter in Node.js, and get notified of events that are dispatched. (Usually, but not always, by the EventTarget's creator.)
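A minimal sketch of that push style using the EventTarget API (available in browsers and as a global in recent Node.js versions); the "tick" event name is arbitrary:

```javascript
// The target dispatches regardless of how many listeners exist.
const target = new EventTarget();
const seen = [];

// A consumer tunes in...
target.addEventListener('tick', e => seen.push(e.type));

// ...and the producer pushes facts out, listener or no listener.
target.dispatchEvent(new Event('tick'));
target.dispatchEvent(new Event('tick'));
// seen is now ['tick', 'tick']
```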

Observables are a more refined version of EventTarget. Their primary innovation is that the subscription itself is represented by a first-class object, the Observable, which you can then apply combinators (such as filter, map, etc.) over. They also make the choice to bundle together three signals (conventionally named next, complete, and error) into one, and give these signals special semantics so that the combinators respect them. This is as opposed to EventTarget, where event names have no special semantics (no method of EventTarget cares whether your event is named "complete" vs. "asdf"). EventEmitter in Node has some version of this special-semantics approach where "error" events can crash the process, but that's rather primitive.
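A toy sketch of that idea in plain JavaScript; MiniObservable is a made-up class mimicking the shape of the proposal (first-class subscription object, combinators, special next/error/complete semantics), not the proposal's actual API:

```javascript
class MiniObservable {
  constructor(subscriber) {
    this._subscriber = subscriber; // only the creator produces signals
  }
  subscribe(observer) {
    return this._subscriber(observer);
  }
  // Combinators wrap the subscription and respect the three signals.
  map(fn) {
    return new MiniObservable(observer =>
      this.subscribe({
        next: v => observer.next(fn(v)),
        error: e => observer.error && observer.error(e),
        complete: () => observer.complete && observer.complete(),
      }));
  }
  filter(pred) {
    return new MiniObservable(observer =>
      this.subscribe({
        next: v => { if (pred(v)) observer.next(v); },
        error: e => observer.error && observer.error(e),
        complete: () => observer.complete && observer.complete(),
      }));
  }
}
```

Usage looks like building a pipeline, then subscribing:

```javascript
const clicks = new MiniObservable(obs => {
  [1, 2, 3, 4].forEach(v => obs.next(v));
  obs.complete();
});
const seen = [];
clicks.filter(v => v % 2 === 0).map(v => v * 10)
  .subscribe({ next: v => seen.push(v) });
// seen is now [20, 40]
```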

Another nice feature of observables over events is that generally only the creator of the observable can cause it to generate those next/error/complete signals. Whereas on EventTarget, anyone can call dispatchEvent(). This separation of responsibilities makes for better code, in my experience.

But in the end, both events and observables are good APIs for pushing occurrences out into the world, to subscribers who can tune in and tune out at any time. I'd say observables are the more modern way to do this, and nicer in some ways, but events are more widespread and well-understood. So if anything was intended to replace events, it'd be observables.

Push <-> pull

It's worth noting that you can build either approach on top of the other in a pinch:

  • To build push on top of pull, constantly be pulling from the pull API, and then push out the chunks to any consumers.
  • To build pull on top of push, subscribe to the push API immediately, create a buffer that accumulates all results, and when someone pulls, grab it from that buffer. (Or wait until the buffer becomes non-empty, if your consumer is pulling faster than the wrapped push API is pushing.)
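The second bullet can be sketched as follows; pullFrom is a hypothetical adapter around any callback-style push source:

```javascript
// Wrap a push API so consumers can pull from it as an async iterator.
// Values that arrive before anyone pulls are buffered; pulls that
// arrive before any value wait.
function pullFrom(subscribe) {
  const buffer = [];  // values pushed but not yet pulled
  const waiting = []; // pending pulls waiting for a value
  subscribe(value => {
    if (waiting.length) waiting.shift()({ value, done: false });
    else buffer.push(value);
  });
  return {
    [Symbol.asyncIterator]() { return this; },
    next() {
      if (buffer.length) {
        return Promise.resolve({ value: buffer.shift(), done: false });
      }
      return new Promise(resolve => waiting.push(resolve));
    },
  };
}
```

Even this simplified version shows the extra bookkeeping (two queues, no completion or error handling yet) compared to the push-on-pull direction.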

The latter is generally much more code to write than the former.

Another aspect of trying to adapt between the two is that only pull APIs can easily communicate backpressure. You can add a side-channel to push APIs to allow them to communicate backpressure back to the source; I think Dart does this, and some people try to create evolutions of observables that have this ability. But it's IMO much more awkward than just properly choosing a pull API in the first place. The flip side of this is that if you use a push API to expose a fundamentally pull-based source, you will not be able to communicate backpressure. This is the mistake made with the WebSocket and XMLHttpRequest APIs, by the way.

In general I find attempts to unify everything into one API by wrapping others misguided. Push and pull have distinct, not-very-overlapping areas where they each work well, and saying that we should pick one of the four APIs you mentioned and stick with it, as some people do, is shortsighted and leads to awkward code.

Chema answered 10/11, 2017 at 1:52 Comment(11)
Could you elaborate on what you mean by back-pressure? — Renick
@Chema "This is the mistake made with the XMLHttpRequest APIs, by the way" — could you describe it in more detail? Thanks! — Shabby
How is XMLHttpRequest a push API? — Rm
Because it uses events to push data at you, instead of waiting for you to read a chunk of data. It thus has no concept of backpressure, since it has no idea how fast you are consuming the data. — Chema
Excellent answer Domenic - you might want to add some examples from GTOR or a similar resource for pull/push examples. It's worth mentioning for future readers that Node intends to interop with async iterators (but not observables) at the moment, as those are much further along in the spec process. — Ascospore
@Chema I know that this post is quite old, but I hesitate when I read "only pull APIs can easily communicate backpressure." Don't reactive streams answer this issue by providing non-blocking backpressure while still complying with an idiomatic push API? — Guddle
In my opinion, no. As I said in the post: "You can add a side-channel to push APIs to allow them to communicate backpressure back to the source; I think Dart does this, and some people try to create evolutions of observables that have this ability. But it's IMO much more awkward than just properly choosing a pull API in the first place." — Chema
Writable streams use separate events to communicate back-pressure in Node.js, which is an example of using a side-channel. Observables don't have any back-pressure mechanism; they just have operators that let you keep from being swamped by filtering based on rates, or by waiting for another observable to emit to signal it is ready, but that's another example of a side-channel. — Skull
This is getting so confusing: the Observables API is very similar to iterator-helpers and to proposal-emitter, not to mention Node/userland streams. And essentially they're all doing the same thing. — Seeder
@Chema - it's also worth noting that an iterator/async iterator can push with .next() too. So the iterator pulls from a source A, and also pushes back to the same source A. This is different from the push <-> pull examples you provided, where pull and push involve different sources/targets. — Falla
"Their primary innovation is that the subscription itself is represented by a first-class object" - isn't this essentially what the Promise type gives us: first-class subscription to a continuation? So essentially an Observable is merely a stateless/lazy Promise. — Equitable
My understanding of Async Iterators is a bit limited, but from what I understand WHATWG Streams are a special case of Async Iterators. For more information on this, refer to the Streams API FAQ, which briefly addresses how the Streams API differs from Observables.

Both Async Iterators and Observables are generic ways to manipulate multiple asynchronous values. For now they do not interoperate, but it seems that creating Observables from Async Iterators is being considered. Observables, by their push-based nature, are much more like the current event system, while AsyncIterables are pull-based. A simplified view would be:

------------------------------------------------------------------------
|                       | Singular         | Plural                     |
------------------------------------------------------------------------
| Spatial  (pull based) | Value            | Iterable<Value>            |
------------------------------------------------------------------------
| Temporal (push based) | Promise<Value>   | Observable<Value>          |
------------------------------------------------------------------------
| Temporal (pull based) | await on Promise | await on Iterable<Promise> |
------------------------------------------------------------------------

I represented AsyncIterables as Iterable<Promise> to make the analogy easier to reason about. Note that await on an Iterable<Promise> is not meaningful by itself; an AsyncIterable is meant to be consumed with a for await...of loop.
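The distinction can be sketched by consuming both shapes with for await...of, which happens to accept a sync iterable of promises as well as a true async iterable (promisesOf, asyncOf, and collect are made-up helpers for illustration):

```javascript
// Iterable<Promise<Value>>: a *sync* iterator that yields promises.
// Its next() returns { value: Promise, done } immediately.
function* promisesOf(values) {
  for (const v of values) yield Promise.resolve(v);
}

// AsyncIterable<Value>: next() returns a Promise<{ value, done }>,
// so even the done signal arrives asynchronously.
async function* asyncOf(values) {
  for (const v of values) yield v;
}

// for await...of handles both, awaiting each yielded promise
// in the sync case.
async function collect(iterable) {
  const out = [];
  for await (const v of iterable) out.push(v);
  return out;
}
```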

You can find a more complete explanation in Kris Kowal's A General Theory of Reactivity.

Slipshod answered 2/11, 2017 at 12:28 Comment(3)
I feel that your answer is helpful for a high-level comparison, but I disagree with the statement that AsyncIterables are Iterable<Promise>. An Iterable<Promise> is a synchronous iterable of promises, and has no concept of backpressure. You can consume it as fast as you want, no problem. AsyncIterables have backpressure, meaning it is illegal to call next() on the iterator before the previous iteration settles. It yields a Promise<{ value, done }>, it does not yield a { Promise<value>, done } like a synchronous iterator of promises does. — Foundation
Ah, interesting difference. I did not think about this before. I wonder how calling next again is supposed to be handled. Return the same promise? Throw an error? — Skull
Since Observables are push-based, it is easy for them to constantly pull from an AsyncIterator and emit as quickly as they can. — Skull
