timdp/es6-promise-pool

Sharing ideas

vitaly-t opened this issue · 3 comments

First of all, thank you for your library! I wish I had found it earlier. Since I didn't, I wrote my own approach to throttling and load balancing for promises: spex

I think we had similar ideas, but they became very different implementations nonetheless :) I think much of what you treat as a pool approach, I implemented as the sequence and page methods.

I once tried to persuade the author of Bluebird to add some basic methods for throttling promises, but nothing ever came of it. And many people stumble upon the same issues over and over. The Bluebird guys would always point you to methods like .settle and .reduce, but those aren't an adequate solution at all.
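For readers new to the problem, here is a minimal sketch of the kind of throttling being discussed: running promise-returning functions with a cap on how many are in flight at once. The name `runLimited` and its signature are made up for illustration; this is not the API of es6-promise-pool, spex, or Bluebird.

```javascript
// Hypothetical helper: runs promise-returning `tasks`, at most `limit` at a time.
// Resolves with the results in task order; rejects as soon as any task rejects.
function runLimited(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0; // index of the next task that has not been started yet

  // Each worker claims the next unstarted task, waits for it, then repeats.
  function worker() {
    if (next >= tasks.length) return Promise.resolve();
    const index = next++;
    return Promise.resolve()
      .then(() => tasks[index]())
      .then((value) => {
        results[index] = value;
        return worker();
      });
  }

  // Start `limit` workers (or fewer if there are not that many tasks).
  const workers = [];
  for (let i = 0; i < Math.min(limit, tasks.length); i++) {
    workers.push(worker());
  }
  return Promise.all(workers).then(() => results);
}
```

Calling `runLimited(urls.map((u) => () => fetchSomething(u)), 3)` would keep at most three of the (hypothetical) `fetchSomething` calls pending at any moment. Note that on rejection the sketch does not cancel tasks that are already running.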

Anyhow, if you want to share some thoughts on it, that would be interesting.

Cheers!

Thanks, I'll definitely take a look at spex soon! One thing I've learned since I wrote es6-promise-pool is that there are a lot of scenarios and edge cases for parallel processing, so I certainly appreciate any alternative approach.

You might also be interested in chiffchaff, which I cooked up for a project at work. It's mostly a wrapper around bluebird (hence the name). chiffchaff-multi is where the parallel stuff lives. It's still early stages, so there's bound to be breakage, but the general idea is to wrap bluebird promises in Task instances which take care of progress notification, cancellation, and wireup.

I've spent a whole day trying to come up with a proper way to marry a stream interface with promises, but it is really tough, even just for the Readable. Streams seem to be able to break things in places where it is difficult to understand what is going on.

I also spent a bit of time trying to help this guy here: NodeJS, promises, streams - processing large CSV files

But in the end he came up with his own solution, while everything that I tried didn't really work. Even tried to use promise-streams, but couldn't get it to be useful in my case.

I wish there were a library with a concise interface for processing streams as promises, but I have yet to find one. And writing it myself seems like a serious task.

If you don't find this relevant to your own library, that's ok. I just think that data streaming in NodeJS is where promises really matter, but nobody yet has been up to the challenge to solve it properly.

> Streams seem to be able to break things in places where it is difficult to understand what is going on.

I think you can still handle those by translating them to subclasses of Error, which bluebird lets you type-check in the catch handler. The challenge is to exhaustively map all error event types for your particular stream class to error classes.
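A rough sketch of that idea, using native promises rather than bluebird so it stays self-contained. The error classes, the `ERROR_EVENTS` map, and the `settleOnEnd` helper are all hypothetical names, and the choice of `error` and `timeout` as the events to map is an assumption; a real stream class may emit a different set.

```javascript
const { EventEmitter } = require('events');

// Hypothetical error classes, one per failure mode of the stream.
class StreamReadError extends Error {}
class StreamTimeoutError extends Error {}

// Assumed mapping from event name to error class. Completing this map for a
// particular stream class is the hard part described above.
const ERROR_EVENTS = { error: StreamReadError, timeout: StreamTimeoutError };

// Collects 'data' chunks and resolves on 'end'; rejects with a typed error
// when one of the mapped events fires first.
function settleOnEnd(stream) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    stream.on('data', (chunk) => chunks.push(chunk));
    stream.once('end', () => resolve(chunks));
    Object.keys(ERROR_EVENTS).forEach((name) => {
      stream.once(name, (cause) => {
        const message = cause && cause.message ? cause.message : name;
        const err = new ERROR_EVENTS[name](message);
        err.cause = cause; // keep the original payload for debugging
        reject(err);
      });
    });
  });
}
```

A caller can then branch with `err instanceof StreamTimeoutError` inside a plain `.catch`, or, with bluebird, use its filtered form `.catch(StreamTimeoutError, handler)`.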

One thing that still concerns me is that there's no unified concept of "progress". What I did with chiffchaff is treat progress as a monotonic, continuous scale between two bounds, but it could just as well be a discrete or even complex number, or even a compound type, e.g., a number with a unit, or a 3D point for whatever reason. Streams (and observables in general) don't enforce anything in terms of progress reporting, but I did need a unified scale in order to combine progress indicators. I briefly considered adding an option to pass your own projection function that would take in a mixed value for progress and essentially project it to a number, but the concept seemed too convoluted.
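To make the bounded scale and the projection idea concrete, here is a small sketch. It is not how chiffchaff does it; `normalize`, `combine`, and the `{ value, min, max }` shape are assumptions chosen for illustration, and averaging is just one possible way to combine indicators.

```javascript
// Maps a bounded progress value onto the 0..1 range, clamping out-of-range input.
// Assumed shape: { value, min, max }.
function normalize(p) {
  if (p.max === p.min) return 1; // empty range: treat as already complete
  const ratio = (p.value - p.min) / (p.max - p.min);
  return Math.min(1, Math.max(0, ratio));
}

// Combines several indicators into one number by averaging them.
// `project` is the optional projection function: it turns whatever a task
// reports (bytes with a unit, a row count, ...) into { value, min, max }.
function combine(indicators, project = (p) => p) {
  if (indicators.length === 0) return 0; // nothing to report on yet
  const total = indicators.reduce(
    (sum, item) => sum + normalize(project(item)),
    0
  );
  return total / indicators.length;
}
```

For example, `combine([{ value: 5, min: 0, max: 10 }, { value: 10, min: 0, max: 10 }])` gives 0.75, and tasks that report `{ done, total }` can be combined by passing `(t) => ({ value: t.done, min: 0, max: t.total })` as the projection.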

> If you don't find this relevant to your own library, that's ok.

No, I think it's totally relevant because it touches upon the subject of expanding beyond es6-promise-pool and aggregating the pooled results. As you said, batching, paging and throttling are common scenarios, and spex certainly looks useful to me in those cases. In theory, spex could use es6-promise-pool internally, but if it ain't broke, I certainly won't encourage you to fix it.

> I just think that data streaming in NodeJS is where promises really matter, but nobody yet has been up to the challenge to solve it properly.

This, I don't fully agree with though. I'm not convinced that promises are such a good fit for streams, for the very reason that they don't define any behavior between instantiation and settling (which is what I call progress reporting above and part of why I built chiffchaff). Streams, on the other hand, don't have any concept of settling, but constantly report progress through arbitrary events. In Node, they're not entirely arbitrary because there's some consensus thanks to standard-ish names like "data", "end", and "error", but they're not as rigorous as a promise settling. That's also the reason why you can't just wrap any stream in a bluebird promise.

That being said, even though there isn't a 1:1 match between streams and promises, there's certainly some overlap, and I think there's a lot of room for improvement by coming up with a unified concept that tracks progress (like an observable) as well as state (like a promise). Again, chiffchaff partly accomplishes that, but I'm not entirely satisfied with it myself. And as long as there's no standard model for progress reporting for a stream or even an observable, you're always going to end up writing error-prone boilerplate-ish code.
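As a thought experiment only, such a unified concept might look something like the sketch below: an object that emits progress events like an observable and settles exactly once like a promise. `TrackedTask` and its three-argument executor are hypothetical, not chiffchaff's actual interface, and the 0..1 scale is an assumption.

```javascript
const { EventEmitter } = require('events');

// Hypothetical class: progress events (observable-like) plus a single
// settlement (promise-like). Progress is a fraction in 0..1 that never decreases.
class TrackedTask extends EventEmitter {
  constructor(executor) {
    super();
    this.state = 'pending';
    this.progress = 0;
    this.promise = new Promise((resolve, reject) => {
      // Wraps resolve/reject so that `state` changes synchronously, once.
      const settle = (state, fn) => (arg) => {
        if (this.state !== 'pending') return;
        this.state = state;
        fn(arg);
      };
      // Ignores reports after settling and reports that would move backwards.
      const report = (fraction) => {
        if (this.state !== 'pending') return;
        const next = Math.min(1, Math.max(this.progress, fraction));
        if (next !== this.progress) {
          this.progress = next;
          this.emit('progress', next);
        }
      };
      const fail = settle('rejected', reject);
      try {
        executor(settle('fulfilled', resolve), fail, report);
      } catch (err) {
        fail(err);
      }
    });
  }
}
```

Consumers would subscribe with `task.on('progress', ...)` and await `task.promise` for the outcome. The sketch still sidesteps the open question of what progress should mean when it is not a single number.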