lbdremy/solr-node-client

Replace callback-based API with promise-based async functions

kibertoad opened this issue · 37 comments

Replace Request with superagent (which is faster: https://github.com/fdesjardins/node-http-client-bench) or axios (which is more popular) for promise support.

Undici is an even faster option (x2-x3 speed-up), but since it does not rely on http module, it would break nock-based mocks if users had any, which makes me hesitant.

x2-x3 speedup in what sense? The biggest time consumers are always sending / waiting for / receiving the network response.

I would prefer axios, because I favour maintainability. Having a well-known / popular package is better in that regard. I don't have much time left to work on this repo this month.

This issue replaces #252

This issue replaces #251 , too!

@rudolfbyker See https://www.youtube.com/watch?v=tr057sP4VBM for more details on Undici and overall deep-dive into HTTP client performance.
Well, superagent is also a well-supported and popular library, we are talking 4 million weekly downloads vs 15 millions weekly downloads, so both of them are very mature and popular.

See https://github.com/fdesjardins/node-http-client-bench for (somewhat outdated) benchmarks.

I would still prefer using Undici if we are ok with a breaking change that would force people to redo mocks with builtin Undici mocking, but I would need your input on that.

I'm not concerned with breaking changes. We are moving towards the first major release, so I think we should be relentless. If we had a big software house backing the project, that would be a different story.

I'll check out undici and get back to you. I need to get this done today.

OK, I see the advantage for situations where you have many small requests in a short time span. Like when doing autocomplete queries.

Undici does not support providing a custom HTTP agent. So we will have to drop that. (If I understand correctly from the, Undici IS an HTTP agent? But I'm confused about that.)

@rudolfbyker Yes, more or less. Since it does not use http Node module (it is more low-level than that, it operates directly on net and tls parts of Node), it does not support an http Agent either.

Let's also get rid of the custom error types and let Undici just do its thing. Extending errors in JS is problematic at best, and so we should not force this on users of our library.

Yeah, fair.

I'm a bit stuck on refactoring the createAddStream method. I'm not quite sure what it's supposed to do, or how to refactor the "duplexing".

@rudolfbyker It's a way to create a large amount of documents in bulk. Imagine that you have a billion of DB entries that you want indexed. What you might want to do is read them from DB as a stream and keep indexing them as you go.
I'm not sure you need to refactor much there, just replacing underlying Request call with Undici call should suffice. This is one case where still exposing a callback is a good idea, because that's how streams are typically consumed.

I've never worked with streams in JS before, so I'm having a hard time wrapping my head around this. Should I be using undici.stream or unidici.pipeline?

@rudolfbyker I don't think you strictly need to use either, original implementation just takes a plain simple Request call and wraps a Duplex (which writes to request and reads from response) around it.
Now if you would use undici.pipeline, it should do all that job for you, Dispatcher.pipeline should create exactly the kind of Duplexer that we create ourselves now.
I would recommend against refactoring this part right now, though. There are no tests, so difficult to say if we break something in a major way. I suggest doing bare minimum change (replace simple Request call with simple callback-based Undici call) for now, and later on I will write a test and attempt a proper refactoring. Does that work for you?

The thumbs up was to say that I agree with the approach in principle, but I have been at this for a long time now, and can't get it to work. Do you have time to look into refactoring the createAddStream function specifically? It could even be a separate PR on its own, while everything else still uses requests.

Sure, I'll take a look. Can you push your branch?

I will have to stop work for today, as I'm starting to feel unwell. Thanks for the great collaboration!

@rudolfbyker Please feel better soon!
I'll try to take it over from here. Thank you, it's fun :)

@rudolfbyker I assume you didn't get simple Undici requests for basic operations like search to work either yet?

ok, I'll avoid touching those parts yet and focus only on streams

can you push your half-done changes as well then?

thanks!

Sorry for the delay. Pushed now. See commit messages for details.

I'm going to take another stab at this, if you haven't started.

not yet :). thanks!

OK, that's all I have time for now.

Thank you! Hopefully will be able to continue the subject this week.

@rudolfbyker I will still add test for streaming functionality, and @AlinaKul will update the tests. Is there anything else left beyond that?

I cannot remember. You will have to dive into this. Not sure when I will have time to continue with this one.

@rudolfbyker Gotcha, will do! Thank you for the incredible work you've done on this!