alangpierce/sucrase

Add async parallel helper and update benchmarks to use it

alangpierce opened this issue · 0 comments

Currently, the README benchmark measures the single-threaded (per-core) performance of each tool. For Sucrase, Babel, and TypeScript, this is the easiest way to invoke the tools, but for swc and esbuild, I took some care to make sure they are invoked in a single-threaded way for the sake of a fair comparison. This approach has received some skepticism (e.g. https://twitter.com/JakeDChampion/status/1384919314755637248), so it would probably be more convincing to build a (hopefully) simple orchestration layer that runs Sucrase, Babel, and TypeScript in parallel, and to post those numbers as the primary benchmark. I'm thinking that I'll also link to a more detailed write-up that explains the caveats and includes reproducible benchmark numbers illustrating the effects of parallelism and JIT warm-up.

While I still think that per-core performance is a fine way to measure these tools (given that transpilation is inherently parallelizable without needing the tool to support it), it's a very fair criticism that swc and esbuild make it easy to parallelize work, while Sucrase expects you to set it up yourself. I'm hoping that part of this work can be an async transform function that runs Sucrase on a background thread (or process) with reasonable defaults. Of course, this mechanism isn't as portable as a plain JS function, so I'll want to make sure people can still use the regular transform function without depending on Node. The async function will hopefully make it easier for people to do their own comparisons, and may be useful for a built-in Node ESM loader in the future as well.
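To make the idea concrete, here is a minimal sketch of what an async transform backed by a single worker thread could look like, using Node's built-in `worker_threads` module. This is not an existing Sucrase API: `createAsyncTransform` and `SUCRASE_WORKER_SOURCE` are hypothetical names, and the default worker body assumes the `sucrase` package is installed and calls its regular synchronous `transform` function.

```typescript
import {Worker} from "node:worker_threads";

// Hypothetical worker body, evaluated as a CommonJS script inside the worker.
// Assumes the `sucrase` package is installed; it calls the regular synchronous
// transform(code, options) and sends back the resulting `.code` string.
export const SUCRASE_WORKER_SOURCE = `
  const {parentPort} = require("node:worker_threads");
  const {transform} = require("sucrase");
  parentPort.on("message", ({id, code, options}) => {
    try {
      parentPort.postMessage({id, code: transform(code, options).code});
    } catch (e) {
      parentPort.postMessage({id, error: String(e)});
    }
  });
`;

interface Reply {
  id: number;
  code?: string;
  error?: string;
}

interface Pending {
  resolve: (code: string) => void;
  reject: (error: Error) => void;
}

// Hypothetical helper (not part of Sucrase): wraps one worker thread behind a
// promise-returning transform function. Requests are matched to replies by id.
export function createAsyncTransform(workerSource: string = SUCRASE_WORKER_SOURCE) {
  const worker = new Worker(workerSource, {eval: true});
  const pending = new Map<number, Pending>();
  let nextId = 0;

  worker.on("message", (reply: Reply) => {
    const entry = pending.get(reply.id);
    if (!entry) return;
    pending.delete(reply.id);
    if (reply.error !== undefined) {
      entry.reject(new Error(reply.error));
    } else {
      entry.resolve(reply.code ?? "");
    }
  });

  // If the worker itself crashes, reject every request still in flight.
  worker.on("error", (error: Error) => {
    pending.forEach((entry) => entry.reject(error));
    pending.clear();
  });

  return {
    transform(code: string, options: object): Promise<string> {
      return new Promise<string>((resolve, reject) => {
        const id = nextId++;
        pending.set(id, {resolve, reject});
        worker.postMessage({id, code, options});
      });
    },
    // Stops the worker thread so the process can exit.
    close(): Promise<unknown> {
      return worker.terminate();
    },
  };
}
```

With this sketch, a caller would write something like `const js = await asyncTransform.transform(source, {transforms: ["typescript"]})` and call `close()` when done. A real implementation would likely need a pool of workers rather than one, plus a fallback to the plain synchronous function in environments without Node.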

As a very simple proof of concept, I tested the peak parallel performance of each tool by running it in multiple terminal windows on the same machine, with each process transforming continuously in a loop and reporting its speed at each iteration. With just one terminal window, parallel swc and parallel esbuild are a little faster than Sucrase, but Sucrase has a much faster combined speed across four terminal windows. On my laptop, Sucrase maxes out at about 3 million lines per second, swc maxes out at about 1.2 million lines per second, and esbuild maxes out at about 0.8 million lines per second, which is consistent with my single-core measurements after allowing for JIT warm-up.
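A rough sketch of that kind of measurement loop is below. It is not the script I actually ran: `countLines`, `measureLinesPerSecond`, and `combinedLinesPerSecond` are hypothetical names, the transform is passed in as a plain function, and the clock is injectable (defaulting to `Date.now`) so the arithmetic can be checked deterministically.

```typescript
// Counts newline-separated lines, ignoring a single trailing newline.
export function countLines(code: string): number {
  if (code.length === 0) return 0;
  const parts = code.split("\n");
  return code.charAt(code.length - 1) === "\n" ? parts.length - 1 : parts.length;
}

// Runs `transformFn` over `code` `iterations` times and returns the observed
// throughput in lines per second. `now` must return a time in milliseconds.
export function measureLinesPerSecond(
  transformFn: (code: string) => unknown,
  code: string,
  iterations: number,
  now: () => number = () => Date.now(),
): number {
  const lines = countLines(code);
  const start = now();
  for (let i = 0; i < iterations; i++) {
    transformFn(code);
  }
  const elapsedMs = now() - start;
  // Guard against a zero reading from a coarse clock.
  return elapsedMs > 0 ? (lines * iterations * 1000) / elapsedMs : Infinity;
}

// Combined speed across independent processes (one per terminal window) is
// the sum of the per-process speeds measured over the same period.
export function combinedLinesPerSecond(perProcess: number[]): number {
  return perProcess.reduce((sum, speed) => sum + speed, 0);
}
```

Calling `measureLinesPerSecond` repeatedly in an outer loop and printing each result shows the speed climbing as the JIT warms up, which is why the early iterations should not be treated as the steady-state number.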

My current plan is to use either https://github.com/piscinajs/piscina or https://github.com/tinylibs/tinypool for parallel orchestration, though it looks like I may need to work through some performance issues in those libraries when handling many small tasks.
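One possible way to reduce per-task overhead, independent of which pool library is used, is to group many small files into larger batches so that each message sent to a worker carries more work. The sketch below is only an illustration of that idea under my own assumptions, not something either library provides: `SourceFile` and `batchBySize` are hypothetical names, and the size threshold would need tuning against real measurements.

```typescript
// Hypothetical shape for one unit of work.
export interface SourceFile {
  path: string;
  code: string;
}

// Greedily groups files, in order, into batches whose combined source length
// stays at or under `targetChars`. A single file larger than the target still
// gets a batch of its own, so no file is ever dropped or split.
export function batchBySize(files: SourceFile[], targetChars: number): SourceFile[][] {
  const batches: SourceFile[][] = [];
  let current: SourceFile[] = [];
  let currentSize = 0;
  for (const file of files) {
    // Start a new batch when adding this file would exceed the target.
    if (current.length > 0 && currentSize + file.code.length > targetChars) {
      batches.push(current);
      current = [];
      currentSize = 0;
    }
    current.push(file);
    currentSize += file.code.length;
  }
  if (current.length > 0) {
    batches.push(current);
  }
  return batches;
}
```

Each batch would then be submitted to the pool as a single task, with the worker transforming every file in the batch and returning the results together. The trade-off is coarser load balancing, since one slow batch can leave other workers idle near the end of a run.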