Performance Degrades Substantially with Many Cores
Closed this issue · 3 comments
From Reddit:
I downloaded it last night and ran the benchmarks on my workstation, which has dual 24-core Xeons. Your lib int32 was only 30mb vs 15mb from stdlib. I imagine on faster consumer machines with far fewer cores it would do much better; I didn't test it on my desktop machine.
We should consider reverting to a single-threaded design. The speed loss is painful, but it seems like people really don't like invisible concurrency (and I kinda agree). Perhaps we could provide both, with the single-threaded version being the default exported Reader, and a multi-threaded version being available if you call New.
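A rough sketch of what that split could look like. Everything here is an assumption for illustration: the hash-and-counter stream type, the workers parameter on New, and the mutex around the default Reader are all hypothetical and are not taken from the library's code.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/binary"
	"io"
	"sync"
)

// stream is a hypothetical single-threaded generator. It expands a seed from
// crypto/rand by hashing seed||counter. This is a stand-in construction, not
// the library's actual one.
type stream struct {
	seed    [32]byte
	counter uint64
}

func newStream() *stream {
	s := new(stream)
	if _, err := rand.Read(s.seed[:]); err != nil {
		panic(err)
	}
	return s
}

// Read fills p completely and never returns an error.
func (s *stream) Read(p []byte) (int, error) {
	var block [40]byte
	copy(block[:32], s.seed[:])
	for n := 0; n < len(p); {
		binary.LittleEndian.PutUint64(block[32:], s.counter)
		s.counter++
		sum := sha256.Sum256(block[:])
		n += copy(p[n:], sum[:])
	}
	return len(p), nil
}

// lockedReader serializes callers and starts no goroutines of its own.
type lockedReader struct {
	mu sync.Mutex
	s  *stream
}

func (r *lockedReader) Read(p []byte) (int, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.s.Read(p)
}

// Reader is the default: no hidden concurrency.
var Reader io.Reader = &lockedReader{s: newStream()}

// parallelReader is the opt-in variant: each Read is split across
// independent streams, one goroutine per stream.
type parallelReader struct {
	mu      sync.Mutex
	streams []*stream
}

// New returns a Reader that fills each request using up to workers goroutines.
func New(workers int) io.Reader {
	if workers < 1 {
		workers = 1
	}
	pr := &parallelReader{streams: make([]*stream, workers)}
	for i := range pr.streams {
		pr.streams[i] = newStream()
	}
	return pr
}

func (pr *parallelReader) Read(p []byte) (int, error) {
	pr.mu.Lock()
	defer pr.mu.Unlock()
	chunk := (len(p) + len(pr.streams) - 1) / len(pr.streams)
	var wg sync.WaitGroup
	for i, s := range pr.streams {
		lo := i * chunk
		if lo >= len(p) {
			break
		}
		hi := lo + chunk
		if hi > len(p) {
			hi = len(p)
		}
		wg.Add(1)
		go func(s *stream, dst []byte) {
			defer wg.Done()
			s.Read(dst) // each goroutine writes a disjoint slice of p
		}(s, p[lo:hi])
	}
	wg.Wait()
	return len(p), nil
}
```

With this shape, callers who just use Reader never get goroutines spawned behind their back, and anyone who wants the parallel fill has to ask for it explicitly via New.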
Speed loss isn't even that bad actually. For now we can probably just strip out the parallelism; it's still a lot faster than crypto/rand.
I want to think about it a little more actually. I'm guessing there's some way we can get it so that user-called threads can fetch entropy simultaneously without running into race conditions, while also not needing locks or channels.
Closed in #6