NebulousLabs/fastrand

Performance Degrades Substantially with Many Cores

Closed this issue · 3 comments

From Reddit:

I downloaded it last night and ran the benchmarks on my workstation, which has dual 24-core Xeons. Your lib int32 was only 30mb vs 15mb from stdlib. I imagine on faster consumer machines with far fewer cores it would do much better; I didn't test it on my desktop machine.

https://www.reddit.com/r/golang/comments/616czq/fast_replacement_for_cryptorand_10x_faster_no/dfcplpx/

We should consider reverting to a single-threaded design. The speed loss is painful but it seems like people really don't like invisible concurrency (and I kinda agree). Perhaps we could provide both, with the single-threaded version being the default exported Reader, and a multi-threaded version being available if you call New.
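To make the proposal concrete, here is a rough sketch of what that split could look like. It is not the library's code: `seqReader`, `parReader`, the `workers` parameter, and the use of SHA-256 over a seed plus counter are all assumptions made for illustration. "Single-threaded" here means the default `Reader` starts no goroutines of its own and only does work inside `Read`; the mutex just keeps it safe if several callers share it.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"io"
	"sync"
)

// seqReader is a hypothetical generator with no hidden concurrency:
// all output is produced on the caller's goroutine, inside Read.
type seqReader struct {
	mu      sync.Mutex
	seed    [32]byte
	counter uint64
}

func newSeqReader() *seqReader {
	r := &seqReader{}
	if _, err := io.ReadFull(rand.Reader, r.seed[:]); err != nil {
		panic("seqReader: no system entropy: " + err.Error())
	}
	return r
}

// Read fills b with SHA-256(seed || counter) blocks. SHA-256 is a stand-in
// chosen so the sketch only needs the standard library.
func (r *seqReader) Read(b []byte) (int, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	var in [40]byte
	copy(in[:32], r.seed[:])
	for n := 0; n < len(b); {
		binary.LittleEndian.PutUint64(in[32:], r.counter)
		r.counter++
		block := sha256.Sum256(in[:])
		n += copy(b[n:], block[:])
	}
	return len(b), nil
}

// parReader is the opt-in variant: background workers pre-generate blocks.
type parReader struct {
	blocks chan [32]byte
}

// New starts `workers` goroutines, each with its own seed. They run for the
// life of the process; a real version would need a way to stop them.
func New(workers int) io.Reader {
	p := &parReader{blocks: make(chan [32]byte, 64)}
	for i := 0; i < workers; i++ {
		src := newSeqReader()
		go func() {
			for {
				var blk [32]byte
				src.Read(blk[:])
				p.blocks <- blk
			}
		}()
	}
	return p
}

func (p *parReader) Read(b []byte) (int, error) {
	for n := 0; n < len(b); {
		blk := <-p.blocks
		n += copy(b[n:], blk[:])
	}
	return len(b), nil
}

// Reader is the default export: no goroutines unless the caller asks via New.
var Reader io.Reader = newSeqReader()

func main() {
	buf := make([]byte, 16)
	Reader.Read(buf)
	fmt.Printf("default:  %x\n", buf)
	New(4).Read(buf)
	fmt.Printf("parallel: %x\n", buf)
}
```

With this shape, anyone importing the package and using `Reader` never gets background goroutines, and the concurrency is visible at the call site for those who choose `New`.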

Speed loss isn't even that bad actually. For now we can probably just strip out the parallelism, it's still a lot faster than crypto/rand.

I want to think about it a little more actually. I'm guessing there's some way we can get it so that user-called threads can fetch entropy simultaneously without running into race conditions, while also not needing locks or channels.
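One possible way to get that property, sketched under assumptions: keep a fixed seed and a shared counter, have each caller claim a unique counter value with an atomic add, and hash seed plus counter on the caller's own goroutine. The names (`atomicReader`, `newAtomicReader`) are hypothetical, SHA-256 is a stand-in to keep the example standard-library only, and nothing here claims to be what #6 actually did or to be a vetted CSPRNG.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"io"
	"sync"
	"sync/atomic"
)

// atomicReader is a hypothetical lock-free generator. The seed is written
// once at construction and only read afterwards; the counter is touched
// only through sync/atomic.
type atomicReader struct {
	counter uint64 // kept first so it is 64-bit aligned on 32-bit platforms
	seed    [32]byte
}

func newAtomicReader() *atomicReader {
	r := &atomicReader{}
	if _, err := io.ReadFull(rand.Reader, r.seed[:]); err != nil {
		panic("atomicReader: no system entropy: " + err.Error())
	}
	return r
}

// Read takes no lock and uses no channel. Each block is derived from a
// counter value that atomic.AddUint64 hands to exactly one caller, so
// concurrent readers never hash the same input.
func (r *atomicReader) Read(b []byte) (int, error) {
	var in [40]byte // per-call scratch space, never shared
	copy(in[:32], r.seed[:])
	for n := 0; n < len(b); {
		c := atomic.AddUint64(&r.counter, 1)
		binary.LittleEndian.PutUint64(in[32:], c)
		block := sha256.Sum256(in[:])
		n += copy(b[n:], block[:])
	}
	return len(b), nil
}

func main() {
	r := newAtomicReader()
	out := make([][]byte, 8)

	// The WaitGroup only lets this demo wait for its goroutines;
	// it is not part of the generator.
	var wg sync.WaitGroup
	for i := range out {
		out[i] = make([]byte, 64)
		wg.Add(1)
		go func(buf []byte) {
			defer wg.Done()
			r.Read(buf)
		}(out[i])
	}
	wg.Wait()

	for i, buf := range out {
		fmt.Printf("goroutine %d: %x...\n", i, buf[:8])
	}
}
```

The trade-off is that all callers contend on one cache line for the counter, so throughput on many-core machines depends on how much hashing each atomic increment amortizes; that would need benchmarking rather than guessing.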

Closed in #6