zesterer/flume

Performance with many senders on multi-threaded runtime


Hi,
I wrote a benchmark to compare two scenarios for collecting messages from many tokio tasks:

a) cloning the sender of a single channel
b) merging the receivers of many channels (with futures_buffered::FuturesUnordered)

The result surprised me: flume's performance seems to degrade a lot when a channel's sender is cloned into many tasks on a multi-threaded tokio runtime. With 512 tasks, the cloned-sender case takes 9240 ms with flume versus 811 ms with tokio::sync::mpsc, which seems to cope better with this setup.
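For readers who want the shape of the two scenarios without opening the benchmark, here is a minimal, dependency-free sketch. It is not the benchmark's code: std threads stand in for tokio tasks, `std::sync::mpsc::sync_channel` stands in for the flume and tokio channels, and a round-robin `try_recv` loop stands in for futures_buffered::FuturesUnordered. The function names `cloned_sender` and `merged_receivers` are hypothetical.

```rust
use std::sync::mpsc::{sync_channel, TryRecvError};
use std::thread;

/// Scenario (a): one bounded channel whose sender is cloned into every producer.
fn cloned_sender(tasks: usize, per_task: usize, capacity: usize) -> usize {
    let (tx, rx) = sync_channel::<usize>(capacity);
    let producers: Vec<_> = (0..tasks)
        .map(|_| {
            let tx = tx.clone();
            thread::spawn(move || {
                for i in 0..per_task {
                    tx.send(i).expect("receiver dropped");
                }
            })
        })
        .collect();
    // Drop the original sender so the channel closes once all producers finish.
    drop(tx);
    let received = rx.iter().count();
    for p in producers {
        p.join().expect("producer panicked");
    }
    received
}

/// Scenario (b): one bounded channel per producer; the consumer merges them.
fn merged_receivers(tasks: usize, per_task: usize, capacity_per_task: usize) -> usize {
    let mut receivers = Vec::with_capacity(tasks);
    let mut producers = Vec::with_capacity(tasks);
    for _ in 0..tasks {
        let (tx, rx) = sync_channel::<usize>(capacity_per_task);
        receivers.push(rx);
        producers.push(thread::spawn(move || {
            for i in 0..per_task {
                tx.send(i).expect("receiver dropped");
            }
        }));
    }
    let mut received = 0;
    while !receivers.is_empty() {
        let mut progressed = false;
        // Take at most one message per receiver per pass; discard a receiver
        // only when its channel is both empty and closed.
        receivers.retain(|rx| match rx.try_recv() {
            Ok(_) => {
                received += 1;
                progressed = true;
                true
            }
            Err(TryRecvError::Empty) => true,
            Err(TryRecvError::Disconnected) => false,
        });
        if !progressed {
            // Nothing was ready on this pass; let the producers run.
            thread::yield_now();
        }
    }
    for p in producers {
        p.join().expect("producer panicked");
    }
    received
}
```

In (a) every producer contends on the same channel state, while in (b) each producer only shares state with the consumer, which is the difference the numbers below are probing. Calling `cloned_sender(16, 65536, 1024)` and `merged_receivers(16, 65536, 64)` would mirror the 16-task configuration.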

I published the benchmark in this repo.
Full numbers below (release mode, Intel 8250u quad-core laptop).

Benchmark results
running channel benchmark with 1048576 messages and capacity 1024

# tasks: 2, messages per task: 524288, capacity per task: 512

## multi_thread runtime
tokio (merged receivers)     203 ms    194 ns/message
tokio (cloned sender)        197 ms    188 ns/message
flume (merged receivers)     237 ms    226 ns/message
flume (cloned sender)        192 ms    184 ns/message

## current_thread runtime
tokio (merged receivers)     212 ms    202 ns/message
tokio (cloned sender)        145 ms    138 ns/message
flume (merged receivers)     116 ms    110 ns/message
flume (cloned sender)         72 ms     68 ns/message

# tasks: 16, messages per task: 65536, capacity per task: 64

## multi_thread runtime
tokio (merged receivers)     306 ms    292 ns/message
tokio (cloned sender)        421 ms    402 ns/message
flume (merged receivers)     369 ms    352 ns/message
flume (cloned sender)        647 ms    617 ns/message

## current_thread runtime
tokio (merged receivers)     230 ms    219 ns/message
tokio (cloned sender)        154 ms    147 ns/message
flume (merged receivers)     121 ms    115 ns/message
flume (cloned sender)         71 ms     68 ns/message

# tasks: 64, messages per task: 16384, capacity per task: 16

## multi_thread runtime
tokio (merged receivers)     375 ms    357 ns/message
tokio (cloned sender)        730 ms    697 ns/message
flume (merged receivers)     363 ms    346 ns/message
flume (cloned sender)       2895 ms   2761 ns/message

## current_thread runtime
tokio (merged receivers)     299 ms    285 ns/message
tokio (cloned sender)        190 ms    181 ns/message
flume (merged receivers)     137 ms    131 ns/message
flume (cloned sender)         79 ms     75 ns/message

# tasks: 256, messages per task: 4096, capacity per task: 4

## multi_thread runtime
tokio (merged receivers)     365 ms    348 ns/message
tokio (cloned sender)        766 ms    731 ns/message
flume (merged receivers)     348 ms    332 ns/message
flume (cloned sender)       5287 ms   5042 ns/message

## current_thread runtime
tokio (merged receivers)     385 ms    367 ns/message
tokio (cloned sender)        306 ms    292 ns/message
flume (merged receivers)     165 ms    157 ns/message
flume (cloned sender)        139 ms    133 ns/message

# tasks: 512, messages per task: 2048, capacity per task: 2

## multi_thread runtime
tokio (merged receivers)     382 ms    364 ns/message
tokio (cloned sender)        811 ms    774 ns/message
flume (merged receivers)     369 ms    352 ns/message
flume (cloned sender)       9240 ms   8812 ns/message

## current_thread runtime
tokio (merged receivers)     419 ms    399 ns/message
tokio (cloned sender)        343 ms    327 ns/message
flume (merged receivers)     198 ms    189 ns/message
flume (cloned sender)        313 ms    298 ns/message
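The derived figures in the output above are consistent with a simple even split: messages per task and capacity per task both look like the totals (1048576 and 1024) divided by the task count, and ns/message looks like elapsed time divided by the total message count. The sketch below reproduces that arithmetic under those assumptions; the helper names are hypothetical and not taken from the benchmark. A few rows differ by 1 ns from what the printed milliseconds imply (for example 192 ms next to 184 ns), presumably because the millisecond column is rounded for display.

```rust
use std::time::Duration;

/// Even split of a total (message count or channel capacity) across tasks.
fn per_task(total: u64, tasks: u64) -> u64 {
    total / tasks
}

/// Elapsed time divided by message count, rounded to the nearest nanosecond.
fn ns_per_message(elapsed: Duration, messages: u64) -> u64 {
    let nanos = elapsed.as_nanos();
    let messages = u128::from(messages);
    // Adding half the divisor before dividing rounds to nearest.
    ((nanos + messages / 2) / messages) as u64
}
```

For instance, `per_task(1_048_576, 16)` gives the 65536 messages per task and `per_task(1024, 16)` the capacity of 64 shown for the 16-task run, and `ns_per_message(Duration::from_millis(9240), 1_048_576)` gives the 8812 ns/message reported for flume with a cloned sender at 512 tasks.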