snuspl/nimble

The effect of multiple streams is not obvious

aodongchen opened this issue · 0 comments

Hi, I'm trying to reproduce Nimble's experimental results. However, I found that multi-stream execution has little effect on inference latency, whereas the paper reports a speedup of up to 1.8×. Maybe I'm doing something wrong, so I'd appreciate any advice.
I successfully installed Nimble in Docker. My environment:
GPU: 2080s with 8 GB global memory
Ubuntu 18.04.6 LTS

# inception_v3 [1, 3, 299, 299]
         mean (ms)  stdev (ms)
pytorch   8.212887    0.211187

        mean (ms)  stdev (ms)
nimble    2.24783    0.003427

              mean (ms)  stdev (ms)
nimble-multi    2.31407    0.009554
# inception_v3 [8, 3, 299, 299]
         mean (ms)  stdev (ms)
pytorch  25.678553    0.287919

        mean (ms)  stdev (ms)
nimble  17.354554    0.065831

              mean (ms)  stdev (ms)
nimble-multi  16.428471    0.104019
# densenet201 [1, 3, 224, 224]
         mean (ms)  stdev (ms)
pytorch  29.020667    0.231637

        mean (ms)  stdev (ms)
nimble   5.537937    0.004089

              mean (ms)  stdev (ms)
nimble-multi   5.572467    0.004977
# densenet201 [8, 3, 224, 224]
         mean (ms)  stdev (ms)
pytorch  31.046828    0.164185

        mean (ms)  stdev (ms)
nimble  24.178936    0.032238

              mean (ms)  stdev (ms)
nimble-multi  24.125336    0.060498
# mnasnet0_5 [1, 3, 224, 224]
         mean (ms)  stdev (ms)
pytorch   4.477023    0.025759

        mean (ms)  stdev (ms)
nimble   0.565598    0.002112

              mean (ms)  stdev (ms)
nimble-multi   5.572467    0.004977

# mnasnet0_75 [1, 3, 224, 224]
         mean (ms)  stdev (ms)
pytorch   4.557251    0.037832

        mean (ms)  stdev (ms)
nimble    0.68727    0.002274

              mean (ms)  stdev (ms)
nimble-multi   0.679038    0.002025

# mnasnet1_3 [1, 3, 224, 224]
         mean (ms)  stdev (ms)
pytorch   4.780402     0.02905

        mean (ms)  stdev (ms)
nimble   0.950962     0.00627

              mean (ms)  stdev (ms)
nimble-multi   0.893742     0.06838
# mnasnet1_3 [8, 3, 224, 224]
         mean (ms)  stdev (ms)
pytorch   6.076544    0.567386

        mean (ms)  stdev (ms)
nimble   4.953977    0.023374

              mean (ms)  stdev (ms)
nimble-multi   4.976105    0.025923
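
To make the gap concrete, here is a minimal sketch that computes the single-stream vs. multi-stream speedup from the mean latencies above. The dictionary keys and the `speedup` helper are names made up for illustration only; the numbers are copied from the results as posted. Note that the mnasnet0_5 nimble-multi value is identical to the densenet201 batch-1 value, so that row may be a copy-paste slip.

```python
# Mean latencies in ms, copied from the results above: (nimble, nimble-multi).
# Keys are illustrative labels ("bs" = batch size), not Nimble identifiers.
results = {
    "inception_v3 bs1": (2.24783, 2.31407),
    "inception_v3 bs8": (17.354554, 16.428471),
    "densenet201 bs1": (5.537937, 5.572467),
    "densenet201 bs8": (24.178936, 24.125336),
    # Kept as posted; the multi value matches densenet201 bs1 and may be a paste error.
    "mnasnet0_5 bs1": (0.565598, 5.572467),
    "mnasnet0_75 bs1": (0.68727, 0.679038),
    "mnasnet1_3 bs1": (0.950962, 0.893742),
    "mnasnet1_3 bs8": (4.953977, 4.976105),
}


def speedup(single_ms, multi_ms):
    """Speedup of multi-stream over single-stream (values > 1 mean multi is faster)."""
    return single_ms / multi_ms


speedups = {name: speedup(s, m) for name, (s, m) in results.items()}

for name, value in speedups.items():
    print(f"{name:18s} {value:.3f}x")
```

On these numbers the best case is roughly 1.06× (mnasnet1_3, batch size 1), and several configurations are slightly below 1×, which is far from the 1.8× reported in the paper.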