N720720/lindemann

Evaluate bottleneck dependency

Closed this issue · 2 comments

The bottleneck dependency causes problems with the numpy version. A benchmark should show whether bottleneck's nanmean actually gives us a benefit over numpy's nanmean.

>>> import bottleneck as bn
>>> bn.bench()
Bottleneck performance benchmark
    Bottleneck 1.3.8; Numpy 1.26.4
    Speed is NumPy time divided by Bottleneck time
    NaN means approx one-fifth NaNs; float64 used

              no NaN     no NaN      NaN       no NaN      NaN    
               (100,)  (1000,1000)(1000,1000)(1000,1000)(1000,1000)
               axis=0     axis=0     axis=0     axis=1     axis=1  
nansum         28.6        2.1        1.7        2.2        2.0
nanmean        80.2        1.4        1.4        1.3        1.4

In this synthetic benchmark, bottleneck shows no real advantage except for very small arrays: nanmean is about 80x faster on a (100,) array but only 1.3x to 1.4x faster on (1000,1000) arrays.

Timings for a LAMMPS MD trajectory with 1103 atoms and 5000 frames:
With bottleneck nanmean:

  • Mean time of 5 runs: 15.680200 s ± 0.155300 s

With numpy nanmean:

  • Mean time of 5 runs: 15.614600 s ± 0.139939 s
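
For reference, a "mean time of 5 runs" comparison like the one above could be reproduced roughly as follows. This is only a minimal sketch, not the script used for the numbers in this issue: the helper `time_runs` is a hypothetical name, and the data is a synthetic 1000x1000 float64 array with about one-fifth NaNs rather than a real LAMMPS trajectory. The bottleneck import is treated as optional, so the sketch still runs with numpy alone.

```python
import statistics
import time

import numpy as np

try:
    import bottleneck as bn  # optional; only timed if it imports cleanly
except ImportError:
    bn = None


def time_runs(func, data, runs=5):
    """Hypothetical helper: return (mean, sample std) of wall-clock times in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        func(data, axis=0)
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), statistics.stdev(durations)


# Assumption: synthetic stand-in data, not the trajectory from the issue.
rng = np.random.default_rng(0)
data = rng.random((1000, 1000))
data[rng.random(data.shape) < 0.2] = np.nan  # roughly one-fifth NaNs

candidates = {"numpy nanmean": np.nanmean}
if bn is not None:
    candidates["bottleneck nanmean"] = bn.nanmean

results = {name: time_runs(func, data) for name, func in candidates.items()}
for name, (mean, std) in results.items():
    print(f"{name}: mean time of 5 runs: {mean:.6f} s ± {std:.6f} s")
```

Timing only the reduction call isolates nanmean itself. The trajectory numbers above measure the whole computation, which would explain why the difference between the two backends nearly disappears there.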