Question about Performance
FranzForstmayr opened this issue · 2 comments
From your Readme:
The codec has quite reasonable performances if you either use PyPy on the pure-python implementation (reedsolo.py) or either if you compile the Cython extension creedsolo.pyx (which is about 2x faster than PyPy). You can expect encoding rates of several MB/s.
Is this still valid?
I just ran a performance evaluation on three different Python versions (3.7, 3.8, 3.9) with the following code:
import sys
from functools import partial

import numpy as np
import perfplot
from reedsolo import RSCodec

name = f'perf_v{sys.version_info.major}.{sys.version_info.minor}'


def func(rscoder, array):
    # Round trip: encode, then decode and return the recovered message.
    enc = rscoder.encode(array)
    return rscoder.decode(enc)[0]


codecs = [
    RSCodec(8),
    RSCodec(16),
]

out = perfplot.bench(
    setup=lambda n: np.random.randint(0, 255, size=n, dtype=np.uint8),
    kernels=[
        # Bind each codec explicitly. A bare `lambda a: func(codec, a)` looks up
        # `codec` at call time, so every kernel would use the last codec.
        partial(func, codec) for codec in codecs
    ],
    labels=[codec.nsym for codec in codecs],
    n_range=[2 ** k for k in range(20)],
)
out.show()
out.save(name + ".png", transparent=True, bbox_inches="tight")
I get a maximum of 100 kB/s (for encoding and decoding together).
Is this an expected speed? Tested on Ubuntu 20.04, with Cython installed.
Here is the output of the three perfplots.
I expected the Cythonized function to be faster. However, the number of ECC symbols does not seem to matter here.
PS:
To reproduce with Python 3.7, you'll have to install perfplot==0.9.6
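On the observation that the number of ECC symbols does not seem to matter: one thing worth ruling out in any benchmark that builds its kernels from lambdas in a comprehension is Python's late binding of closure variables. Unless the codec is bound explicitly (for example with functools.partial), every kernel ends up calling the last codec in the list, which would make the curves coincide. This is offered as a possible explanation, not a confirmed one. A minimal sketch of the effect, using a hypothetical FakeCodec stand-in rather than RSCodec:

```python
from functools import partial


class FakeCodec:
    """Hypothetical stand-in for a codec; it only carries an nsym value."""

    def __init__(self, nsym):
        self.nsym = nsym


def func(codec, data):
    # Report which codec actually handled the call.
    return codec.nsym


codecs = [FakeCodec(8), FakeCodec(16)]

# Late binding: each lambda looks up `codec` when it is called,
# by which time the comprehension has finished and `codec` is the last item.
late_bound = [lambda a: func(codec, a) for codec in codecs]

# Early binding: partial stores the codec when the kernel is created.
early_bound = [partial(func, codec) for codec in codecs]

late_results = [kernel(b"x") for kernel in late_bound]
early_results = [kernel(b"x") for kernel in early_bound]

print(late_results)   # [16, 16]
print(early_results)  # [8, 16]
```

A default argument (`lambda a, codec=codec: func(codec, a)`) binds early in the same way.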
Follow-up on this: thank you very much @FranzForstmayr for your code snippet. I have reworked it a bit to add the Cythonized extension, and it is now merged in tests/perf.py. However, as you pointed out, perfplot does not show any performance difference between the Cythonized extension and pure Python. This is very strange, and I don't know the reason why.
Anyway, I have recently reworked the Cythonized extension, and I can confirm that it now runs much faster than before, at 12.5 MB/s encoding on my 5-year-old laptop. I tested the performance with another tool I made here, which is the one I used in the past to derive the speed results I cited above and in the README, so the speeds are measured on a comparable basis (although the old tests were run under Python 2.7 and the current ones under Python 3.10).
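For anyone who wants a rough MB/s figure of their own, the measurement can be sketched with nothing but the standard library. This is a minimal sketch and not the tool mentioned above; `throughput_mb_per_s` and `dummy_encode` are hypothetical names introduced here for illustration, and the stand-in encoder does no real error correction.

```python
import os
import time


def throughput_mb_per_s(encode, payload, repeats=5):
    """Return the best observed throughput of encode(payload) in MB/s."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        encode(payload)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Guard against a zero reading from a coarse timer.
    best = max(best, 1e-9)
    return len(payload) / best / 1e6


def dummy_encode(data):
    # Hypothetical stand-in encoder: appends 10 zero bytes of "parity".
    return bytes(data) + bytes(10)


payload = os.urandom(1_000_000)  # 1 MB of random input
rate = throughput_mb_per_s(dummy_encode, payload)
print(f"{rate:.1f} MB/s")
```

With reedsolo installed, passing `RSCodec(8).encode` in place of `dummy_encode` should give an encoding-only figure, which is easier to compare with the README than a combined encode and decode timing.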
I have not yet tested the results with PyPy, but I also expect an improvement there, since I have merged some of the optimizations I made for Cython into the pure Python version too (such as pre-allocating bytearrays).
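To illustrate what pre-allocating a bytearray means in general, here is a minimal sketch comparing the two styles. It is not taken from the library's code; the XOR transform and both function names are hypothetical, chosen only to keep the example small, and the actual speed benefit depends on the interpreter.

```python
def xor_append(data, key):
    # Grows the output one byte at a time.
    out = bytearray()
    for b in data:
        out.append(b ^ key)
    return out


def xor_preallocated(data, key):
    # Allocates the full output once, then fills it in place by index.
    out = bytearray(len(data))
    for i, b in enumerate(data):
        out[i] = b ^ key
    return out


data = bytes(range(256)) * 4
result_a = xor_append(data, 0x5A)
result_b = xor_preallocated(data, 0x5A)
print(result_a == result_b)  # True
```

Both functions produce the same bytes. The pre-allocated version has a fixed-size buffer, which is also the form that typed Cython code can index directly.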
I will now close this issue, as the Cythonized extension reliably provides more than 10 MB/s of encoding speed, and more improvements are under way. Please let me know if you still encounter any issues.