This project benchmarks the operators (functions) of Python libraries that provide a NumPy-compatible API. It can currently generate reports for operators in MXNet (the new NumPy programming style), ChainerX, and JAX.

I divide operators into several categories:
- Common operators, found under `numpy`
- FFT operators, found under `numpy.fft`
- Linear algebra operators, found under `numpy.linalg`
- Random operators, found under `numpy.random`
In total, 497 operators are generated.
| MXNet | ChainerX | JAX |
| --- | --- | --- |
| 17.8% | 23.8% | 42.1% |
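For reference, here is a rough sketch of how the callables in the namespaces listed above can be enumerated with plain NumPy. This is only an illustration, not the project's actual generation logic, so the totals will not match the curated 497 exactly:

```python
# Rough sketch: count public callables in the NumPy namespaces listed above.
# This is NOT how NumpyXBench builds its operator list; the real list is
# curated, so these totals will differ from 497.
import numpy as np
import numpy.fft, numpy.linalg, numpy.random

namespaces = {
    'numpy': np,
    'numpy.fft': np.fft,
    'numpy.linalg': np.linalg,
    'numpy.random': np.random,
}

for name, module in namespaces.items():
    ops = [attr for attr in dir(module)
           if not attr.startswith('_') and callable(getattr(module, attr))]
    print(f"{name}: {len(ops)} callables")
```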
For users:

```bash
pip install git+https://github.com/hgt312/NumpyXBench
```

For developers (necessary for report generation):

```bash
git clone https://github.com/hgt312/NumpyXBench.git
cd NumpyXBench/
pip install -e .
```
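A minimal sanity check (not part of the official instructions) to confirm the package imports after installation:

```python
# Minimal sanity check: the top-level package should import after installation.
import NumpyXBench

print("NumpyXBench imported from:", NumpyXBench.__file__)
```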
- Install MXNet from source: http://mxnet.incubator.apache.org/versions/master/install/ubuntu_setup.html. For TVM support, add `-DUSE_TVM_OP=ON`.
- Install JAX:

  ```bash
  # CPU-only version
  pip install --upgrade jax jaxlib

  # With GPU support
  PYTHON_VERSION=cp37      # alternatives: cp27, cp35, cp36, cp37
  CUDA_VERSION=cuda92      # alternatives: cuda90, cuda92, cuda100
  PLATFORM=linux_x86_64    # alternatives: linux_x86_64
  BASE_URL='https://storage.googleapis.com/jax-releases'
  pip install --upgrade $BASE_URL/$CUDA_VERSION/jaxlib-0.1.28-$PYTHON_VERSION-none-$PLATFORM.whl
  pip install --upgrade jax  # install jax
  ```
- Install ChainerX:

  ```bash
  # CPU-only version
  export CHAINER_BUILD_CHAINERX=1
  export MAKEFLAGS=-j8  # Using 8 parallel jobs.
  pip install --pre chainer

  # With GPU support
  export CHAINER_BUILD_CHAINERX=1
  export CHAINERX_BUILD_CUDA=1
  export CUDNN_ROOT_DIR=path/to/cudnn
  export MAKEFLAGS=-j8  # Using 8 parallel jobs.
  pip install --pre cupy
  pip install --pre chainer
  ```
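After installing the frameworks you want to benchmark, here is a quick sketch (not part of the official setup) for checking which backends import in the current environment; the module names below are the frameworks' standard import names:

```python
# Report which backend frameworks are importable, with their versions if exposed.
import importlib

for backend in ["numpy", "mxnet", "jax", "chainerx"]:
    try:
        module = importlib.import_module(backend)
        version = getattr(module, "__version__", "unknown version")
        print(f"{backend}: OK ({version})")
    except ImportError as err:
        print(f"{backend}: not available ({err})")
```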
Generate the reports:

```bash
cd doc
pip install -r requirements.txt
```

Example (MacBook Pro, CPU only):

```bash
CUDA_VISIBLE_DEVICES=-1 python -m NumpyXBench.tools --warmup 10 --runs 25 --device cpu --info "MacBook Pro, CPU"
sphinx-build -b html . _build/cpu -A current_device=CPU
```

CPU:

```bash
CUDA_VISIBLE_DEVICES=-1 python -m NumpyXBench.tools --warmup 10 --runs 25 --device cpu --info "[Machine information]"
sphinx-build -b html . _build/cpu -A current_device=CPU
```

GPU:

```bash
CUDA_VISIBLE_DEVICES=0 python -m NumpyXBench.tools --warmup 10 --runs 25 --device gpu --info "[Machine information]"
sphinx-build -b html . _build/gpu -A current_device=GPU
```
Usage examples:

- Obtain an op from a toolkit, which contains its default config:

  ```python
  from NumpyXBench.toolkits import add_toolkit

  toolkit = add_toolkit
  op = toolkit.get_operator_cls()('np')
  config = toolkit.get_random_config_func('RealTypes')()
  res = toolkit.get_benchmark_func()(op, config, 'forward')
  ```
- Another, more flexible way:

  ```python
  from NumpyXBench.operators import Add
  from NumpyXBench.configs import get_random_size_config
  from NumpyXBench.utils import run_binary_op_benchmark

  op = Add(backend='numpy')
  config = get_random_size_config()
  res = run_binary_op_benchmark(op, config, 'forward')
  ```
- Run on multiple frameworks:

  ```python
  from NumpyXBench.toolkits import add_toolkit
  from NumpyXBench.utils import run_op_frameworks_benchmark

  res = run_op_frameworks_benchmark(*add_toolkit.get_tools('AllTypes'), ['mx', 'np', 'chx', 'jax'], 'forward')
  ```
- Test all registered toolkits and draw a brief visualization:

  ```python
  from NumpyXBench.tools import test_all_operators, draw_one_plot, test_operators
  from NumpyXBench import toolkits

  res = test_operators([toolkits.mod_toolkit, toolkits.multiply_toolkit], is_random=False, dtypes=['float32'], times=6, warmup=3, runs=5)
  # res = test_all_operators(is_random=False, dtypes=['float32'], times=6, warmup=1, runs=2)
  draw_one_plot('mod', res['mod'], mode='note', info='mbp, cpu')  # use a notebook to see the plot
  ```
- Test coverage (only for frameworks that have the same API as NumPy):

  ```python
  from NumpyXBench.tools import test_numpy_coverage

  res = test_numpy_coverage('jax')  # res = {'passed': [...], 'failed': [...]}
  print(len(res['passed']) / (len(res['passed']) + len(res['failed'])))
  ```
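The same check can be looped over several backends to reproduce a coverage comparison like the table above; this sketch assumes `test_numpy_coverage` accepts the same backend abbreviations (`'mx'`, `'chx'`, `'jax'`) used in the other examples:

```python
# Sketch: compare NumPy-API coverage across backends. The backend flags are
# assumed to match the abbreviations used elsewhere in this README.
from NumpyXBench.tools import test_numpy_coverage

for backend in ['mx', 'chx', 'jax']:
    res = test_numpy_coverage(backend)
    total = len(res['passed']) + len(res['failed'])
    print(f"{backend}: {len(res['passed']) / total:.1%} covered")
```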
Refer to the Development Doc.