python/pyperformance

Add benchmarks for machine learning application

corona10 opened this issue · 4 comments

Today, machine learning applications are an important use case of Python.
I haven't prepared concrete benchmark implementations yet, but I would like to suggest guidelines for machine learning benchmarks.

A. Each benchmark should provide all of the following implementations, and they should all produce the same result (a comparison sketch follows the list).

  • Pure Python-based implementation (might not be easy), or a SymPy-based implementation.
  • NumPy-based implementation.
  • (optional) Implementation based on a well-known framework such as scikit-learn, TensorFlow, or PyTorch.
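A minimal sketch of what guideline A could look like, assuming a simple least-squares fit as the benchmark workload; the data, function names, and tolerance here are illustrative, not part of any existing pyperformance benchmark:

```python
# Guideline A sketch: the same computation (fitting y = a*x + b by least
# squares) in pure Python and in NumPy, checked to produce the same result.
import math
import numpy as np

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.9, 5.1, 7.0, 9.2]

def fit_pure_python(xs, ys):
    """Closed-form simple linear regression in pure Python."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    return a, mean_y - a * mean_x

def fit_numpy(xs, ys):
    """The same model via numpy.polyfit (degree-1 polynomial)."""
    a, b = np.polyfit(np.asarray(xs), np.asarray(ys), 1)
    return a, b

a1, b1 = fit_pure_python(xs, ys)
a2, b2 = fit_numpy(xs, ys)
assert math.isclose(a1, a2, rel_tol=1e-9) and math.isclose(b1, b2, rel_tol=1e-9)
```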

B. Each of the following algorithm-based benchmarks should provide both a training and an inference benchmark (a minimal harness skeleton follows the list).

  • Regression algorithm
  • Decision tree algorithm
  • Clustering algorithm
  • Nearest neighbors algorithm
  • Matrix factorization
  • ... (Please suggest!)
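A minimal skeleton of guideline B, using the pyperf harness that pyperformance benchmarks already run on; the benchmark names and the regression workload are hypothetical placeholders:

```python
# Guideline B sketch: one timed entry point for training and one for
# inference of the same (here, linear regression) model.
import pyperf

def make_data(n=10_000):
    xs = [i / n for i in range(n)]
    ys = [2.0 * x + 1.0 for x in xs]
    return xs, ys

def train(xs, ys):
    """Closed-form fit of slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    return a, mean_y - a * mean_x

def predict(model, xs):
    a, b = model
    return [a * x + b for x in xs]

if __name__ == "__main__":
    runner = pyperf.Runner()
    xs, ys = make_data()
    model = train(xs, ys)
    runner.bench_func("ml_regression_train", train, xs, ys)
    runner.bench_func("ml_regression_inference", predict, model, xs)
```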

C. Deep learning-based or neural network-based benchmarks should provide only an inference benchmark with fixed weights, since a training benchmark needs GPU resources and using GPUs is out of scope (a pure Python sketch follows the list).

  • Simple neural network
  • ... (Please suggest!)
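A hedged sketch of guideline C: inference only, with weights fixed ahead of time so no training (and no GPU) is involved. The layer sizes and seeded random weights are made up for illustration; a real benchmark would ship fixed weights as a data file:

```python
# Guideline C sketch: a tiny 2-layer MLP forward pass in pure Python.
import random

random.seed(1234)

IN, HIDDEN, OUT = 8, 16, 4
# "Fixed weights": generated once from a seeded RNG so every run sees
# the same values.
W1 = [[random.uniform(-1, 1) for _ in range(IN)] for _ in range(HIDDEN)]
W2 = [[random.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(OUT)]

def relu(v):
    return [x if x > 0.0 else 0.0 for x in v]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def forward(x):
    return matvec(W2, relu(matvec(W1, x)))

# The inference pass over a fixed batch is what would be timed.
batch = [[random.uniform(-1, 1) for _ in range(IN)] for _ in range(100)]
outputs = [forward(x) for x in batch]
```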

I've changed my mind about the pure Python-based implementation.
A SymPy-based implementation will be enough, since SymPy is implemented in pure Python.
I expect that this will reduce the difficulty of implementation.
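For illustration, a small sketch of what a SymPy-based implementation could look like: the same least-squares fit solved exactly via the normal equations (Xᵀ X) β = Xᵀ y. The data is made up, but since SymPy's matrix arithmetic is pure Python, this exercises the interpreter rather than NumPy's C kernels:

```python
from sympy import Matrix, Rational

xs = [0, 1, 2, 3, 4]
ys = [Rational(1), Rational(3), Rational(5), Rational(7), Rational(9)]

# Design matrix with a bias column, and the target vector.
X = Matrix([[x, 1] for x in xs])
y = Matrix(ys)

# Exact rational solution of the normal equations.
beta = (X.T * X).solve(X.T * y)
print(beta)  # Matrix([[2], [1]]): slope 2, intercept 1
```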

I wonder if Pyston's bm_pytorch_alexnet_inference would fit the bill? The only hiccup that we've run into is that PyTorch doesn't release wheels for early pre-releases of CPython, and building it from source is really hard.

That might be okay, though. Even if we can't run it during pre-releases to inform our work then, we can still run it between stable versions of CPython. If we see that it got X% faster from 3.10 to 3.11, great! If not, we can still gather stats, etc. and use them to inform 3.12 work, even if the feedback loop isn't as tight as we would prefer.

CC @mdboom.

> I wonder if Pyston's bm_pytorch_alexnet_inference would fit the bill? The only hiccup that we've run into is that PyTorch doesn't release wheels for early pre-releases of CPython, and building it from source is really hard.

It will only cover case C.

If Python is content with its position as a glue language in machine learning applications, that will be sufficient (and I think it should not be).
From a different point of view: if languages like Julia are trying to position themselves as alternatives to Python, we also need to improve similar workloads written in pure Python in order to stay competitive.

I am considering adding a subset of https://github.com/mlcommons/inference
It's the de facto benchmark suite for NPU inference.