scikit-learn_bench

scikit-learn_bench benchmarks various implementations of machine learning algorithms across data analytics frameworks. It currently supports the scikit-learn, DAAL4PY, cuML, and XGBoost frameworks for commonly used machine learning algorithms.

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

Machine Learning Benchmarks


Scikit-learn_bench is a benchmark tool for libraries and frameworks implementing Scikit-learn-like APIs and other workloads.

Benefits:

  • Full control of the benchmark suite through the CLI
  • Flexible and powerful benchmark config structure
  • Available with advanced profiling tools, such as Intel(R) VTune* Profiler
  • Automated benchmark report generation


🔧 Create a Python Environment

To create a usable Python environment with the required frameworks, use the commands for the relevant framework set:

  • sklearn, sklearnex, and gradient boosting frameworks:
# with pip
pip install -r envs/requirements-sklearn.txt
# or with conda
conda env create -n sklearn -f envs/conda-env-sklearn.yml
  • RAPIDS:
conda env create -n rapids --solver=libmamba -f envs/conda-env-rapids.yml
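
Once an environment is created, it can help to confirm that the expected packages are importable before starting a run. The following is a minimal sketch, not part of scikit-learn_bench: the helper `missing_modules` is hypothetical, and the module names in `EXPECTED` are assumptions, since the contents of the environment files are not reproduced here.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Assumed module names; adjust them to match the environment file you installed from.
EXPECTED = ["sklearn", "sklearnex", "xgboost"]

if __name__ == "__main__":
    missing = missing_modules(EXPECTED)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All expected modules are importable.")
```

The check only looks modules up and does not import them, so it stays fast even for heavy packages.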

🚀 How To Use Scikit-learn_bench

Benchmarks Runner

To run benchmarks using the sklbench module and a specific configuration:

python -m sklbench --config configs/sklearn_example.json

The default output is a file with JSON-formatted results of the benchmarking cases. To generate a more human-readable report, use the following command:

python -m sklbench --config configs/sklearn_example.json --report

By default, output and report file paths are result.json and report.xlsx. To specify custom file paths, run:

python -m sklbench --config configs/sklearn_example.json --report --result-file result_example.json --report-file report_example.xlsx
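
When the runner is launched from a script rather than typed by hand, the same command lines can be assembled programmatically. The sketch below uses only the flags shown above (`--config`, `--report`, `--result-file`, `--report-file`); the function names `build_runner_command` and `run_benchmarks` are hypothetical helpers, not part of sklbench.

```python
import subprocess
import sys


def build_runner_command(config, report=False, result_file=None, report_file=None):
    """Assemble the argument list for `python -m sklbench` from the flags shown above."""
    # sys.executable keeps the run inside the currently active Python environment.
    cmd = [sys.executable, "-m", "sklbench", "--config", config]
    if report:
        cmd.append("--report")
    if result_file is not None:
        cmd += ["--result-file", result_file]
    if report_file is not None:
        cmd += ["--report-file", report_file]
    return cmd


def run_benchmarks(config, **options):
    """Run the benchmarks runner; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_runner_command(config, **options), check=True)
```

For example, `run_benchmarks("configs/sklearn_example.json", report=True, result_file="result_example.json", report_file="report_example.xlsx")` would issue the last command shown above.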

For a description of all benchmarks runner arguments, refer to the documentation.

Report Generator

To combine raw result files gathered from different environments, call the report generator:

python -m sklbench.report --result-files result_1.json result_2.json --report-file report_example.xlsx
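
If the raw result files from several environments are collected into one directory, the report generator call can be built from whatever files are present. This is a sketch under assumptions: the helpers `build_report_command` and `combine_results` and the `result_*.json` naming pattern are illustrative, and only the `--result-files` and `--report-file` flags shown above are used.

```python
import subprocess
import sys
from pathlib import Path


def build_report_command(result_files, report_file):
    """Assemble the argument list for `python -m sklbench.report`."""
    cmd = [sys.executable, "-m", "sklbench.report", "--result-files"]
    cmd += [str(path) for path in result_files]
    cmd += ["--report-file", str(report_file)]
    return cmd


def combine_results(results_dir, report_file, pattern="result_*.json"):
    """Combine every result file matching `pattern` (an assumed naming scheme)."""
    result_files = sorted(Path(results_dir).glob(pattern))
    if not result_files:
        raise FileNotFoundError(f"no files matching {pattern!r} in {results_dir}")
    subprocess.run(build_report_command(result_files, report_file), check=True)
```

Sorting the matched paths keeps the order of inputs stable between runs, which makes the generated command reproducible.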

For a description of all report generator arguments, refer to the documentation.

Scikit-learn_bench High-Level Workflow

flowchart TB
    A[User] -- High-level arguments --> B[Benchmarks runner]
    B -- Generated benchmarking cases --> C["Benchmarks collection"]
    C -- Raw JSON-formatted results --> D[Report generator]
    D -- Human-readable report --> A

    classDef userStyle fill:#44b,color:white,stroke-width:2px,stroke:white;
    class A userStyle

📚 Benchmark Types

Scikit-learn_bench supports the following types of benchmarks:

  • Scikit-learn estimator - Measures performance and quality metrics of a sklearn-like estimator.
  • Function - Measures performance metrics of a specified function.
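
The difference between the two types can be illustrated with a small standalone sketch. It is not how scikit-learn_bench implements its measurements; `time_call`, `MeanRegressor`, and `mean_absolute_error` are hypothetical names used only to show that an estimator benchmark times stages such as `fit` and `predict` and also scores quality, while a function benchmark only times a call.

```python
import time


def time_call(func, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call of func."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


class MeanRegressor:
    """Toy sklearn-like estimator: predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 2.0, 3.0, 4.0]

# Estimator-style benchmark: time the fit and predict stages, then score quality.
model, fit_seconds = time_call(MeanRegressor().fit, X, y)
y_pred, predict_seconds = time_call(model.predict, X)
quality = mean_absolute_error(y, y_pred)

# Function-style benchmark: time a plain function call; no quality metric is involved.
_, sort_seconds = time_call(sorted, y, reverse=True)
```

A single timed call is noisy; a real measurement would typically repeat each call several times and aggregate the timings.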

📑 Documentation

Scikit-learn_bench: