This repository contains various MindsDB benchmarks.
- Run some quick benchmarks locally to check performance:

  ```bash
  python3 run.py --modes=mindsdb_dev --platform=local --speed=fast,medium,slow
  ```
- Compare your local version against sklearn (a naive and an expert implementation):

  ```bash
  python3 run.py --modes=sklearn_naive,sklearn_expert,mindsdb_dev --platform=local --speed=fast,medium,slow
  ```
- Run benchmarks for the current stable release remotely (implementation not done yet):

  ```bash
  python3 run.py --modes=mindsdb_prod --platform=GCP --speed=fast,medium,slow
  ```
Before contributing to this repository, please make sure that the datasets you are adding are publicly available and can be reused.
- Add your dataset as a CSV at `datasets/{name_of_the_dataset}/data.csv`.
- To specify an accuracy function to evaluate the dataset with, along with other parameters, edit `datasets/{name_of_the_dataset}/info.py`; see this file as an example (a rough sketch also follows this list).
- To add an "alternatives" benchmark for the dataset, add it at `alternatives/{alternative_name}/{name_of_the_dataset}/benchmark.py`. Currently the supported alternatives are `sklearn_expert` and `sklearn_naive`. For an example see this file (a sketch also follows this list).
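
As a rough illustration, an `info.py` typically declares how the dataset should be evaluated. The sketch below assumes the benchmark runner imports the module and reads module-level settings; the variable names (`target`, `accuracy_functions`) are assumptions made for illustration, so defer to the example file referenced above for the real interface.

```python
# datasets/{name_of_the_dataset}/info.py -- illustrative sketch only.
# Assumes the runner reads module-level settings; the variable names
# here are assumptions, not the documented interface.
from sklearn.metrics import balanced_accuracy_score

target = 'label'                                # column to predict (hypothetical name)
accuracy_functions = [balanced_accuracy_score]  # metric(s) used to score predictions
```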
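An alternatives benchmark, in turn, boils down to: train a model on the training split and return predictions for the test split. The sketch below assumes a `run(...)` entry point taking train/test DataFrames and a target column name; that signature is an assumption for illustration, not the actual interface, so again check the example file referenced above.

```python
# alternatives/sklearn_naive/{name_of_the_dataset}/benchmark.py -- illustrative
# sketch. The run() entry point and its signature are assumptions; see the
# example file referenced above for the real interface.
from sklearn.ensemble import RandomForestClassifier


def run(train_df, test_df, target):
    # Fit an off-the-shelf model with default hyperparameters (the "naive" setup)
    model = RandomForestClassifier()
    model.fit(train_df.drop(columns=[target]), train_df[target])
    # Return predictions for the held-out test split
    return model.predict(test_df.drop(columns=[target]))
```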
If you find any issues with MindsDB while running the benchmarks, please report them in the MindsDB repository.