ChristianSch/skml

parallelization of predict methods

ChristianSch opened this issue · 0 comments

When running any classifier (including PCC), fitting takes almost no time, but prediction takes quite a while and, in typical Python manner, runs on only a single CPU core. It is 2018, though, and most processors have around 4-6 cores with multiple threads each, so we should make use of them.
This issue tries to clarify if and how we can utilize multi-core CPUs properly.

The first idea is to parallelize the predict methods just like sklearn does, via Parallel:

        all_importances = Parallel(n_jobs=self.n_jobs,
                                   backend="threading")(
            delayed(getattr)(tree, 'feature_importances_')
            for tree in self.estimators_)
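
To make the idea concrete for predict, here is a minimal sketch of the same fan-out pattern: one predict call per fitted estimator, with results collected in estimator order. It uses the standard library's `ThreadPoolExecutor` as a stand-in for joblib's `Parallel` so that it runs without extra dependencies. `ThresholdClassifier` and `parallel_predict` are hypothetical names for illustration and are not part of skml or sklearn. Note that threads only give a real speedup in CPython when the underlying predict releases the GIL (as many NumPy operations do); otherwise a process-based backend may be needed.

```python
from concurrent.futures import ThreadPoolExecutor


class ThresholdClassifier:
    """Hypothetical stand-in for a fitted classifier; not part of skml."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, X):
        # Label a row 1 when the sum of its features exceeds the threshold.
        return [int(sum(row) > self.threshold) for row in X]


def parallel_predict(estimators, X, n_jobs=4):
    """Call predict on every estimator concurrently.

    Returns one row per sample, holding one label per estimator.
    """
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # map preserves input order, so column i belongs to estimators[i].
        columns = list(pool.map(lambda est: est.predict(X), estimators))
    # Transpose from per-estimator columns to per-sample rows.
    return [list(row) for row in zip(*columns)]


X = [[0.2, 0.1], [0.9, 0.8], [0.5, 0.6]]
estimators = [ThresholdClassifier(t) for t in (0.5, 1.0, 1.5)]
predictions = parallel_predict(estimators, X)
```

With joblib, the `pool.map(...)` line would correspond to `Parallel(n_jobs=n_jobs, backend="threading")(delayed(est.predict)(X) for est in estimators)`, which also returns results in input order.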