Cross-validation occasionally fails because sampling picks a set belonging to a single class
pcm32 commented
I hope I'm reading this error correctly @chichaumiau, but I think we need to protect the cross-validation from this sort of situation: if the sampled set contains only a single class, re-sample. See error here:
pcm32 commented
Error is:
INFO:root:Run optimise: DONE
/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sklearn/model_selection/_validation.py:530: FutureWarning: From version 0.22, errors during fit will result in a cross validation score of NaN by default. Use error_score='raise' if you want an exception raised or error_score=np.nan to adopt the behavior from version 0.22.
  FutureWarning)
Traceback (most recent call last):
  File "/home/travis/virtualenv/python3.6.7/bin/sccaf", line 120, in <module>
    n_jobs=args.cores)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/SCCAF/__init__.py", line 265, in SCCAF_assessment
    return self_projection(*args, **kwargs)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/SCCAF/__init__.py", line 352, in self_projection
    cvs = cross_val_score(clf, X_train, np.array(y_train), cv=cv, scoring='accuracy', n_jobs=n_jobs)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sklearn/model_selection/_validation.py", line 391, in cross_val_score
    error_score=error_score)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sklearn/model_selection/_validation.py", line 232, in cross_validate
    for train, test in cv.split(X, y, groups))
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/joblib/parallel.py", line 1003, in __call__
    if self.dispatch_one_batch(iterator):
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/joblib/parallel.py", line 834, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/joblib/parallel.py", line 753, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 201, in apply_async
    result = ImmediateResult(func)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 582, in __init__
    self.results = batch()
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/joblib/parallel.py", line 256, in __call__
    for func, args, kwargs in self.items]
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/joblib/parallel.py", line 256, in <listcomp>
    for func, args, kwargs in self.items]
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sklearn/model_selection/_validation.py", line 516, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sklearn/linear_model/logistic.py", line 1549, in fit
    sample_weight=sample_weight)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sklearn/svm/base.py", line 879, in _fit_liblinear
    " class: %r" % classes_[0])
ValueError: This solver needs samples of at least 2 classes in the data, but the data contains only one class: '0'
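For illustration, the "if the sampled set only contains a single class, then re-sample" idea could look roughly like the sketch below. This is not SCCAF code: `sample_with_two_classes`, its parameters, and the `max_tries` cap are all hypothetical names chosen for this example, and it uses only the Python standard library. It draws a random sample of indices and retries until the sampled labels span at least two classes.

```python
import random


def sample_with_two_classes(labels, sample_size, seed=None, max_tries=100):
    """Return indices of a random sample whose labels span >= 2 classes.

    Hypothetical helper for illustration; not part of SCCAF or scikit-learn.
    """
    if len(set(labels)) < 2:
        # No amount of re-sampling can produce two classes from one.
        raise ValueError("labels contain fewer than two classes")
    if sample_size < 2 or sample_size > len(labels):
        raise ValueError("sample_size must be between 2 and len(labels)")

    rng = random.Random(seed)
    for _ in range(max_tries):
        # Draw without replacement, then check how many classes were hit.
        idx = rng.sample(range(len(labels)), sample_size)
        if len({labels[i] for i in idx}) >= 2:
            return idx
    raise RuntimeError(
        "no sample with at least two classes found in %d tries" % max_tries
    )


# Example: a heavily imbalanced label vector, similar in spirit to the
# situation in the traceback where only class '0' was picked.
labels = ["0"] * 95 + ["1"] * 5
idx = sample_with_two_classes(labels, 10, seed=0)
sampled_labels = [labels[i] for i in idx]
```

Note that this only guards the sampled set as a whole. An individual cross-validation fold may still end up with a single class unless the split itself is stratified, so a guard like this would probably need to be combined with a stratified splitter.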