SCCAF/sccaf

Occasionally, cross-validation fails because sampling picks a set belonging to a single class


pcm32 commented

I hope I'm reading this error correctly, @chichaumiau, but I think we need to protect the cross-validation against this sort of situation: if the sampled set contains only a single class, re-sample. See the error here:

https://travis-ci.com/SCCAF/sccaf/builds/131297330#L632
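As a rough illustration of the re-sampling idea, here is a minimal sketch. It is not the SCCAF code: `sample_with_min_classes` and its parameters (`n_samples`, `min_classes`, `max_tries`, `seed`) are hypothetical names, and it assumes the training set is drawn by plain random sampling of row indices.

```python
import numpy as np


def sample_with_min_classes(y, n_samples, min_classes=2, max_tries=100, seed=None):
    """Draw row indices at random, re-drawing until enough classes are present.

    Hypothetical helper for illustration only; not part of SCCAF or scikit-learn.
    """
    y = np.asarray(y)
    # If the full label vector has too few classes, no amount of re-sampling helps.
    if np.unique(y).size < min_classes:
        raise ValueError("labels contain fewer than %d classes" % min_classes)

    rng = np.random.default_rng(seed)
    n_samples = min(n_samples, y.size)
    for _ in range(max_tries):
        idx = rng.choice(y.size, size=n_samples, replace=False)
        # Accept the draw only if it covers at least `min_classes` distinct labels.
        if np.unique(y[idx]).size >= min_classes:
            return idx
    raise RuntimeError("no sample with %d classes after %d tries" % (min_classes, max_tries))
```

The returned indices could then be used to subset `X` and `y` before calling `cross_val_score`. The `max_tries` cap is there so that a heavily imbalanced label vector ends in a clear error rather than an endless loop.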

pcm32 commented

The error is:

INFO:root:Run optimise: DONE
/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sklearn/model_selection/_validation.py:530: FutureWarning: From version 0.22, errors during fit will result in a cross validation score of NaN by default. Use error_score='raise' if you want an exception raised or error_score=np.nan to adopt the behavior from version 0.22.
  FutureWarning)
Traceback (most recent call last):
  File "/home/travis/virtualenv/python3.6.7/bin/sccaf", line 120, in <module>
    n_jobs=args.cores)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/SCCAF/__init__.py", line 265, in SCCAF_assessment
    return self_projection(*args, **kwargs)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/SCCAF/__init__.py", line 352, in self_projection
    cvs = cross_val_score(clf, X_train, np.array(y_train), cv=cv, scoring='accuracy', n_jobs=n_jobs)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sklearn/model_selection/_validation.py", line 391, in cross_val_score
    error_score=error_score)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sklearn/model_selection/_validation.py", line 232, in cross_validate
    for train, test in cv.split(X, y, groups))
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/joblib/parallel.py", line 1003, in __call__
    if self.dispatch_one_batch(iterator):
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/joblib/parallel.py", line 834, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/joblib/parallel.py", line 753, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 201, in apply_async
    result = ImmediateResult(func)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 582, in __init__
    self.results = batch()
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/joblib/parallel.py", line 256, in __call__
    for func, args, kwargs in self.items]
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/joblib/parallel.py", line 256, in <listcomp>
    for func, args, kwargs in self.items]
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sklearn/model_selection/_validation.py", line 516, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sklearn/linear_model/logistic.py", line 1549, in fit
    sample_weight=sample_weight)
  File "/home/travis/virtualenv/python3.6.7/lib/python3.6/site-packages/sklearn/svm/base.py", line 879, in _fit_liblinear
    " class: %r" % classes_[0])
ValueError: This solver needs samples of at least 2 classes in the data, but the data contains only one class: '0'
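Since the `ValueError` is raised inside `estimator.fit` for one of the folds, another option is a pre-flight check on the labels before `cross_val_score` is called. The sketch below is only an assumption about how that could look: `stratified_fold_count` is a hypothetical helper, not something SCCAF or scikit-learn provides. It refuses single-class label vectors and caps the number of folds at the size of the smallest class.

```python
import numpy as np


def stratified_fold_count(y, requested=5):
    """Return a fold count that stratified splitting can honour for labels `y`.

    Hypothetical helper for illustration only.
    """
    labels, counts = np.unique(np.asarray(y), return_counts=True)
    # A single-class label vector can never produce a trainable fold.
    if labels.size < 2:
        raise ValueError("need at least 2 classes, got %r" % labels.tolist())
    smallest = int(counts.min())
    # With fewer than 2 members, a class cannot appear in both train and test.
    if smallest < 2:
        raise ValueError("smallest class has only %d sample(s)" % smallest)
    # Never ask for more folds than the rarest class has members.
    return min(requested, smallest)
```

The result could be passed as `n_splits` to scikit-learn's `StratifiedKFold`, and that splitter as `cv=` to `cross_val_score`. With at least two folds and no class smaller than the fold count, each training fold should contain every class.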