The prediction results of kmean are all the same & cause an SVM error
cysun0226 opened this issue · 3 comments
Hi,
Recently I have been trying to use LA-MCTS to optimize my own task, but some errors occurred during execution.
My task has 9 dimensions, and the hyperparameter settings are:
- Cp = 10 (following the suggestion from the paper, ~= 10% of max f(x))
- leaf_size = 10
- ninits = 40
- kernel_type = "rbf"
Here is the error log:
Traceback (most recent call last):
File "mcts-exec.py", line 30, in <module>
agent.search(iterations = args.iterations)
File "/root/workspace/lamcts/MCTS.py", line 239, in search
self.dynamic_treeify()
File "/root/workspace/lamcts/MCTS.py", line 116, in dynamic_treeify
bad_kid.update_bag( bad_kid_data )
File "/root/workspace/lamcts/Node.py", line 83, in update_bag
self.is_svm_splittable = self.classifier.is_splittable_svm()
File "/root/workspace/lamcts/Classifier.py", line 55, in is_splittable_svm
self.learn_boundary(plabel)
File "/root/workspace/lamcts/Classifier.py", line 410, in learn_boundary
self.svm.fit(self.X, plabel)
File "/opt/app-root/lib/python3.6/site-packages/sklearn/svm/_base.py", line 173, in fit
y = self._validate_targets(y)
File "/opt/app-root/lib/python3.6/site-packages/sklearn/svm/_base.py", line 560, in _validate_targets
" class" % len(cls))
ValueError: The number of classes has to be greater than one; got 1 class
Since the labels come from Classifier.learn_clusters(), I printed the result of self.kmean.predict(tmp) and found some clues:
- The error occurs when plabel (the result of self.kmean.predict(tmp)) contains only one distinct label:
# normal
plabel: [0 1 1 0 0 0 1 0 0 1 1]
# exception
plabel: [0 0 0 0 0]
I temporarily avoided this exception by making is_splittable_svm return False when plabel contains only a single class, as sketched below.
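For reference, here is a simplified, self-contained version of that guard (my own sketch, not the repo's exact method; KMeans and SVC below stand in for the classifier's self.kmean and self.svm attributes):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def is_splittable_svm(X, fX):
    """Return True only if the samples split into two SVM-learnable groups."""
    # Cluster the [x, f(x)] rows into two groups with k-means.
    tmp = np.concatenate([X, fX.reshape(-1, 1)], axis=1)
    plabel = KMeans(n_clusters=2, n_init=10).fit_predict(tmp)
    # Guard: if k-means collapsed everything into one cluster, SVC.fit
    # would raise "The number of classes has to be greater than one",
    # so report the node as not splittable instead of crashing.
    if len(np.unique(plabel)) < 2:
        return False
    SVC(kernel="rbf", gamma="scale").fit(X, plabel)
    return True
```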
However, I would like to know: can this happen in the general case, or might it be caused by my own objective function?
Thank you for the work & for sharing your code!
Hello,
Thank you for submitting this issue; we're aware of this problem. For most problems, the current code should be fine.
Here are some suggested solutions:
- Normalize your f(x) inside your function (see the sketch after this list).
- Increase the leaf size.
- Instead of using k-means, learn a linear or non-linear regressor to separate the samples into two groups. We will release this feature after NeurIPS this year.
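For example, a minimal sketch of the first suggestion (raw_objective and F_SCALE below are hypothetical placeholders for your own function and its rough magnitude):

```python
import numpy as np

F_SCALE = 1e-4  # hypothetical: rough magnitude of your raw f(x)

def raw_objective(x):
    # Placeholder for your black-box function; its values span a tiny
    # range, which makes the [x, f(x)] rows nearly identical.
    return 1e-4 * float(np.sum(x ** 2))

def objective(x):
    # Normalize f(x) inside the function so its values span an O(1)
    # range and k-means can form two distinct clusters.
    return raw_objective(x) / F_SCALE
```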
Thanks for the reply and suggestions!
Looking forward to your future work & updates 🙌
Thank you. Before I close this issue, can I ask whether normalization or increasing the leaf size helped you resolve the problem?
Please note that if every element of x and f(x) lies in a very small range, e.g. [-0.01, 0.01], or the entries of [x, f(x)] are very similar, this scenario can happen.
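A quick way to reproduce that degenerate case (the numbers below are made up for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans

# All [x, f(x)] rows identical, as when x and f(x) sit in a tiny range.
rows = np.full((5, 10), 0.005)
plabel = KMeans(n_clusters=2, n_init=10).fit_predict(rows)
print(plabel)  # [0 0 0 0 0] -- one cluster, so the SVM fit would raise
```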