scikit-learn-contrib/DESlib

`METADES` raises an index error when predicting on test data.

IxSxHxY opened this issue · 1 comments

I am trying to use METADES to predict on the test data, but it raises an IndexError.
The shapes of the data are:
X_train.shape = (5040, 192)
X_test.shape = (1260, 192)
Below is the implementation:

pool = [mlp1, mlp2, mlp3, mlp4, mlp5]
mt = METADES(pool)
mt.fit(X_train, y_train)

score = mt.score(X_test, y_test)

predict(), predict_proba(), and score() all work on the training data, but none of them work on the test data.

Here is the error message:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[525], line 1
----> 1 kk.predict_proba(X_test)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\deslib\base.py:643, in BaseDS.predict_proba(self, X)
    638             DFP_mask = np.ones(
    639                 (ind_ds_classifier.size, self.n_classifiers_))
    641         ind_ds_original_matrix = ind_disagreement[ind_ds_classifier]
--> 643         proba_ds = self.predict_proba_with_ds(
    644             X[ind_ds_original_matrix],
    645             base_predictions[
    646                 ind_ds_original_matrix],
    647             base_probabilities[
    648                 ind_ds_original_matrix],
    649             neighbors=neighbors,
    650             distances=distances,
    651             DFP_mask=DFP_mask)
    653         predicted_proba[ind_ds_original_matrix] = proba_ds
    655 return predicted_proba

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\deslib\des\base.py:269, in BaseDES.predict_proba_with_ds(self, query, predictions, probabilities, neighbors, distances, DFP_mask)
    262     raise ValueError(
    263         'The arrays query and predictions must have the same number'
    264         ' of samples. query.shape is {}'
    265         'and predictions.shape is {}'.format(query.shape,
    266                                              predictions.shape))
    268 if self.needs_proba:
--> 269     competences = self.estimate_competence_from_proba(
    270         query,
    271         neighbors=neighbors,
    272         distances=distances,
    273         probabilities=probabilities)
    274 else:
    275     competences = self.estimate_competence(query,
    276                                            neighbors=neighbors,
    277                                            distances=distances,
    278                                            predictions=predictions)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\deslib\des\meta_des.py:483, in METADES.estimate_competence_from_proba(self, query, neighbors, probabilities, distances)
    479     meta_feature_vectors = np.digitize(meta_feature_vectors,
    480                                        np.linspace(0.1, 1, 10))
    482 # Get the probability for class 1 (Competent)
--> 483 competences = self.meta_classifier_.predict_proba(
    484     meta_feature_vectors)[:, 1]
    486 # Reshape the array from 1D [n_samples x n_classifiers]
    487 # to 2D [n_samples, n_classifiers]
    488 competences = competences.reshape(-1, self.n_classifiers_)

IndexError: index 1 is out of bounds for axis 1 with size 1

May I know what is the problem?

Edit 1: Changed the code snippet to a Python code snippet.

@IxSxHxY Hello,

Sorry for the late response. I only came back to the library development & maintenance this week.

From the code you provided alone, it is hard to identify the cause of this error, which I have never seen before in any of the examples or applications of this method. I would need more information about your data and the models used. Can you provide a full example? Also, did you try running the other examples from the library, and running your code with a different DS model, to see whether they work?
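
In the meantime, one reading of the traceback that may be worth checking (an assumption, not a confirmed diagnosis): `predict_proba(...)[:, 1]` fails with "size 1" on axis 1 when the returned array has a single column, and scikit-learn classifiers return one column per class seen during `fit`. That could happen here if the meta-classifier was trained on meta-labels of only one class, for example if every base classifier in the pool is correct on all the samples used for meta-training. Inspecting `mt.meta_classifier_.classes_` after `fit` should confirm or rule this out. The sketch below uses only NumPy to simulate such a single-column output; `competent_column` is a hypothetical helper for illustration and is not part of DESlib.

```python
import numpy as np


def competent_column(proba, classes, competent_label=1):
    """Hypothetical helper (not a DESlib function).

    Return the probability of `competent_label` for each row of `proba`,
    where `classes` lists the labels in column order (as a fitted
    scikit-learn classifier's `classes_` attribute does).
    """
    classes = np.asarray(classes)
    match = np.flatnonzero(classes == competent_label)
    if match.size == 0:
        # The label was never seen during training, so its probability is 0.
        return np.zeros(proba.shape[0])
    return proba[:, match[0]]


# Simulated predict_proba output of a classifier fitted on one class only:
# four samples, a single probability column.
single_class_proba = np.ones((4, 1))

# Hard-coding column 1 reproduces the kind of IndexError in the traceback.
try:
    single_class_proba[:, 1]
    raised = False
    message = ""
except IndexError as exc:
    raised = True
    message = str(exc)

# Looking the column up by label avoids the crash in the degenerate case.
safe = competent_column(single_class_proba, classes=[1])
```

If `classes_` does turn out to hold a single label, the underlying issue would be the meta-training data rather than the indexing, so a guard like this would only hide the symptom.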