danielhomola/mifs

Problem with MRMR method

Opened this issue · 5 comments

Hi! I'm having some trouble with the MRMR method when testing this code:

import numpy as np
import mifs

X = np.random.random((9000,20))
y = np.zeros(9000, dtype=int)
y[np.random.random(9000)>0.5] = 1

# define MI_FS feature selection method
feat_selector = mifs.MutualInformationFeatureSelector(method='MRMR', categorical=True, n_features='auto')

# find all relevant features
feat_selector.fit(X, y)

# check selected features
print(feat_selector.support_)

# check ranking of features
print(feat_selector.ranking_)

# call transform() on X to filter it down to selected features
X_filtered = feat_selector.transform(X)
print(X_filtered)

I get this error:

Traceback (most recent call last):
  File "test_mifs.py", line 17, in <module>
    feat_selector.fit(X, y)
  File "/home/martin/Repositories/svm/lib/mifs.py", line 137, in fit
    return self._fit(X, y)
  File "/home/martin/Repositories/svm/lib/mifs.py", line 237, in _fit
    selected = F[bn.nanargmax(MRMR)]
  File "reduce.pyx", line 2907, in reduce.nanargmax (bottleneck/src/auto_pyx/reduce.c:25633)
  File "reduce.pyx", line 3552, in reduce.reducer (bottleneck/src/auto_pyx/reduce.c:31009)
  File "reduce.pyx", line 2943, in reduce.nanargmax_all_float64 (bottleneck/src/auto_pyx/reduce.c:25949)
ValueError: All-NaN slice encountered

The other two methods don't have any problem. I'm working in an Anaconda2 environment. I would appreciate your help!

Hi,

The MRMR implementation isn't perfect (as you can clearly see). To be honest, I never really used it that much because in most cases it gave inferior results to the JMI method. From the error message, though, it seems that all values inside MRMR are NaN. I'm afraid you'd need to debug this yourself, as I don't have time for it right now.
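
For reference, the failure mode can be reproduced without mifs at all. This is a minimal sketch that assumes NumPy's `np.nanargmax` reacts to an all-NaN input the same way `bn.nanargmax` does in the traceback (by raising `ValueError`); the array `scores` is a hypothetical stand-in for the `MRMR` vector, not data from the report:

```python
import numpy as np

# Hypothetical stand-in for the MRMR score vector when every entry is NaN.
scores = np.full(5, np.nan)

try:
    np.nanargmax(scores)
    raised = False
    message = ""
except ValueError as err:
    raised = True
    message = str(err)
    print(message)

# With at least one non-NaN entry, the NaNs are simply skipped.
partial = np.array([np.nan, 0.2, np.nan, 0.7])
best = int(np.nanargmax(partial))
```

So the crash only happens once every remaining candidate has a NaN score; a single non-NaN score is enough for the selection to proceed.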

Hi Daniel,

I will debug your code and let you know if I find the problem!

@mavillan Hi
This error is due to this and this line. If they are replaced by zero, the following result is displayed:

C:\Users\Markazi.co\Anaconda3\python.exe D:/mifs-master_2/mifs-master/untitled0.py
Auto selected feature #1 : 3, JMI: 0.008148984893969091
Auto selected feature #2 : 8, JMI: 1.3068670221582526
Auto selected feature #3 : 13, JMI: 0.5557899561311688
Auto selected feature #4 : 4, JMI: 0.0
Auto selected feature #5 : 11, JMI: 0.0
Auto selected feature #6 : 9, JMI: 0.0
Auto selected feature #7 : 10, JMI: 0.0
Auto selected feature #8 : 7, JMI: 0.0
Auto selected feature #9 : 1, JMI: 0.0
Auto selected feature #10 : 5, JMI: 0.0
Auto selected feature #11 : 12, JMI: 0.0
Auto selected feature #12 : 6, JMI: 0.0

Process finished with exit code 0
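
The "replace by zero" idea described above could look roughly like the following. This is only a sketch with NumPy, not the actual change made to mifs: the helper name `argmax_nan_as_zero` and its input are hypothetical, and treating NaN as zero is an assumption that changes the ranking (a NaN candidate then ties with a genuinely uninformative one and beats any candidate with a negative score).

```python
import numpy as np

def argmax_nan_as_zero(scores):
    """Return the index of the largest score, treating NaN entries as 0.0."""
    scores = np.asarray(scores, dtype=float)
    # Replace only NaN; finite values and infinities are left untouched.
    cleaned = np.where(np.isnan(scores), 0.0, scores)
    return int(np.argmax(cleaned))

# An all-NaN vector no longer raises; the first index wins the tie at 0.0.
print(argmax_nan_as_zero([np.nan, np.nan, np.nan]))
```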

Can you please try the latest version of the code and report back if you still encounter the bug? Thanks!

Hi! Thanks for making this nice tool!

I think the problem is still there:

feat_selector = mifs.MutualInformationFeatureSelector(method="MRMR",
                                                      categorical=False,
                                                      verbose=2)

feat_selector.fit(df_X.to_numpy(), y)
Auto selected feature #1 : 49, MRMR : 0.31481619092341173
Auto selected feature #2 : 318, MRMR : -0.06933974432983714
Auto selected feature #3 : 117, MRMR : -0.1471495713561355
Auto selected feature #4 : 50, MRMR : -0.13565179463330096
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-72-3b239f510761> in <module>
----> 1 feat_selector.fit(df_X.to_numpy(), y)

~/softs/miniconda3/envs/solv_py37/lib/python3.7/site-packages/mifs/mifs.py in fit(self, X, y)
    208                     break
    209                 MRMR = xy_MI[F] - bn.nanmean(fmm, axis=0)
--> 210                 selected = F[bn.nanargmax(MRMR)]
    211                 S_mi.append(bn.nanmax(MRMR))
    212 

ValueError: All-NaN slice encountered

Can you help fix it? Thank you.
@danielhomola
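
One possible guard, sketched with NumPy rather than bottleneck: compute the score as on line 209 of the traceback and stop selecting when no candidate has a usable score. The function name `select_next_mrmr` and the toy arrays are hypothetical; `xy_MI`, `F` and `fmm` mirror the names in the traceback, with `fmm` assumed to hold one row per already-selected feature and one column per remaining candidate. Whether stopping is the right behaviour for mifs is for the maintainer to decide.

```python
import warnings
import numpy as np

def select_next_mrmr(xy_MI, F, fmm):
    """Return (feature, score) for the best candidate, or None if all scores are NaN."""
    with warnings.catch_warnings():
        # np.nanmean emits a RuntimeWarning for columns that are entirely NaN.
        warnings.simplefilter("ignore", category=RuntimeWarning)
        redundancy = np.nanmean(fmm, axis=0)
    mrmr = xy_MI[F] - redundancy
    if np.isnan(mrmr).all():
        return None  # nothing left to rank; the caller can stop the loop
    best = int(np.nanargmax(mrmr))
    return int(F[best]), float(mrmr[best])

xy_MI = np.array([0.30, 0.10, 0.25, 0.05])  # toy relevance values
F = np.array([1, 2, 3])                     # remaining candidate features
fmm = np.array([[0.05, 0.02, 0.01]])        # toy MI with the one selected feature

result = select_next_mrmr(xy_MI, F, fmm)
all_nan_result = select_next_mrmr(xy_MI, F, np.full((1, 3), np.nan))
```

In the second call every score is NaN, so the helper returns `None` instead of raising, and the selection loop could `break` at that point.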