Problem with MRMR method
Hi! I'm having some trouble with the MRMR method when testing this code:
import numpy as np
import mifs

X = np.random.random((9000,20))
y = np.zeros(9000, dtype=int)
y[np.random.random(9000)>0.5] = 1
# define MI_FS feature selection method
feat_selector = mifs.MutualInformationFeatureSelector(method='MRMR', categorical=True, n_features='auto')
# find all relevant features
feat_selector.fit(X, y)
# check selected features
print(feat_selector.support_)
# check ranking of features
print(feat_selector.ranking_)
# call transform() on X to filter it down to selected features
X_filtered = feat_selector.transform(X)
print(X_filtered)
I get this error:
Traceback (most recent call last):
  File "test_mifs.py", line 17, in <module>
    feat_selector.fit(X, y)
  File "/home/martin/Repositories/svm/lib/mifs.py", line 137, in fit
    return self._fit(X, y)
  File "/home/martin/Repositories/svm/lib/mifs.py", line 237, in _fit
    selected = F[bn.nanargmax(MRMR)]
  File "reduce.pyx", line 2907, in reduce.nanargmax (bottleneck/src/auto_pyx/reduce.c:25633)
  File "reduce.pyx", line 3552, in reduce.reducer (bottleneck/src/auto_pyx/reduce.c:31009)
  File "reduce.pyx", line 2943, in reduce.nanargmax_all_float64 (bottleneck/src/auto_pyx/reduce.c:25949)
ValueError: All-NaN slice encountered
The other two methods don't have any problem. I'm working in an Anaconda2 environment. I would appreciate your help!
Hi,
The MRMR implementation isn't perfect, as you can clearly see. To be honest, I never used it much, because in most cases it gave inferior results to the JMI method. From the error message, though, it seems that all the values inside MRMR are NaNs. I'm afraid you'd need to debug this yourself, as I don't have time for it right now.
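For anyone picking this up, here is a minimal sketch of the failure mode described above. It uses NumPy's `nanargmax` as a stand-in for `bn.nanargmax`, since the two raise the same error message, and the `scores` array is hypothetical rather than taken from mifs:

```python
import numpy as np

# Hypothetical MRMR scores in which every candidate feature scored NaN.
scores = np.full(5, np.nan)

try:
    np.nanargmax(scores)
    raised = False
    message = ""
except ValueError as exc:
    # nanargmax has no valid element to return, so it raises.
    raised = True
    message = str(exc)

print(raised, message)
```

If this matches what happens inside `fit`, the thing to look for is the step at which every remaining candidate ends up with a NaN score.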
Hi Daniel,
I will debug your code and let you know if I find the problem!
@mavillan Hi
This error is due to this and this line. If they are replaced by zero, the following result is displayed:
C:\Users\Markazi.co\Anaconda3\python.exe D:/mifs-master_2/mifs-master/untitled0.py
Auto selected feature #1 : 3, JMI: 0.008148984893969091
Auto selected feature #2 : 8, JMI: 1.3068670221582526
Auto selected feature #3 : 13, JMI: 0.5557899561311688
Auto selected feature #4 : 4, JMI: 0.0
Auto selected feature #5 : 11, JMI: 0.0
Auto selected feature #6 : 9, JMI: 0.0
Auto selected feature #7 : 10, JMI: 0.0
Auto selected feature #8 : 7, JMI: 0.0
Auto selected feature #9 : 1, JMI: 0.0
Auto selected feature #10 : 5, JMI: 0.0
Auto selected feature #11 : 12, JMI: 0.0
Auto selected feature #12 : 6, JMI: 0.0
Process finished with exit code 0
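The exact lines changed in the comment above are not shown, so the following is only a guess at what a replace-by-zero workaround could look like. The helper name `argmax_with_nan_as_zero` is made up for illustration and is not part of mifs:

```python
import numpy as np

def argmax_with_nan_as_zero(scores):
    """Index of the best score after replacing NaN entries with 0.0."""
    scores = np.asarray(scores, dtype=float)
    cleaned = np.where(np.isnan(scores), 0.0, scores)
    return int(np.argmax(cleaned))

# An all-NaN input no longer raises; the first index is returned.
print(argmax_with_nan_as_zero([np.nan, np.nan]))
# Caveat: a NaN turned into 0.0 beats every valid negative score.
print(argmax_with_nan_as_zero([np.nan, -0.2, -0.1]))
```

This avoids the exception, but because MRMR scores can be negative, a zero-filled NaN may be picked ahead of features with a valid score, so the resulting ranking should be treated with caution.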
Can you please try the latest version of the code and report back if you still encounter the bug? Thanks!
Hi! Thanks for making this nice tool!
I think the problem is still there:
feat_selector = mifs.MutualInformationFeatureSelector(method="MRMR",
                                                      categorical=False,
                                                      verbose=2)
feat_selector.fit(df_X.to_numpy(), y)
Auto selected feature #1 : 49, MRMR : 0.31481619092341173
Auto selected feature #2 : 318, MRMR : -0.06933974432983714
Auto selected feature #3 : 117, MRMR : -0.1471495713561355
Auto selected feature #4 : 50, MRMR : -0.13565179463330096
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-72-3b239f510761> in <module>
----> 1 feat_selector.fit(df_X.to_numpy(), y)
~/softs/miniconda3/envs/solv_py37/lib/python3.7/site-packages/mifs/mifs.py in fit(self, X, y)
208 break
209 MRMR = xy_MI[F] - bn.nanmean(fmm, axis=0)
--> 210 selected = F[bn.nanargmax(MRMR)]
211 S_mi.append(bn.nanmax(MRMR))
212
ValueError: All-NaN slice encountered
Can you help fix it? Thank you.
@danielhomola
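One possible way to guard the selection step shown in the traceback, sketched with NumPy in place of bottleneck. The names `xy_MI`, `fmm` and `F` come from the traceback, but their shapes are assumptions here (`fmm` is taken to be a 2-D array with one column per candidate in `F`), and `select_next` is a hypothetical helper, not mifs code:

```python
import warnings
import numpy as np

def select_next(xy_MI, fmm, F):
    """Return (feature, score) for the best candidate, or None if all scores are NaN."""
    F = np.asarray(F)
    with warnings.catch_warnings():
        # nanmean warns when a column is entirely NaN; the result is NaN.
        warnings.simplefilter("ignore", category=RuntimeWarning)
        scores = np.asarray(xy_MI, dtype=float)[F] - np.nanmean(fmm, axis=0)
    if np.all(np.isnan(scores)):
        return None  # caller can stop selecting instead of crashing
    best = int(np.nanargmax(scores))
    return int(F[best]), float(scores[best])

xy_MI = np.array([0.3, 0.1, 0.2])
print(select_next(xy_MI, np.array([[0.05, 0.4]]), [1, 2]))
print(select_next(xy_MI, np.array([[np.nan, np.nan]]), [1, 2]))
```

Returning `None` lets the caller break out of the selection loop once no candidate has a usable score. Whether stopping early is the right behaviour for mifs is a design decision for the maintainer.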