nbara/python-meegkit

Question on dss_line() usage

adswa opened this issue · 7 comments

adswa commented

Hi, its awesome that there is an implementation of the ZAPline algorithm in Python! I'm trying to use it, but I somehow fail to remove the noise components with it. I'm convinced that I must be doing something obvious wrong - any idea what it may be?

I'm using Elekta MEG data (322 MEG channels, after Signal Space Separation) that show an artifact at 60Hz stemming from a presentation display.
Here is its PSD:
psd022

Here's the code I am running:

>>> import mne
>>> from meegkit.dss import dss_line
# read in the data with mne-python
>>> raw_fname = Path(datadir) / f'sub-{subject}/meg' / f'sub-{subject}_task-memento_proc-sss_meg.fif'
>>> raw = mne.io.read_raw_fif(raw_fname)
>>> raw.load_data()
>>> data, artifact = dss_line(raw._data.T, fline=60, sfreq=1000)
Power of components removed by DSS: 0.00

Putting the data back into the MNE Raw Object and visualizing the psd plot shows an almost identical profile, with the 60Hz component being pretty much unaffected.
after_dss

Any idea what I might be doing wrong here?
Thanks much in advance!

nbara commented

Hi @adswa ! Yeah clearly something is wrong here.

Have you tried increasing the number of components to remove nremove to see if it changes anything? I haven't really had time to test zapline with MEG data (especially a combination of grad+mag).

What are those huge peaks around 300 hz ? Those are no 60hz harmonics.

If that doesn't work, I'd be interested to have a look at the data myself (if you don't mind sharing a file, or a part of it).

adswa commented

Have you tried increasing the number of components to remove nremove to see if it changes anything? I haven't really had time to test zapline with MEG data (especially a combination of grad+mag).

I have, and it doesn't really change things. I do run into IndexErrors with nremove greater 2 (see the traceback below). I assume this means that it doesn't find more than 2 components?

IndexError                                Traceback (most recent call last)
<ipython-input-35-e026a949c44c> in <module>
----> 1 data, artifact = dss_line(backup._data.T, fline=60, sfreq=1000, nremove=3)

~/env/magma/lib/python3.7/site-packages/meegkit/dss.py in dss_line(X, fline, sfreq, nremove, nfft, nkeep, show)
    219             xxxx[..., t] = xxxx[..., t] @ todss[:, idx_remove]
    220     elif X.ndim == 2:
--> 221         xxxx = xxxx @ todss[:, idx_remove]
    222 
    223     xxx, _, _, _ = tsr(xxx, xxxx)  # project them out

IndexError: index 2 is out of bounds for axis 1 with size 2

What are those huge peaks around 300 hz ? Those are no 60hz harmonics.

They come from head position indicator coils for motion correction to reconstruct the subjects head position. From skimming through the code, I thought the low pass filter that I think happens within

xx = smooth(X, sfreq / fline)

would result in them being ignored; I could try again with low-pass filtered data to see if that makes a difference.

If that doesn't work, I'd be interested to have a look at the data myself (if you don't mind sharing a file, or a part of it).

Sure, thanks much for taking a look! The data is quite large, so I've cropped a 500s snippet (which still is quite a chunk). You can download it here: https://we.tl/t-9mLnVNLHDQ (the link will be invalid after a short while)

I haven't really had time to test zapline with MEG data (especially a combination of grad+mag).

I quickly tried using only one type of channels (mag), but didn't have much success either:

before_dss
after_dss

nbara commented

I'll have a look. There's at least a couple of reasons I can think of.

Btw, just a quick comment, raw._data.T is probably not the best way to access the raw data (even though I'm guilty of using it myself sometimes), as it will include all kinds of system and misc channels (on top of the grad and mag channels).

raw.get_data(picks=['meg']) is probably safer.

adswa commented

Thanks a lot for the tip!

nbara commented

I can confirm that that was part of the problem. The other problem seems to be that your 60hz peak is not really at 60hz. Using 61Hz works well I think.

Here's the code I used:

import mne
import matplotlib.pyplot as plt
from meegkit.dss import dss_line

raw = mne.io.read_raw('./examples/sub-xxx_task-memento_proc-sss_meg.fif', preload=True)

# First problem : be careful how you pick your data from Raw objects
print(f'raw._data shape: {raw._data.shape}')
data = raw.get_data(picks=['meg'])
print(f'raw.get_data() shape: {data.shape}')

raw.crop(60, 260)  # we don't need that much signal

# Only take MEG channels (same as raw.get_data(picks=['meg']))
meg_ch_idx = mne.pick_types(raw.info, meg=True)
data = raw.get_data(picks=meg_ch_idx)
# Second problem: the peak seems to be closer to 61 hz
clean, artifact = dss_line(data.T, fline=61, sfreq=1000, nremove=10)

# Plot before/after
f, ax = plt.subplots(2, 2)
raw.plot_psd(ax=ax[0, :], show=False, fmin=1, fmax=150)
raw._data[meg_ch_idx] = clean.T  # put clean data back in raw
raw.plot_psd(ax=ax[1, :], show=True, fmin=1, fmax=150)

Screenshot 2021-10-15 at 16 50 28

adswa commented

Oh, wow... I should have checked 🤦 - thanks so much for taking a look and helping out, really appreciated!

nbara commented

Sure, happy to help!