SMorettini/CNNs-on-CHB-MIT

Why save when signalsBlock.shape[0] == 50?

Closed this issue · 6 comments

in DataserToSpectogram.py line 205
if (signalsBlock.shape[0] == 50):
    saveSignalsOnDisk(signalsBlock, nSpectogram)

What does 50 mean? Is it 50 samples of 30 seconds each, or something else that I don't understand? Please forgive me, I'm a beginner. Thank you very much for your reply.

in DataserToSpectogram.py line 205

Actually, it is in line 138 :)

What does 50 mean...?

It's an arbitrary number, chosen so that the spectrograms are saved across multiple files. Saving all the spectrograms together would create huge files, so I decided to save them in groups of 50: every time 50 spectrograms have been created, they are written to a file.

Thank you. I had added comments to my copy of the file, which is why my line number was wrong.
I'm not very experienced with processing EEG data. Is 50 the batch size?
I want to change 50 to 1, so that I can store one time-domain plot at a time. Is that right? How can I implement it, and which parts of the code are involved?

50 is the number of spectrograms saved in each file. If you save 1 per file you will end up with thousands of files. You can use any number, but I would suggest one that is neither too small (you will get a lot of files) nor too big (you will get a few huge files).
DataserToSpectogram.py is the script that pre-processes the data and produces the spectrograms.
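A minimal sketch of the grouping idea described above, not the repository's actual code: `save_in_groups`, `_flush`, the file naming, and the handling of a final partial group are all hypothetical choices made for illustration. Setting `group_size=1` gives one spectrogram per file, at the cost of many small files.

```python
import os
import tempfile

import numpy as np

GROUP_SIZE = 50  # spectrograms per file; an arbitrary trade-off, as in the thread


def _flush(block, out_dir, index):
    """Stack one group of spectrograms and write it to a single .npy file."""
    path = os.path.join(out_dir, f"spectrograms_{index}.npy")  # hypothetical naming
    np.save(path, np.stack(block))
    return path


def save_in_groups(spectrograms, out_dir, group_size=GROUP_SIZE):
    """Save equally shaped arrays in files of `group_size` items each.

    Returns the list of written paths. A final partial group is also saved;
    that is an assumption of this sketch, not a statement about the repo.
    """
    paths, block = [], []
    for spec in spectrograms:
        block.append(spec)
        if len(block) == group_size:
            paths.append(_flush(block, out_dir, len(paths)))
            block = []
    if block:
        paths.append(_flush(block, out_dir, len(paths)))
    return paths


# Usage with tiny stand-in arrays: 7 items in groups of 3 gives files of 3, 3 and 1.
with tempfile.TemporaryDirectory() as tmp_dir:
    written = save_in_groups((np.zeros((2, 3, 4)) for _ in range(7)), tmp_dir, group_size=3)
    group_sizes = [np.load(p).shape[0] for p in written]
```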

Thank you.

signals = np.zeros((22, 59, 114))
I have another question. What do 59 and 114 represent in the createSpectrogram function, and how are they calculated? Is 114 the frequency dimension? Is 59 the time dimension, and if so, shouldn't it be 30 seconds? If 59 is time and 114 is frequency, I still don't see where the STFT is used.
In particular, I don't understand how the dimension of 114 is produced: if it represents frequency, it should be derived from the raw data. Could you explain how this part works? Thank you.

I suggest you read the "Creation of the spectrograms" section of https://smorettini.github.io/general/2020/07/21/CNNs-on-CHB-MIT/ and also the original paper: https://www.sciencedirect.com/science/article/pii/S0893608018301485

  • 22 = Number of channels
  • 59 = Time bins. It's 59 because there are two values for each second. I don't remember why it is 59 and not 60; maybe the documentation of signal.spectrogram explains it.
  • 114 = Frequency bins. Some noisy frequencies are removed in the code: originally there are 129 frequencies, which are reduced to 114:
    Pxx = np.delete(Pxx, np.s_[117:123+1], axis=0)
    Pxx = np.delete(Pxx, np.s_[57:63+1], axis=0)
    Pxx = np.delete(Pxx, 0, axis=0)
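The reduction from 129 to 114 rows can be checked with a short sketch. It assumes a zero-filled array standing in for one channel's spectrogram; because nfft and fs are both 256, the frequency resolution is 1 Hz, so row index k corresponds to k Hz. The removed bands sit around 60 Hz and 120 Hz, which is presumably power-line interference and its first harmonic, plus the 0 Hz (DC) row.

```python
import numpy as np

# Stand-in for one channel: 129 frequency rows (0..128 Hz) by 59 time columns.
Pxx = np.zeros((129, 59))

# The higher band is deleted first, so the indices of the lower band are unaffected.
Pxx = np.delete(Pxx, np.s_[117:123 + 1], axis=0)  # drop 7 rows: 117 to 123 Hz
Pxx = np.delete(Pxx, np.s_[57:63 + 1], axis=0)    # drop 7 rows: 57 to 63 Hz
Pxx = np.delete(Pxx, 0, axis=0)                   # drop 1 row: 0 Hz (DC)

n_freqs = Pxx.shape[0]  # 129 - 7 - 7 - 1 = 114
```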

The STFT is computed in:

    freqs, bins, Pxx = signal.spectrogram(y, nfft=256, fs=256, return_onesided=True, noverlap=128)
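A small check of where 59 and 129 come from, run on random data rather than real EEG. It assumes a 30-second window (the length mentioned in the question) and relies on SciPy's default segment length of 256 samples when nperseg is not passed. With a 128-sample step and no padding, only 59 full segments fit into 7680 samples, centred at 0.5 s, 1.0 s, ..., 29.5 s.

```python
import numpy as np
from scipy import signal

fs = 256              # sampling rate in Hz, as passed in the call above
window_seconds = 30   # assumed window length
y = np.random.default_rng(0).standard_normal(fs * window_seconds)  # 7680 samples

freqs, bins, Pxx = signal.spectrogram(y, nfft=256, fs=fs, return_onesided=True, noverlap=128)

# nperseg is not passed, so SciPy falls back to 256 samples per segment.
nperseg, noverlap = 256, 128
expected_segments = (len(y) - noverlap) // (nperseg - noverlap)  # (7680 - 128) // 128

n_freq_rows, n_time_cols = Pxx.shape  # one-sided spectrum: nfft // 2 + 1 rows
```

The result is laid out as (frequency, time), so a transpose is needed to match the (22, 59, 114) layout of `signals`, where each channel holds a (time, frequency) slice.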

P.S.: I'm sorry if I don't remember all the details, but this repo is very old and years have passed since I last worked on it.
P.P.S.: Read the material I linked at the beginning of this answer carefully; it contains most of the information we used to write the code.