rformassspectrometry/Spectra

filterMzValues


Hi!!!

I'm opening this issue because I've noticed that the filterMzValues function is not working as I expected. Here is an example:

I have an MS2 spectrum with some noise. These are its m/z values:
> mz(myms2)[[1]]
[1] 52.34714 52.34981 52.35789 52.36570 52.36932 52.37316 52.41703 67.24783 120.08054

I want to delete all m/z values around 52, so I run the following code:
> myms2filtered <- filterMzValues(myms2, 52.3, tolerance = 0.3, keep = FALSE)

However, when I check the m/z values of the new object, I see that only one of them was excluded (the first one), even though I expected this code to delete all of the first 7:
> mz(myms2filtered)[[1]]
[1] 52.34981 52.35789 52.36570 52.36932 52.37316 52.41703 67.24783 120.08054

Is it possible to delete "in one shot" all of the first 7 m/z values, i.e. those within 52.3 +- 0.3?

Thanks!!! :)

Thanks for reporting, Mar. Yes, it makes sense to remove all peaks with a matching m/z and not just the best-matching one (which I guess is what is happening). I'll look into it.

Indeed, since we're using closest(input_mz, spectra_mz) in the code (with input_mz being the m/z value(s) defined by the user and spectra_mz the m/z values of a spectrum), only a single peak is identified for each input m/z in a spectrum. To find all peaks in the spectrum that match any of the input m/z values (given ppm and tolerance), the order of the parameters has to be reversed, i.e. closest(spectra_mz, input_mz). This will make the function slower, but it will deliver the expected results.
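To make the effect of the argument order concrete, here is a minimal base R sketch using the m/z values from the report. The helper `nearest_within()` is hypothetical and only imitates the "nearest value within a tolerance" idea; it is not the actual closest() implementation and ignores ppm.

```r
# m/z values of the example spectrum from the report
spectra_mz <- c(52.34714, 52.34981, 52.35789, 52.36570, 52.36932,
                52.37316, 52.41703, 67.24783, 120.08054)
input_mz <- 52.3
tolerance <- 0.3

# Hypothetical helper (not part of any package): for each value in `x`,
# return the index of the nearest value in `table`, or NA if the nearest
# value is further away than `tolerance`.
nearest_within <- function(x, table, tolerance) {
    vapply(x, function(v) {
        d <- abs(table - v)
        i <- which.min(d)
        if (d[i] <= tolerance) i else NA_integer_
    }, integer(1))
}

# Input m/z first: one result per input value, so at most one peak is
# flagged for each input m/z.
hit_per_input <- nearest_within(input_mz, spectra_mz, tolerance)
kept_old <- spectra_mz[setdiff(seq_along(spectra_mz), hit_per_input)]

# Spectrum m/z first: one result per peak, so every peak within the
# tolerance of an input m/z is flagged.
hit_per_peak <- nearest_within(spectra_mz, input_mz, tolerance)
kept_new <- spectra_mz[is.na(hit_per_peak)]
```

With the first order only the peak at 52.34714 is dropped, which is the behaviour reported above; with the reversed order all seven peaks around 52.3 are dropped. The reversed call does one lookup per peak instead of one per input m/z, which is presumably where the extra cost comes from.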

Thanks @jorainer!!!
And what about using the function between() instead of closest()? Could that be an option? Or would between() be even slower than closest(spectra_mz, input_mz)?

Note that you could use filterMzRange, which in fact uses the between function. A solution based on between would be difficult/slow if the input parameter mz is of length > 1.
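As a rough illustration of why a range check gets more expensive with several input m/z values, the base R sketch below compares every peak against every target. The second target (120.08) and the shared tolerance are assumptions made up for the example; this is not how the package implements it.

```r
spectra_mz <- c(52.34714, 52.34981, 52.35789, 52.36570, 52.36932,
                52.37316, 52.41703, 67.24783, 120.08054)
# Two target m/z values; the second one is purely illustrative.
input_mz <- c(52.3, 120.08)
tolerance <- 0.3

# Peaks x targets matrix of absolute differences: its size grows with
# length(spectra_mz) * length(input_mz).
within_tol <- abs(outer(spectra_mz, input_mz, "-")) <= tolerance

# A peak matches if it lies within the tolerance of any target.
matches_any <- rowSums(within_tol) > 0
remaining <- spectra_mz[!matches_any]
```

Here only the peak at 67.24783 remains. For a single range the comparison is one vectorised pass over the peaks, which is why a single-range filter is the cheap case.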

Can you please install the fix and let me know if it works?

BiocManager::install("RforMassSpectrometry/Spectra", ref = "RELEASE_3_16")

Ah! Then I misunderstood how filterMzRange() works. I guess it is also possible to use this function with the argument keep = FALSE, isn't it?
I can try filterMzRange() and also install the fix to see what happens. I'll let you know.

So, filterMzRange keeps all peaks that are within the provided lower and upper m/z values (it can only be a single range). At present it does not have a keep parameter, but thinking it over, adding one might actually make sense: it would allow either keeping all peaks within the range or removing them.
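The idea of a range filter with a keep switch can be sketched in base R as follows. The function `filter_mz_range()` is a hypothetical stand-in that works on a plain numeric vector of m/z values, not the filterMzRange method itself, and the range 52.0 to 52.6 is simply 52.3 +- 0.3 from the report.

```r
# Hypothetical stand-in for a range filter on a plain vector of m/z values.
# keep = TRUE returns the values inside the range (bounds included),
# keep = FALSE returns the values outside of it.
filter_mz_range <- function(mz, range, keep = TRUE) {
    stopifnot(length(range) == 2, range[1] <= range[2])
    in_range <- mz >= range[1] & mz <= range[2]
    if (keep) mz[in_range] else mz[!in_range]
}

spectra_mz <- c(52.34714, 52.34981, 52.35789, 52.36570, 52.36932,
                52.37316, 52.41703, 67.24783, 120.08054)

inside  <- filter_mz_range(spectra_mz, c(52.0, 52.6))
outside <- filter_mz_range(spectra_mz, c(52.0, 52.6), keep = FALSE)
```

With keep = FALSE the seven noise peaks around 52.3 are removed and only 67.24783 and 120.08054 remain, which is the result asked for in the original report.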

I'll quickly add the keep parameter to filterMzRange as well, then you can test it :)

You would need to install the current devel branch to test the new filterMzRange with the keep parameter.

BiocManager::install("RforMassSpectrometry/ProtGenerics")
BiocManager::install("RforMassSpectrometry/MsCoreUtils")
BiocManager::install("RforMassSpectrometry/Spectra")

Super! Now it's working :)
For now I think I'll use the function filterMzValues() with a specific m/z value + tolerance, but if in the future I need something faster (maybe when I work with a larger amount of data) I'll try the function filterMzRange() with a specific m/z range.
Thanks a lot!!