In search of the precursorMZ values for MS2
chufz opened this issue · 13 comments
Hi,
I am in search of a slot for the precursorMZ values for MS2 data, however, I could not find it.
Is there any chance to read out these values for the scans? This would be necessary information for any spectral library application.
Best,
Carolin
Good day there,
Do I understand correctly that you are working with DDA data?
Best,
Hi Matteo,
Yes, it is PASEF DDA data, and I will need to read out the precursor masses.
See also issue rformassspectrometry/MsBackendTimsTof#18
I would be happy to have this implemented in opentims.
OK, but I have to contact my collaborator. I don't yet know from which language you want to access this information or what exactly the output format should look like. In fact, this looks like a pretty simple thing to code oneself in Python to meet your needs exactly: do you need some help with that? If so, what exactly should the output look like?
I now have some DDA data to play with and am in contact with people from Bruker to ask them whether I got the format right. Stay tuned.
Thanks a lot, Matteo :)
I think it is not so simple.
The problem is that PasefFrameMsMsInfo describes only the positions of the fragments in the raw data.
The precursors that these correspond to have been fragmented.
Obviously, there is some sort of expectation that, a frame or two before the fragmentation was triggered, the MS1 precursor data were acquired and analyzed by the instrument, as this is the underlying principle of online Parallel Accumulation-Serial Fragmentation.
But the fragmentation scheduling algorithm is apparently more complicated than that.
```
           MS2Frame  ScanNumBegin  ScanNumEnd
Precursor
1                66           770         796
1                67           770         796
1                68           770         796
1                70           770         796  # See, there was a hole in one precursor.
1                71           770         796
1                72           770         796
1                73           770         796
2                98           814         840  # Here there was some break for switching or something
2                99           814         840
2               100           814         840
```
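The hole and the break in the listing above can also be located programmatically. A minimal sketch, assuming the (Precursor, MS2Frame) pairs have already been read from PasefFrameMsMsInfo; the helper name `split_into_runs` is hypothetical and not part of opentims:

```python
from collections import defaultdict


def split_into_runs(rows):
    """Group MS2 frames by precursor and split them into runs of consecutive frames.

    `rows` is an iterable of (precursor_id, frame_id) pairs.
    """
    frames_by_precursor = defaultdict(list)
    for precursor, frame in rows:
        frames_by_precursor[precursor].append(frame)

    runs = {}
    for precursor, frames in frames_by_precursor.items():
        frames = sorted(set(frames))
        current = [frames[0]]
        precursor_runs = [current]
        for frame in frames[1:]:
            if frame == current[-1] + 1:
                current.append(frame)
            else:
                # A missing frame number starts a new run.
                current = [frame]
                precursor_runs.append(current)
        runs[precursor] = precursor_runs
    return runs


# The (Precursor, MS2Frame) pairs from the listing above.
rows = [(1, 66), (1, 67), (1, 68), (1, 70), (1, 71), (1, 72), (1, 73),
        (2, 98), (2, 99), (2, 100)]
runs = split_into_runs(rows)
print(runs)
```

For precursor 1 this yields two runs, split at the missing frame 69.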
I could imagine that you would be interested not simply in a few frames with MS1 data but, likely, in all of them in the current dataset. This would call for some form of clustering, which is likely performed by MaxQuant and Co.
I will organize one more meeting to clear this up with the Bruker guys.
I mean, the main problem is that, without clustering, I do not know how many frames I should extract from the MS1 signals to give the best possible answer to the question about the identity of the precursors that were fragmented. It could be that, while the signal was still rising, the algorithm on board the instrument did not schedule the ion for fragmentation. Likely, the same could happen again while the signal was going down. Also, the answer would vary even more in the presence of coeluting ions that the algorithm would find more interesting to schedule for fragmentation (marvelous DDA at its full capacity).
So, the big question to you is: if you are happy with any estimates of MS1 precursors, then you already have them in the table (you can translate frames to retention times with the .frame_to_retention_time method and scans to inverse ion mobilities with the .scan_to_inv_ion_mobility method of the OpenTIMS object). These are merely statistics, but correct ones, and obviously these precursor signals cannot be observed, as they were fragmented. If you want more, you should include raw data from the neighbourhood, and for that one needs either clustering or, at the very least, extraction of the very close-by sections of data.
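One crude stand-in for clustering is to take the MS1 frames closest to the frames in which a precursor was fragmented. A sketch, assuming the MS1 frame numbers are already known and sorted; the helper `neighbouring_ms1_frames`, the `margin` parameter, and the frame numbers in the demo are all hypothetical and only illustrate the idea:

```python
from bisect import bisect_left, bisect_right


def neighbouring_ms1_frames(ms1_frames, ms2_frames, margin=2):
    """Return the MS1 frames around a precursor's MS2 frames.

    Keeps every MS1 frame lying between the first and last MS2 frame,
    plus `margin` MS1 frames on each side.
    `ms1_frames` must be sorted in ascending order.
    """
    lo = bisect_left(ms1_frames, min(ms2_frames))
    hi = bisect_right(ms1_frames, max(ms2_frames))
    return ms1_frames[max(lo - margin, 0):hi + margin]


# Illustrative numbers only: MS1 frame ids and the MS2 frames of one precursor.
ms1_frames = [43, 54, 65, 76, 87, 98]
ms2_frames = [66, 67, 68, 70, 71, 72, 73]
selected = neighbouring_ms1_frames(ms1_frames, ms2_frames, margin=1)
print(selected)
```

How large the margin should be is exactly the open question described above; a fixed value is only a starting point.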
If one would like to replicate the conversion done by, e.g., msconvert, how would one proceed?
E.g., a filter like:
https://user-images.githubusercontent.com/5803621/155004773-b72aac33-107c-4546-aeca-4d2fe9f7424e.png
that generates approximate precursor values (including m/z)
I don't think I know what MSConvertGui does. Where can we find it?
MSConvert is available here:
https://proteowizard.sourceforge.io/download.html
but I did not find good documentation for that part.
Browsing the source code indicates that there might be some pointers:
https://github.com/search?q=repo%3AProteoWizard%2Fpwiz+scanSumming&type=code
https://github.com/ProteoWizard/pwiz/blob/55889be8e5f48ba44640bf0d93f00be3f4b0824a/pwiz_aux/msrc/utility/vendor_api/Bruker/timsdata_cpp_pwiz.h
I think I am a bit lost:
you are writing on a post about precursor position for MS2 fragments.
Does this filter have anything to do with it?
Precursor summing has been broken in msconvert for a while now, and no one has replied to the issue I reported: ProteoWizard/pwiz#2566.
@MatteoLacki, if you want to implement spectra summing of the "same" precursor, then the latest Bruker API does include methods to extract a spectrum across all frames for the same Precursor ID. There is also a method to get a "quasi-profile" spectrum, which returns the intensities of MS/MS peaks on a fixed m/z grid, so that spectral summing can be done externally.
But if I understand @timosachsenberg's question correctly, then they don't want a summed spectrum, but just "aggregated" precursor information. In that case it can all be done from the information present in the sqlite tables.
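A rough sketch of such an aggregation with Python's built-in sqlite3 module. The table and column names used here (Frames.Id, Frames.Time, Precursors.Id, Precursors.MonoisotopicMz, Precursors.Charge, and the PasefFrameMsMsInfo columns) are assumptions about the layout of analysis.tdf and should be checked against the actual file. The demo runs on a small in-memory database with made-up values, not on real data:

```python
import sqlite3

# Assumed schema; verify the names against your own analysis.tdf.
QUERY = """
SELECT p.Id                AS precursor,
       p.MonoisotopicMz    AS mono_mz,
       p.Charge            AS charge,
       MIN(f.Time)         AS rt_start,
       MAX(f.Time)         AS rt_end,
       MIN(m.ScanNumBegin) AS scan_begin,
       MAX(m.ScanNumEnd)   AS scan_end,
       COUNT(*)            AS n_ms2_frames
FROM PasefFrameMsMsInfo AS m
JOIN Precursors AS p ON p.Id = m.Precursor
JOIN Frames     AS f ON f.Id = m.Frame
GROUP BY p.Id
ORDER BY p.Id
"""


def aggregate_precursors(conn):
    """Return one row of aggregated information per precursor."""
    return conn.execute(QUERY).fetchall()


# Tiny in-memory stand-in for analysis.tdf with made-up values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Frames (Id INTEGER PRIMARY KEY, Time REAL);
CREATE TABLE Precursors (Id INTEGER PRIMARY KEY, MonoisotopicMz REAL, Charge INTEGER);
CREATE TABLE PasefFrameMsMsInfo (Frame INTEGER, ScanNumBegin INTEGER,
                                 ScanNumEnd INTEGER, Precursor INTEGER);
""")
conn.executemany("INSERT INTO Frames VALUES (?, ?)",
                 [(66, 7.0), (67, 7.1), (68, 7.2), (98, 10.5)])
conn.executemany("INSERT INTO Precursors VALUES (?, ?, ?)",
                 [(1, 500.25, 2), (2, 650.5, 3)])
conn.executemany("INSERT INTO PasefFrameMsMsInfo VALUES (?, ?, ?, ?)",
                 [(66, 770, 796, 1), (67, 770, 796, 1),
                  (68, 770, 796, 1), (98, 814, 840, 2)])

result = aggregate_precursors(conn)
for row in result:
    print(row)
```

With real data, the connection would instead be opened on the analysis.tdf file inside the .d folder, provided the schema matches these assumptions.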
Thanks for the insight. I have to admit that I still need to catch up on the details. For my MS/MS identification use case, simple summing was sufficient.