Peptides with multiple N-glycan sites
I noticed that pGlyco does not seem to assign more than one glycan to a glycopeptide at a time. For example, for the sequence NGTGPCPNVSTVQCTHGIK, the pGlyco output assigns a glycan to only one of the two sites per spectrum, even when the spectrum clearly shows that both sites are occupied by their respective glycans simultaneously. Can we resolve this by modifying the search parameters?
Thank you.
@lindaz711 That is a very good question for pGlyco and other glycopeptide search engines. The answer is currently "no".
As you found, it is not difficult for a search engine to determine whether two sites are glycosylated by using "b/y + HexNAc" ions. The hard part is assigning a glycan composition to each site. For example, if a peptide has two glycosylated sites, say S1 and S2, and the total glycan composition is Hex(10)HexNAc(4), there are many possible site-specific glycan combinations:
S1: Hex(7)HexNAc(2), S2: Hex(3)HexNAc(2)
S1: Hex(6)HexNAc(2), S2: Hex(4)HexNAc(2)
S1: Hex(5)HexNAc(2), S2: Hex(5)HexNAc(2)
S1: Hex(4)HexNAc(2), S2: Hex(6)HexNAc(2)
S1: Hex(3)HexNAc(2), S2: Hex(7)HexNAc(2)
......
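To illustrate how quickly these combinations multiply, here is a minimal Python sketch that enumerates every way to split a total composition across two sites. The function name `split_composition` and the `min_per_site` parameter are hypothetical and are not part of pGlyco. The per-site minimum of Hex(3)HexNAc(2) (the N-glycan core) is an assumption made for the example, not something pGlyco is known to enforce.

```python
from itertools import product


def split_composition(total, min_per_site=None):
    """Yield (site1, site2) compositions whose sum equals `total`.

    total        -- dict of monosaccharide name -> count, e.g. {"Hex": 10}
    min_per_site -- optional dict of minimum counts every site must carry
                    (hypothetical constraint, used here for the N-glycan core)
    """
    min_per_site = min_per_site or {}
    names = sorted(total)
    ranges = []
    for name in names:
        lo = min_per_site.get(name, 0)
        hi = total[name] - lo  # leave at least `lo` units for the other site
        ranges.append(range(lo, hi + 1))
    for counts in product(*ranges):
        site1 = dict(zip(names, counts))
        site2 = {name: total[name] - site1[name] for name in names}
        yield site1, site2


total = {"Hex": 10, "HexNAc": 4}
core = {"Hex": 3, "HexNAc": 2}  # assumed minimum per site

constrained = list(split_composition(total, core))
unconstrained = list(split_composition(total))

for site1, site2 in constrained:
    print("S1:", site1, "S2:", site2)
print(len(constrained), "splits with the core constraint")
print(len(unconstrained), "splits without it")
```

With the core constraint the sketch yields exactly the five combinations listed above. Without it the count grows to 55, and a fragmentation spectrum has to discriminate between all of them.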
With HCD, few fragment ions can help determine which combination is correct. In this situation, EThcD may provide additional information, and we plan to support EThcD data analysis in the coming months.
@jalew188 Thank you for the clarification.
I understand that the number of possible site-specific glycan assignments grows exponentially when multiple sites are glycosylated simultaneously. At the moment, though, pGlyco seems to output only one glycan assignment per peptide, even when the majority of that peptide may be glycosylated at multiple sites simultaneously. Can I trust that kind of assignment, even with a low FDR? Would it be feasible for pGlyco to output all possible assignments for peptides with multiple glycosylation sites and score them based on what the HCD spectra suggest, even if that leads to several candidate assignments with similar (inconclusive, middling) FDRs?
Thank you.
@lindaz711 That's a good suggestion. The current algorithm does not support this kind of feature; we will consider it for a later version.
For the current version, you may need to write some scripts to locate additional sites that pGlyco has missed.
Hi @jalew188. I work with Linda and would be the one to "write some scripts to locate additional sites that are mismatched by pGlyco". I'm not sure what you mean, though. If the glycopeptide has already been identified incorrectly as carrying a single glycan, how would we write a script to identify the correct glycopeptide with two or more glycosylation sites? Could you please elaborate?
@dbrentw You are right: pGlyco2 is designed to identify only a single glycosylated site per peptide.
However, if the combined glycan composition is in the glycan database, pGlyco has a chance of matching it. The hard part is then partitioning that composition between the two sites, as I said to Linda.
If the combined glycan composition is not in the glycan database, pGlyco will fail and there is no way to recover the correct sites, as you said.
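As a possible starting point for such a script, the following Python sketch flags identified peptides that contain more than one potential N-glycosylation site, so their spectra can be re-examined by hand. It uses the canonical N-X-S/T sequon with X not proline. The function names and the plain list of peptide strings are assumptions for illustration; the sketch does not read pGlyco output files and does not resolve which glycan sits on which site.

```python
import re

# Canonical N-glycosylation sequon: N, then any residue except P, then S or T.
# The lookahead lets overlapping sequons (e.g. "NNSS") both be found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(peptide):
    """Return 1-based positions of every candidate N-glycosylation site."""
    return [match.start() + 1 for match in SEQUON.finditer(peptide)]


def flag_multi_site(peptides):
    """Map each peptide with two or more candidate sites to those sites."""
    flagged = {}
    for peptide in peptides:
        sites = sequon_positions(peptide)
        if len(sites) > 1:
            flagged[peptide] = sites
    return flagged


# Hypothetical input: peptide sequences taken from a search result table.
peptides = ["NGTGPCPNVSTVQCTHGIK", "AANPTK", "LVNQSGK"]
print(flag_multi_site(peptides))
```

For the peptide from the original question, this reports candidate sites at positions 1 and 8. A single-glycan identification on such a peptide is one where the reported composition may actually be the sum of two glycans.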