Challenges of Motif Analysis
@aarmey during my lab meeting I realized I was trying to solve separate problems with the same solutions, when they should be addressed individually. I put together this table to summarize what we discussed and added a few points. Let me know your thoughts, especially on problems 3 and 5. Problem 4 might not be critical to solve, but I wanted to share it because I find it an interesting alternative that solves a relevant problem in kinase specificity.
Problem | Solution(s) |
---|---|
1. We only have a within-cluster, relative measure of how good an upstream kinase prediction is | Use the FD of a known good match as a positive control (e.g. kinase inhibitor screening data) |
2. Motifs enriched in amino acids with the same biophysical property may not be realistic | Randomize by position as a negative control |
3. Several motifs within a large cluster (e.g. over 1000 peptides) | (A) Binomial method to search for individual motifs** (B) Avoid assigning upstream kinase & focus on what's inside the cluster (KSEA to find enriched kinases) (C) Re-fit clustering method with only this cluster to generate sub-clusters (D) Improve method to be able to fit with a larger number of clusters |
4. PSPL doesn't provide coupling information (the "ideal" PSPL motif could be a horrible motif for a kinase) | Phage display exposes a kinase to all possible amino acid combinations of a motif, thereby generating a ranking of which peptides are most phosphorylated by that kinase. We would end up with a score for each peptide in a cluster for a given kinase. Phage display is kind of the "new thing" in kinase specificity, so in the next few years several kinases will have been profiled using this method. If we had a tyrosine cluster, we could use the phage display profiles currently available as an example |
5. Only a few position/residue pairs are determinants of specificity, but FD is a global measure | (A) Phage display would address this as well. (B) Only calculate the distance over favored residues. (C) Find an alternative scoring method. |
6. We currently don't show how using sequence information changes peptide assignments | Heatmap showing assignments, then highlight a particular peptide whose assignment changes after using motifs and rationalize why the new cluster is a better fit |
**By binomial method I mean using this strategy to search for enriched motifs within a cluster.
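
To make option 3(A) a bit more concrete, here is a minimal sketch of what a binomial search could look like. It is only an illustration under several assumptions: peptides are aligned and of equal length, the background is the full dataset (including the cluster itself), and enrichment is tested one position/residue pair at a time. The names `binom_sf` and `enriched_pairs`, the `alpha` cutoff, and the toy 5-mers are all made up for this example and are not taken from our code or from any published tool.

```python
from collections import Counter
from math import exp, lgamma, log


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space so that
    clusters with more than 1000 peptides do not overflow."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = log(p), log(1.0 - p)
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += exp(log_pmf)
    return min(total, 1.0)


def enriched_pairs(cluster, background, alpha=1e-3):
    """Return (position, residue, p_value) for every pair that occurs in
    `cluster` more often than the background frequency would predict.

    Assumption: all peptides are aligned strings of the same length.
    No multiple-testing correction is applied in this sketch.
    """
    n = len(cluster)
    hits = []
    for pos in range(len(cluster[0])):
        fg = Counter(pep[pos] for pep in cluster)
        bg = Counter(pep[pos] for pep in background)
        for res, k in fg.items():
            p_bg = bg[res] / len(background)
            p_val = binom_sf(k, n, p_bg)
            if p_val < alpha:
                hits.append((pos, res, p_val))
    return sorted(hits, key=lambda h: h[2])


# Toy data: every cluster peptide has P at position 2.
cluster = ["AAPAA", "GTPSG", "LKPEL", "QRPDQ"]
background = cluster + ["AAAAA"] * 16
for pos, res, p_val in enriched_pairs(cluster, background, alpha=0.01):
    print(pos, res, p_val)
```

To pull out several motifs from one large cluster, one could fix the most significant pair, keep only the peptides that carry it, and repeat the search on that subset. That iterative step is left out here to keep the sketch short.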
Yes! Exactly what I was thinking, too.
I'd just add that it may not be reasonable to expect a motif to always match a kinase in complex datasets like CPTAC. In a simpler dataset like the inhibitor panel, it is more reasonable to expect kinase motifs. However, motifs in general could relate to other biological patterns that don't align with kinase regulation.