Challenges of Motif Analysis

Question

Closed this issue 3 years ago · 1 comments

@aarmey during my lab meeting I realized I was trying to solve separate problems with the same solution, whereas they should be addressed individually, so I put together this table to summarize what we discussed and added a few points. Let me know your thoughts, especially on problems 3 and 5. Problem 4 might not be critical to solve, but I wanted to share it because I find this an interesting alternative that addresses a relevant problem in kinase specificity.

Problem	Solution(s)
1. Only a within-cluster relative measure of how good an upstream kinase prediction is	Use the FD of a known good match as a positive control (e.g. kinase inhibitor screening data)
2. Motifs enriched in amino acids with the same biophysical property may not be realistic	Randomize by position as a negative control
3. Several motifs within a large cluster (e.g. over 1000 peptides)	(A) Binomial method to search for individual motifs** (B) Avoid assigning upstream kinase & focus on what's inside the cluster (KSEA to find enriched kinases) (C) Re-fit clustering method with only this cluster to generate sub-clusters (D) Improve method to be able to fit with a larger number of clusters
4. PSPL doesn't provide coupling information (the "ideal" PSPL motif could be a horrible motif for a kinase)	Phage display exposes a kinase to all possible amino acid combinations of a motif, thereby generating a ranking of which peptides are most phosphorylated by that kinase. We would end up with a score for each peptide in a cluster for a given kinase. Phage display is something of the "new thing" in kinase specificity, so several kinases will likely have been profiled with this method in the next few years. If we had a tyrosine cluster, we could use the phage display profiles currently available as an example
5. Only a few position/residue pairs are determinants of specificity, but FD is a global measure	(A) Phage display would address this as well. (B) Only calculate the distance over favored residues. (C) Find an alternative scoring method.
6. Currently not showing how using sequence information changes peptide assignments	Heatmap showing assignments, then highlight a particular peptide whose assignment changes after using motifs and rationalize why the new cluster is a better fit

**By binomial method I mean using this strategy to search for enriched motifs within a cluster.

Answer 1 · 2021-03-31T03:04:31.000Z

Yes! Exactly what I was thinking, too.

I'd just add that it may not be reasonable to expect a motif to always match a kinase in complex datasets like CPTAC. In a simpler dataset like the inhibitor panel, it is more reasonable to expect kinase motifs. More generally, motifs could reflect other biological patterns that don't align with kinase regulation.