[request] Show candidate gene lists in context of GWAS SNPs
mariacos opened this issue · 8 comments
This kind of track would show specific genes from a list (e.g., differentially expressed, mouse homolog has knockout phenotype, monogenic associations from OMIM, presence on a list of predicted effectors) in the context of genetic associations.
Image moved to Google Drive, shared w/ Andy and Maria - Ryan.
This is an unpublished, confidential image from a collaborator. He describes it as:
"This is the plot with the bed tracks corresponding to instances where human SNPs intersect human genes for which we have identified mouse orthologues that are differentially expressed in different bone cell types. Please note we are not attempting to map human SNPs to conserved human-mouse gene domains or anything fancy like that. This is just meant to provide an intuitive way of prioritising candidate causal genes on the basis of having a GWAS signal overlapping a gene that has a mouse orthologue that is differentially expressed in a particular bone cell.
Light grey – denotes a SNP intersecting the gene that has BMD association p-value > 5E-5.
Dark grey – denotes a SNP intersecting the gene that has BMD association p-value between 5E-5 and 6E-9.
Blue – denotes a SNP intersecting the gene that has BMD association p-value < 6E-9. (Revised GWAS significance as estimated by Kemp et al., Nature Genetics 2017)
The rationale behind this representation is that our paper provides evidence that the transcription profiles of certain bone cell types are enriched for BMD- and height-associated human orthologues, and for monogenic disease genes that result in abnormal BMD, growth, or height."
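The colour legend and the "SNP intersects gene" rule quoted above can be sketched in a few lines. This is only an illustration, not the collaborator's actual pipeline: the `Gene` and `Snp` record types, the function names, and the sample coordinates are all hypothetical, and the handling of p-values that fall exactly on a threshold (treated here as dark grey) is an assumption, since the legend does not say.

```python
from dataclasses import dataclass

# Thresholds come from the legend in the quoted description.
SUGGESTIVE_P = 5e-5
GENOME_WIDE_P = 6e-9  # revised significance threshold cited in the legend


@dataclass(frozen=True)
class Gene:  # hypothetical record type
    name: str
    chrom: str
    start: int  # 0-based, half-open interval, as in BED
    end: int


@dataclass(frozen=True)
class Snp:  # hypothetical record type
    rsid: str
    chrom: str
    pos: int  # 0-based position
    p_value: float


def color_tier(p_value: float) -> str:
    """Map a BMD association p-value to the legend's colour."""
    if p_value < GENOME_WIDE_P:
        return "blue"
    if p_value <= SUGGESTIVE_P:  # assumption: boundary values count as dark grey
        return "dark grey"
    return "light grey"


def annotate_intersections(genes, snps):
    """Return (gene, rsid, colour) for every SNP that falls inside a gene."""
    hits = []
    for gene in genes:
        for snp in snps:
            same_chrom = snp.chrom == gene.chrom
            if same_chrom and gene.start <= snp.pos < gene.end:
                hits.append((gene.name, snp.rsid, color_tier(snp.p_value)))
    return hits


# Made-up example data, for illustration only.
genes = [Gene("GENE_A", "1", 100, 200)]
snps = [
    Snp("rs1", "1", 150, 1e-10),  # inside GENE_A, genome-wide significant
    Snp("rs2", "1", 250, 1e-10),  # outside GENE_A, so not reported
    Snp("rs3", "1", 199, 1e-3),   # inside GENE_A, not significant
]
for row in annotate_intersections(genes, snps):
    print(row)
```

The nested loop is quadratic; for genome-wide data an interval index or sorted sweep would be the natural replacement.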
Image moved to Google Drive, shared w/ Andy and Maria - Ryan.
Our collaborator's comment:
"This is my dream LZ plot that further annotates genes that cause monogenic skeletal diseases (of which there are 461 and counting) – denoted by a human skeleton, and also instances where knockout mouse models with abnormal skeletons exist for the corresponding human orthologues (n~1300). I can provide you with these lists of Ensembl IDs as I have them on hand. Marrying up all this data will effectively summarise most of the work we have done in the paper and will provide a one-stop shop that both mouse and human genetics people can use."
Interesting: thanks for the sample images! (I was halfway through writing a follow-up question when many of the answers magically appeared on screen. That's the dream, right?)
FYI, you mention the image is confidential? This issue tracker is public; let me know if that's a concern.
A major part of this task would be connecting the gene lists to the GWAS in a meaningful way. To help me read the proposed figure, could we possibly drill into the meaning of each row?
- It looks like the key track is the second one. A big aspect of this plot will be understanding how many candidate genes and lists there are. Am I correct that the file is formatted as "several candidate gene/variant lists, with each list generated for a specific tissue"?
- Some of the light grey boxes in this track are smaller than a gene; others seem much wider than a single SNP. Is there more than one kind of interval or length scale in the boxes on this track?
- What is the third row? (labels with arrows) Is this meant to be analogous to, say, the LZ.js GWAS catalog hits track? ("variants with strong claims of significance for any trait")
- What is the fourth row? (Human and mouse icons; it's not clear how they map to the x-axis.) Since this is part of the "dream plot", understanding this box might point to another way to showcase connections between tracks, to help convey meaning.
- What are the red boxes around genes in the last row, where introns and exons are displayed? I assume there is a special reason why those genes are called out for attention?
I can imagine that each researcher would have their own candidate gene lists as they develop a specific hypothesis in the context of known data. Do we envision the portal taking responsibility for ingesting and listing these tracks from a preset list? Or is the preference to let people add their (hopefully well-formatted) BED files from the browser?
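Whichever route is chosen, user-supplied BED files would need some validation before they can be drawn as a track. A minimal sketch of what "well-formatted" could mean follows. It assumes tab-delimited BED with at least the three required columns (chrom, start, end) plus an optional name in column 4; the function name `parse_bed` and the returned tuple layout are hypothetical, not part of LocusZoom or the portal.

```python
def parse_bed(text):
    """Parse BED text into (chrom, start, end, name) tuples.

    Raises ValueError, naming the offending line, when a record is malformed.
    """
    records = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Skip blanks, comments, and UCSC-style header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")  # assumption: tab-delimited input
        if len(fields) < 3:
            raise ValueError(f"line {line_no}: expected at least 3 columns")
        chrom = fields[0]
        try:
            start, end = int(fields[1]), int(fields[2])
        except ValueError:
            raise ValueError(f"line {line_no}: start/end must be integers")
        # BED intervals are 0-based and half-open, so start must not exceed end.
        if start < 0 or end < start:
            raise ValueError(f"line {line_no}: invalid interval {start}-{end}")
        name = fields[3] if len(fields) > 3 else None
        records.append((chrom, start, end, name))
    return records


# Made-up example input, for illustration only.
example = "track name=candidates\nchr1\t100\t200\tGENE_A\n# a comment\nchr2\t5\t10\n"
for record in parse_bed(example):
    print(record)
```

Reporting the line number with each error gives the person uploading the file something actionable, which matters more when the files come from many different researchers.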
Also, thanks again! I've updated the issue title based on our discussion yesterday. If your new information provides better clarity, feel free to change the title as appropriate.
I moved the two images into Google Drive, shared specifically w/ Andy and Maria. If anyone else needs to view them, we can add access as needed.
Thanks Ryan. After a bit of searching, I've added those files to our internal LocusZoom team folder on Google Drive, to help me keep track later.
A prototype was released in December 2021, and we have had no word back since. Many of the required data join improvements were merged in LZ 0.14.
Closing this issue unless revisited.