Extract new clinical gene sets from the literature
kmaynard12 opened this issue · 6 comments
- snRNAseq MDD https://pubmed.ncbi.nlm.nih.gov/32341540/
- snRNAseq ASD https://pubmed.ncbi.nlm.nih.gov/31097668/
- DOUBLE CHECK: bulk sex differences SCZ https://www.biologicalpsychiatryjournal.com/article/S0006-3223(21)01180-X/fulltext
- snRNAseq SCZ https://www.medrxiv.org/content/10.1101/2020.11.06.20225342v1.full
- snRNAseq and spatial SCZ https://www.biorxiv.org/content/10.1101/2020.11.17.386458v2
- Literature search for other new datasets that have been published since 2020 (single cell or bulk)
- Find relevant supplementary tables for differentially expressed genes between cases and controls (and by cell type for snRNAseq)
- Extract gene lists and decide on a cutoff value for significant genes to include for registration with the spatial data. @lcolladotor might be able to help with choosing an appropriate cutoff (this might also be relevant to @shkwon17 for the AD project).
- How do we use a gene set when the authors don't provide Ensembl IDs? maps_ids function?
- Read the ASD study I added and figure out how they generated their gene sets. Do they have them by cluster, and could we then make an entirely separate heatmap comparing our data to theirs?
- Drop the double white matter clusters and the meninges cluster? Does the enrichment use the top 100 genes, and should we use more?
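As a rough illustration of the extraction steps listed above (filter a supplementary table by a significance cutoff, then convert gene symbols to Ensembl IDs), here is a minimal Python sketch. Everything in it is an assumption for illustration only: the column names, the toy table, the placeholder IDs, the 0.05 FDR default, and the `extract_gene_sets` helper are hypothetical and are not taken from any of the linked papers or from the repository.

```python
import csv
import io

# Hypothetical excerpt of a supplementary table; the column names and values are made up.
SUPP_TABLE = """gene_symbol,cell_type,log2FC,FDR
GENE_A,Excit,0.8,0.01
GENE_B,Excit,-0.5,0.20
GENE_C,Astro,1.2,0.04
GENE_D,Astro,0.3,0.03
"""

# Hypothetical symbol -> Ensembl lookup with placeholder IDs.
# In practice this would be built from an annotation resource.
SYMBOL_TO_ENSEMBL = {
    "GENE_A": "ENSG_FAKE_0001",
    "GENE_B": "ENSG_FAKE_0002",
    "GENE_C": "ENSG_FAKE_0003",
}


def extract_gene_sets(table_text, fdr_cutoff=0.05):
    """Return ({cell_type: set of Ensembl IDs}, [symbols that could not be mapped])."""
    gene_sets = {}
    unmapped = []
    for row in csv.DictReader(io.StringIO(table_text)):
        # Keep only genes that pass the (assumed) FDR cutoff.
        if float(row["FDR"]) >= fdr_cutoff:
            continue
        ensembl_id = SYMBOL_TO_ENSEMBL.get(row["gene_symbol"])
        if ensembl_id is None:
            # Record symbols without an Ensembl ID instead of dropping them silently.
            unmapped.append(row["gene_symbol"])
            continue
        gene_sets.setdefault(row["cell_type"], set()).add(ensembl_id)
    return gene_sets, unmapped


gene_sets, unmapped = extract_gene_sets(SUPP_TABLE)
for cell_type, ids in sorted(gene_sets.items()):
    print(cell_type, sorted(ids))
print("unmapped symbols:", unmapped)
```

Tracking the unmapped symbols separately makes it easy to see how many genes are lost when a paper only reports gene symbols.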
@abspangler13 where were you doing this and how far along are you?
I performed it for the k = 9 dataset against all of the datasets from the pilot study, plus two new datasets that I added. Here's the code for the two new datasets, along with some comments about two sets we were interested in adding.
@kmaynard12 is going to work on this. @lahuuki, I wrote https://github.com/LieberInstitute/spatialDLPFC/tree/main/code/analysis/10_clinical_gene_set_enrichment in such a way that you would need to make 2 new scripts. One for extracting the gene IDs (ENSEMBL IDs) from the different tables @kmaynard12 will select, then another one for computing the odds ratio + making the heatmaps.
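For anyone following along, the odds ratio step in the second script can be pictured as a 2x2 overlap test between a spatial domain's marker genes and a clinical gene set. The Python sketch below is only an illustration under stated assumptions, not the code in the linked directory: the `enrichment` function, the toy gene names, and the choice of a one-sided hypergeometric (Fisher-style) p-value are all hypothetical. It reports the plain sample odds ratio, which can differ from the conditional estimate that some Fisher test implementations return.

```python
from math import comb


def enrichment(domain_genes, clinical_genes, background):
    """Sample odds ratio and one-sided enrichment p-value for the overlap of two gene sets."""
    background = set(background)
    domain = set(domain_genes) & background
    clinical = set(clinical_genes) & background

    # 2x2 contingency table over the background genes.
    a = len(domain & clinical)          # in both sets
    b = len(domain - clinical)          # domain only
    c = len(clinical - domain)          # clinical only
    d = len(background) - a - b - c     # in neither set

    if b * c > 0:
        odds_ratio = (a * d) / (b * c)
    else:
        # Degenerate table: infinite if there is any overlap signal, undefined otherwise.
        odds_ratio = float("inf") if a * d > 0 else float("nan")

    # One-sided p-value: probability of an overlap of at least `a` under the
    # hypergeometric null (random draw of len(domain) genes from the background).
    n_total, n_domain, n_clinical = len(background), a + b, a + c
    upper = min(n_domain, n_clinical)
    numerator = sum(
        comb(n_clinical, k) * comb(n_total - n_clinical, n_domain - k)
        for k in range(a, upper + 1)
    )
    p_value = numerator / comb(n_total, n_domain)
    return odds_ratio, p_value


# Toy example with made-up gene names.
background = {f"g{i}" for i in range(20)}
domain_markers = {"g0", "g1", "g2", "g3", "g4"}
clinical_set = {"g0", "g1", "g2", "g3", "g10", "g11"}
odds_ratio, p_value = enrichment(domain_markers, clinical_set, background)
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.4f}")
```

Running this once per (spatial domain, gene set) pair gives the matrix of odds ratios and p-values that a heatmap would then display.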
@kmaynard12 are there other case/control snRNA-seq datasets beyond the PEC ones we should be looking at?
Related to https://jhu-genomics.slack.com/archives/C01EA7VDJNT/p1673285746357859
@lahuuki I think that we can close this issue, right?