martinjzhang/scDRS

Interpreting scDRS score when comparing cell-type subclusters

HyeonbinJoHCho opened this issue · 1 comments

Hi,

Thank you for providing such a fantastic tool! It’s exactly what I needed for integrating GWAS summary data with scRNA-seq data.

While using this tool, I ran into a question about how to interpret the results.

I have scRNA-seq data of normal cells from healthy controls and disease cells from patients.
My focus is on epithelial cell clusters, so I have filtered them from the dataset.

To increase power and reduce bias from tissue origin, I used cells from both normal and disease tissues and added "tissue" as a covariate.
I found that only one epithelial cluster reached statistical significance in the scDRS analysis.

But contrary to our expectations, the enriched cluster turned out to be composed of normal cells rather than disease cells.

I understand that the scDRS score is very useful for prioritizing disease-related cell types among various types such as T cells, Myeloid cells, Neurons, and Hepatocytes, due to their distinct expression patterns.
However, I am uncertain about the interpretation when comparing clusters within a specific cell type because they have similar gene expression and only a subset of genes will be differentially expressed. Moreover, it is possible that the expression of disease-related genes identified by GWAS could be down-regulated in the disease state.

I suspect that variants found in GWAS may disrupt gene expression, so that the top-ranked genes from MAGMA show lower expression in disease cells than in normal cells.

Could you help me understand how to interpret the results in this context?
Any insights or recommendations would be greatly appreciated.

Thanks a lot for your help!

Best,
Hyeonbin Jo

Hi,

> But contrary to our previous expectations, the cluster that was enriched turned out to be composed of normal cells rather than disease cells.
> I understand that the scDRS score is very useful for prioritizing disease-related cell types among various types such as T cells, Myeloid cells, Neurons, and Hepatocytes, due to their distinct expression patterns.
> However, I am uncertain about the interpretation when comparing clusters within a specific cell type because they have similar gene expression and only a subset of genes will be differentially expressed. Moreover, it is possible that the expression of disease-related genes identified by GWAS could be down-regulated in the disease state.
> I think variants found in GWAS will disturb gene expression and high-ranked genes from MAGMA will show lower expression in disease than expression in normal.

scDRS can detect subclusters of disease-associated cells within a given cell type, so your analysis is appropriate.

Here's my interpretation of the results: The disease gene set includes genes that are crucial for the relevant pathways and gene functions of the disease. Maintaining normal expression levels of these genes is important for keeping an individual healthy. This is why the disease genes are specifically and highly expressed in a healthy epithelial cell population. The epithelial cells from the disease sample may be dysfunctional due to low expression of these genes, which is why scDRS does not detect them.
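One way to check this interpretation on your own data is to compare the average expression of the disease gene set between normal and disease cells within the enriched cluster. The following is only a minimal sketch under stated assumptions: the function name `gene_set_mean_by_condition`, the toy gene names, and the dictionary-based inputs are all hypothetical, and it uses a plain unweighted mean rather than the weighted, control-matched score that scDRS itself computes.

```python
from statistics import mean


def gene_set_mean_by_condition(expr, conditions, gene_set):
    """Average gene-set expression per condition (hypothetical helper).

    expr:       dict of cell_id -> dict of gene -> normalized expression
    conditions: dict of cell_id -> condition label (e.g. "normal", "disease")
    gene_set:   iterable of gene names (e.g. top MAGMA genes)
    """
    genes = list(gene_set)
    per_condition = {}
    for cell_id, profile in expr.items():
        # Per-cell score: unweighted mean over the gene set; missing genes count as 0.
        cell_score = mean(profile.get(g, 0.0) for g in genes)
        per_condition.setdefault(conditions[cell_id], []).append(cell_score)
    # Average the per-cell scores within each condition.
    return {cond: mean(scores) for cond, scores in per_condition.items()}


# Toy data for illustration only; not real measurements.
expr = {
    "cell1": {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.0},
    "cell2": {"GENE_A": 3.0, "GENE_B": 2.0, "GENE_C": 0.5},
    "cell3": {"GENE_A": 0.5, "GENE_B": 0.0, "GENE_C": 1.0},
    "cell4": {"GENE_A": 0.0, "GENE_B": 0.5, "GENE_C": 2.0},
}
conditions = {
    "cell1": "normal",
    "cell2": "normal",
    "cell3": "disease",
    "cell4": "disease",
}
gene_set = ["GENE_A", "GENE_B"]

result = gene_set_mean_by_condition(expr, conditions, gene_set)
print(result)
```

If the gene-set average is clearly lower in the disease cells, that would be consistent with the interpretation above, although it does not by itself establish that the cells are dysfunctional.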

In our analyses of case-control datasets, scDRS may identify cells from either healthy or disease samples as relevant. Here are a few examples:

  • scDRS detects T cells in disease samples as relevant to IBD because the activated T cells, which are disease-associated, are more prevalent in the disease samples. This is an example where the disease sample contains unique cell states detected as related to the disease.
  • scDRS identifies stem cells from young samples (but not from old samples) as associated with a disease (I forgot which disease). In this case, stem cells from the young samples are functional, with the disease genes highly expressed. However, the cells from the old samples may not be functional. Although disease-relevant cells are present in both young and old samples, the ones from the old samples are abnormal, so scDRS only detects the stem cells from the young samples as associated.
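
To see which of these patterns applies in a given dataset, it can help to tabulate the fraction of significantly associated cells per cluster and sample condition. The sketch below assumes each cell has already been assigned a cluster label, a condition label, and an FDR-adjusted p-value from the per-cell scDRS results; the field names `cluster`, `condition`, and `fdr`, the helper `significant_fraction`, and the 0.1 threshold are illustrative assumptions, not part of the scDRS API.

```python
from collections import defaultdict


def significant_fraction(cells, fdr_threshold=0.1):
    """Fraction of cells below the FDR threshold per (cluster, condition).

    cells: iterable of dicts with hypothetical keys "cluster", "condition", "fdr".
    """
    counts = defaultdict(lambda: [0, 0])  # [n_significant, n_total]
    for cell in cells:
        key = (cell["cluster"], cell["condition"])
        counts[key][1] += 1
        if cell["fdr"] < fdr_threshold:
            counts[key][0] += 1
    return {key: n_sig / n_total for key, (n_sig, n_total) in counts.items()}


# Toy per-cell results for illustration only.
cells = [
    {"cluster": "epi_1", "condition": "normal", "fdr": 0.01},
    {"cluster": "epi_1", "condition": "normal", "fdr": 0.05},
    {"cluster": "epi_1", "condition": "disease", "fdr": 0.40},
    {"cluster": "epi_1", "condition": "disease", "fdr": 0.08},
    {"cluster": "epi_2", "condition": "normal", "fdr": 0.60},
    {"cluster": "epi_2", "condition": "disease", "fdr": 0.90},
]

fractions = significant_fraction(cells)
for (cluster, condition), frac in sorted(fractions.items()):
    print(f"{cluster}\t{condition}\t{frac:.2f}")
```

A cluster where significant cells come mostly from disease samples resembles the IBD T cell example, while one where they come mostly from healthy (or young) samples resembles the stem cell example.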