cantinilab/scconfluence

Question about celltype annotation transfer post integration

Closed this issue · 2 comments

Hi,

Thank you for developing this novel method.

Our experimental setup is as follows:

  • scRNA (BD Rhapsody) with celltype annotation and scATAC (10x) for mice with no treatment
  • scATAC (10x) for mice with treatment

The samples were sequenced in multiple batches, and we expect differences in the cell type populations observed between the treated and untreated mice. This is why we chose your method to integrate the data sets.

Besides integrating the data, we also wish to transfer the known cell type annotations from the scRNA data set. I am unsure which of the following would be the better approach:

  1. Cluster the latent embeddings and use only the scRNA expression to re-annotate those clusters, or
  2. Take a majority vote over the existing cell type annotations found in each latent embedding cluster.

What would you suggest?

Thanks and regards

Hi,
Thank you for your interest in our method.

In the first option, by "use only the scRNA expression to re-annotate those clusters", do you mean annotating clusters manually based on the expression of known scRNA-seq markers, without using the available cell type annotations?

If that's the case, then I would probably suggest the second option, unless you believe the known cell type annotations are not precise enough and you could obtain finer results with your prior knowledge of marker genes.
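
For illustration, the majority-vote option could look roughly like the sketch below. It assumes you already have one cluster ID per cell from clustering the shared latent embeddings, and one label per cell that is `None` for the unannotated scATAC cells. The function name `majority_vote` and the toy inputs are hypothetical and not part of scConfluence.

```python
from collections import Counter, defaultdict


def majority_vote(cluster_ids, labels):
    """Map each cluster to its most common known label.

    cluster_ids: one cluster ID per cell (scRNA and scATAC cells together).
    labels: one label per cell, None for cells without an annotation.
    Returns {cluster: (winning_label, fraction_of_annotated_cells_agreeing)}.
    """
    votes = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, labels):
        if label is not None:  # only annotated (scRNA) cells get a vote
            votes[cluster][label] += 1

    annotation = {}
    for cluster, counts in votes.items():
        label, n = counts.most_common(1)[0]
        annotation[cluster] = (label, n / sum(counts.values()))
    return annotation


# Toy example: two clusters, None marks unannotated scATAC cells.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
labels = ["T cell", "T cell", "B cell", None, "B cell", "B cell", None, None]
cluster_annotation = majority_vote(cluster_ids, labels)
```

A cluster that contains no annotated cells does not appear in the result, and a low agreement fraction suggests the cluster mixes several annotated cell types at the chosen resolution.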

However, I think there is also a third option: transfer annotations from the scRNA cells to the scATAC cells using kNN classification (k=10 should work fine) on the shared latent embeddings. I would highly recommend this option because it transfers annotations to each cell independently. Clustering, by contrast, can give different results for different values of the resolution parameter, which makes annotating the inferred clusters a bit risky. Based on the results of this transfer, you could then get a better idea of which resolution to use for your clustering, in case you want to dive a bit deeper into the heterogeneity of your dataset.
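
A minimal sketch of this kNN transfer, assuming the latent embeddings of the scRNA and scATAC cells are already available as NumPy arrays (how you extract them from the trained model is not shown here). It uses scikit-learn's `KNeighborsClassifier`; the helper name `transfer_labels` and the toy data are hypothetical.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def transfer_labels(rna_latent, rna_labels, atac_latent, k=10):
    """Predict a label for each scATAC cell from its k nearest scRNA cells.

    Returns the predicted labels and, per cell, the fraction of the k
    neighbours that agree with the predicted label.
    """
    clf = KNeighborsClassifier(n_neighbors=k)
    clf.fit(rna_latent, rna_labels)
    predicted = clf.predict(atac_latent)
    agreement = clf.predict_proba(atac_latent).max(axis=1)
    return predicted, agreement


# Toy data standing in for real embeddings: two well-separated cell types.
rng = np.random.default_rng(0)
centers = {"T cell": np.array([0.0, 0.0]), "B cell": np.array([8.0, 8.0])}
rna_latent = np.vstack(
    [c + rng.normal(scale=0.5, size=(50, 2)) for c in centers.values()]
)
rna_labels = np.repeat(list(centers.keys()), 50)
atac_latent = np.array([[0.2, -0.1], [7.9, 8.3]])

atac_pred, atac_agreement = transfer_labels(rna_latent, rna_labels, atac_latent, k=10)
```

The agreement score is only a rough indicator. Since kNN always assigns one of the reference labels, cells with low agreement may be worth a closer look, for example if a population in the treated samples has no counterpart in the scRNA reference.
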
Let me know if this is clear.
Best,

Thanks for your response. You are right about my intent in the first approach: using known marker genes to re-annotate the clusters. The existing annotated BD Rhapsody scRNA data I mentioned is in fact our own data, and we annotated it using known marker genes after evaluating numerous resolutions and subclustering. The third option you suggested is indeed cleaner. I will try that.