New Data Integration
cadyyuheng opened this issue · 3 comments
Dear Saturn team,
Say we have some in-house mouse datasets that we'd like to integrate with your mammalian cell atlas in the same embedding. Without retraining on our datasets, is there a quick way to find the macrogene values for each of our cells? How can we leverage the genes_to_macrogenes.pkl file of the mammalian cell atlas together with the count matrix of our own data?
Thanks
You could use the centroids to take a weighted average of expression.
However, I would recommend retraining.
Could you please elaborate on how exactly we can "use the centroids to take a weighted average of expression", in particular the weighted average part? It seems in the manuscript that the macrogene expression values
Thanks!
Yes. Since you are not using these as inputs to a neural network, you can just ignore the ReLU and LayerNorm parts.
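For anyone reading later, here is a minimal sketch of what that weighted sum could look like, skipping ReLU and LayerNorm as suggested above. It is not taken from the SATURN codebase. It assumes the pickle loads as a dict mapping a gene key to a 1-D weight vector over macrogenes, and that keys look like `mouse_<gene>`; check both against your own file. The function name `macrogene_values` and the `species_prefix` and `normalize` parameters are hypothetical.

```python
import numpy as np

def macrogene_values(counts, gene_names, gene_to_weights,
                     species_prefix="mouse_", normalize=False):
    """Project a cells x genes matrix onto macrogenes.

    counts          : (n_cells, n_genes) array of expression values
    gene_names      : list of n_genes gene symbols, in column order
    gene_to_weights : dict, key -> 1-D weights over macrogenes
                      (assumed layout of genes_to_macrogenes.pkl)
    species_prefix  : assumed key prefix, e.g. "mouse_" + gene symbol
    normalize       : if True, divide by the total weight per macrogene
                      (weighted average instead of weighted sum)
    """
    counts = np.asarray(counts, dtype=float)
    keys = [species_prefix + g for g in gene_names]

    # Keep only genes that have a learned weight vector.
    keep = [i for i, k in enumerate(keys) if k in gene_to_weights]
    if not keep:
        raise ValueError("no genes matched the weight dictionary")

    # W has shape (n_kept_genes, n_macrogenes).
    W = np.stack([np.asarray(gene_to_weights[keys[i]], dtype=float)
                  for i in keep])
    scores = counts[:, keep] @ W  # (n_cells, n_macrogenes)

    if normalize:
        totals = W.sum(axis=0)
        # Avoid dividing by zero for macrogenes with no matched weight.
        scores = scores / np.where(totals > 0, totals, 1.0)
    return scores

# Toy example: gene "C" has no weights and is dropped.
toy_weights = {"mouse_A": [1.0, 0.0], "mouse_B": [1.0, 2.0]}
toy_counts = [[1, 2, 5], [0, 3, 7]]
toy_scores = macrogene_values(toy_counts, ["A", "B", "C"], toy_weights)
```

With a real file, `gene_to_weights` would come from `pickle.load` on the genes_to_macrogenes.pkl file. Genes missing from the dictionary are silently dropped here, so it is worth checking how many of your genes actually match. Values computed this way are not guaranteed to be comparable to the atlas embedding, which is why retraining is still the recommended route.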