CountsToCPM after extracting markers
randel opened this issue · 2 comments
Thanks for the new approach! In ReferenceBasedDecomposition, if markers are provided, only the sc and bulk data for those markers are passed to CountsToCPM. Does it make sense to calculate CPM for (say, hundreds of) marker genes only? Or does it make more sense to calculate CPM for all genes and then take the subset of markers?
Lines 298 to 315 in ef5bae0
Hi @randel, thanks for your interest in our method and the great question!
I can see how this is an issue for cases like the extreme example of only having one marker gene (every sample would have values of 0 or 1,000,000, including the reference). I am assuming that this issue becomes less significant as more marker genes are used; however, I will look into switching the order of operations here to keep this issue from popping up. Thanks for pointing this out! I'll close this issue after I've finished testing.
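To make the single-marker case concrete, here is a minimal sketch in plain Python comparing the two orderings on made-up counts. The helper `counts_to_cpm` is a hypothetical stand-in written for this illustration, not the package's CountsToCPM, and the gene names and numbers are invented.

```python
def counts_to_cpm(counts):
    """Scale each sample (column) so that its values sum to 1,000,000.

    `counts` maps gene name -> list of per-sample counts.
    Hypothetical helper for illustration; assumes every sample total is nonzero.
    """
    n_samples = len(next(iter(counts.values())))
    totals = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [1e6 * row[j] / totals[j] for j in range(n_samples)]
        for gene, row in counts.items()
    }


# Toy data (invented): three genes measured in two samples.
counts = {
    "geneA": [10, 20],
    "geneB": [90, 80],
    "geneC": [900, 100],
}
markers = ["geneA"]  # extreme case: a single marker gene

# Ordering 1: subset to markers, then convert to CPM.
subset_first = counts_to_cpm({g: counts[g] for g in markers})

# Ordering 2: convert all genes to CPM, then subset to markers.
cpm_all = counts_to_cpm(counts)
cpm_first = {g: cpm_all[g] for g in markers}

print(subset_first["geneA"])  # [1000000.0, 1000000.0]
print(cpm_first["geneA"])     # [10000.0, 100000.0]
```

With subsetting first, the lone marker saturates at 1,000,000 in every sample, so the difference between samples is lost. With CPM computed over all genes first, the marker keeps its sample-specific scale.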
Added an option old.cpm to change the ordering. Kept as an option for replicating older results. Thanks!