immunogenomics/cna

Minimum number of samples

torgheh opened this issue · 1 comment

Hi,

Thanks for your amazing work and the easy-to-use interface you have provided for your method. I wanted to ask your opinion on the number of samples per condition needed to get meaningful results from CNA, if there is such a minimum. It seems that with 3 samples per condition I never get significant (< 0.05) FDRs.

rumker commented

Thank you so much for your kind words! I'm glad to hear you've found CNA easy and useful. There is no specific sample count we can recommend for a given dataset, because the opportunity to detect an association also depends on the strength of the effect in your data. We do generally recommend N>10. The smallest dataset we've tried was N=12, which contained a potent and detectable case-control contrast, and in general CNA's statistical power increases with sample size. (Both of these analyses are in the CNA paper if you're interested in learning more!) In your shoes, I would look at the full table of local association testing results stored in the result object returned by cna.tl.association. If there are populations that pass an FDR <7% threshold, for example, that bodes better for investing in a larger sample size than if there are no local associations with FDR <25% at your current sample size.
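To make that check concrete, here is a minimal sketch of how one might scan such a table for the lowest FDR reached and for rows passing a looser cutoff. It does not call CNA itself: the `FdrRow` layout (a correlation threshold, an estimated FDR, and a count of detected neighborhoods per row), the helper names, and the numbers in `example` are all hypothetical and made up for illustration, so adapt the field names to whatever the real result table contains.

```python
from typing import NamedTuple, Optional, Sequence


class FdrRow(NamedTuple):
    """One row of a local-association FDR table (hypothetical layout)."""
    threshold: float    # assumed: cutoff on the local association statistic
    fdr: float          # assumed: estimated FDR at that cutoff
    num_detected: int   # assumed: neighborhoods passing that cutoff


def min_fdr(rows: Sequence[FdrRow]) -> float:
    """Lowest FDR reached anywhere in the table."""
    return min(r.fdr for r in rows)


def best_passing_row(rows: Sequence[FdrRow], max_fdr: float) -> Optional[FdrRow]:
    """Among rows with FDR below max_fdr, return the one detecting the most
    neighborhoods, or None if no row passes."""
    passing = [r for r in rows if r.fdr < max_fdr and r.num_detected > 0]
    if not passing:
        return None
    return max(passing, key=lambda r: r.num_detected)


# Made-up values for illustration only; not real CNA output.
example = [
    FdrRow(threshold=0.2, fdr=0.60, num_detected=900),
    FdrRow(threshold=0.4, fdr=0.30, num_detected=250),
    FdrRow(threshold=0.6, fdr=0.12, num_detected=40),
]

strict = best_passing_row(example, max_fdr=0.05)  # nothing passes 5%
loose = best_passing_row(example, max_fdr=0.25)   # one row passes 25%
```

In this made-up table nothing passes a 5% cutoff, but one row passes 25%, which is the kind of weak signal that might justify collecting more samples.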