rgcgithub/regenie

Feature request: E-value statistics

jerome-f opened this issue · 1 comment

@joellembatchou I wanted to put this on the Regenie admins' radar.
tiny_review_e_value_and_e_process.pdf
The attached PDF describes E-value statistics and their useful properties across different testing scenarios. I think they would be particularly useful for the gene burden test (specifically GENE_P), which uses the Cauchy combination to combine P-values across multiple tests and arrive at a final P-value per gene. This is a good method, but it makes some assumptions about the dependence structure among the underlying tests. E-values, on the other hand, can simply be averaged across the tests and interpreted directly. E-values are also related to likelihood ratios, so interpretation based on Table 1 of the PDF is easier, and correcting for multiple testing and FDR is easier as well. Although Bonferroni corrections are robust, they are also stringent, and with GENE_P, where controlling the FWER means correcting for no. of genes × no. of allele frequency bins × no. of masks × no. of tests, this could lead to missed discoveries. Reporting the E-value in addition to the P-value would be beneficial overall.
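
To make the comparison concrete, here is a minimal Python sketch of the two combination rules. It is illustrative only and is not how Regenie computes GENE_P. All function names are hypothetical, equal weights are assumed for the Cauchy combination, and the p-to-e calibrator `kappa * p**(kappa - 1)` with `kappa = 0.5` is just one common choice from the e-value literature, assumed here so that existing P-values can be turned into E-values.

```python
import math


def cauchy_combination(pvalues, weights=None):
    """Combine p-values with the Cauchy combination rule.

    Equal weights are assumed when none are given (an assumption of this sketch).
    """
    n = len(pvalues)
    if weights is None:
        weights = [1.0 / n] * n
    # Map each p-value to a standard Cauchy quantile and take the weighted sum.
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues))
    # Convert the statistic back to a p-value with the Cauchy tail probability.
    return 0.5 - math.atan(t) / math.pi


def p_to_e(p, kappa=0.5):
    """Calibrate a p-value into an e-value: e = kappa * p**(kappa - 1), 0 < kappa < 1."""
    return kappa * p ** (kappa - 1.0)


def average_e(evalues):
    """The arithmetic mean of e-values is again an e-value, whatever the dependence."""
    return sum(evalues) / len(evalues)


def reject(e, alpha):
    """By Markov's inequality, rejecting when e >= 1/alpha keeps the error rate at most alpha."""
    return e >= 1.0 / alpha


# Hypothetical per-mask p-values for a single gene (made-up numbers).
pvals = [0.001, 0.02, 0.4]
gene_p = cauchy_combination(pvals)
gene_e = average_e([p_to_e(p) for p in pvals])
print(gene_p, gene_e, reject(gene_e, alpha=0.05))
```

Calibrated E-values are conservative by construction, so this route trades some power for validity under arbitrary dependence between the tests.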

Best
Jerome

Thank you, Jerome, for the reference. Will look into it as a potential enhancement.

Cheers,
Joelle