chunking error - testing same genes twice
jackhump opened this issue · 1 comment
jackhump commented
My permutation results contain each gene twice. For example, this gene shows up in two different chunk files:
$ ls *permutations.txt | xargs grep "^ENSG00000015592.16"
LumbarSpinalCord_expression_peer30_chunk152.permutations.txt:ENSG00000015592.16 chr8 27258420 27258420 - 6436 12913 . chr8 27245507 27245507 195 173.024 1.05009 656.504 2.17524e-47 -1.06769 9.999e-05 2.29836e-41
LumbarSpinalCord_expression_peer30_chunk609.permutations.txt:ENSG00000015592.16 chr8 27258420 27258420 - 6436 12913 . chr8 27245507 27245507 195 175.538 1.0437 715.465 2.17524e-47 -1.06769 9.999e-05 1.08018e-41
What's going on? I'm systematically over-reporting the number of significant eGenes found by the pipeline by a factor of 2.
Hopefully switching over to tensorQTL will get rid of the need for chunking and all the headaches that come with it.
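To see how widespread the duplication is across all chunks, something like the following could be used. It is only a minimal sketch: it assumes the permutation files are whitespace-delimited with the gene ID in the first column (as in the grep output above), and the function name `find_duplicate_genes` is made up for illustration and is not part of the pipeline.

```python
from collections import defaultdict
from pathlib import Path


def find_duplicate_genes(paths):
    """Return {gene_id: [file names]} for gene IDs that occur in more than one row.

    Assumption: each row is whitespace-delimited and starts with the gene ID.
    """
    seen = defaultdict(list)
    for path in paths:
        with open(path) as handle:
            for line in handle:
                fields = line.split()
                if fields:  # skip blank lines
                    seen[fields[0]].append(Path(path).name)
    # Keep only the gene IDs that were seen more than once.
    return {gene: files for gene, files in seen.items() if len(files) > 1}


if __name__ == "__main__":
    # Same glob as the `ls *permutations.txt` command above.
    chunk_files = sorted(Path(".").glob("*permutations.txt"))
    duplicates = find_duplicate_genes(chunk_files)
    print(f"{len(duplicates)} gene IDs appear more than once")
    for gene, files in sorted(duplicates.items()):
        print(gene, *files)
```

Run from the directory holding the chunk outputs, this lists every repeated gene ID together with the chunk files it came from, which should show whether the overlap affects all chunks or only some of them.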
jackhump commented
Fixed by moving to tensorQTL.