rgcgithub/regenie

Fraction of genes have no gene_p results

Mathias0077 opened this issue · 3 comments

Hi Joelle,

I hope you're well. I'd be grateful if you could please help me with a problem in gene-based burden testing using the --rgc-gene-p option.

The console output reports no problem during the run, but the results file contains gene_p statistics for only a fraction (~60%) of all genes. I am pasting example results for (i) gene SPTLC1, which has gene_p results, and (ii) gene SUSD3, which does not.
Since nothing is reported about insufficient data, low rare-allele counts, or similar, I wonder what happened, and whether there might be a way to get information about the problem so that we can solve it and obtain the gene_p test for all genes. By the way, we cannot observe a pattern: it is not the case that gene_p is missing only for genes with a low number of rare variants. It is also missing for some genes that have more rare variants than other genes for which gene_p is in the results. We run the analyses for all autosomal genes with a balanced (1:1) case/control outcome.

Any help is highly appreciated.

Thank you, best Mathias

[image: grafik]

Hi Joelle,

Thank you for looking into this. The summary stats are those posted in my initial message, and I have added the logfile to this thread. There are 94 and 35 variants in the genes, and the expected information is provided for both genes:

 -reading in genotypes, computing gene-based tests and building masks...done (xxx ms) 
 -computing association tests...done (xx ms) 
 -computing joint association tests...done (x ms) 

The logfiles do not report an error for this. I hope this information is helpful.

Best,
Mathias

tocheck.log

Hi,

Can you re-run just for gene "SUSD3" including the option --debug? Please send both the output from stdout & stderr.
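
To narrow down which genes are affected before re-running, a small script can list every gene that appears in the results file but has no gene_p row. The following is only a minimal sketch under assumptions that are not confirmed in this thread: the results file is whitespace-delimited with a header containing `ID` and `TEST` columns, the gene name is the part of `ID` before the first `.`, and gene_p rows carry the string `GENE_P` in the `TEST` column. The function name `genes_missing_gene_p` and the toy rows are hypothetical; adjust the column names and marker to match your actual output.

```python
import io


def genes_missing_gene_p(handle, id_col="ID", test_col="TEST", marker="GENE_P"):
    """Return sorted gene names that have result rows but none containing `marker`.

    Assumptions (hypothetical, check against your own file): whitespace-delimited
    rows, a header line naming `id_col` and `test_col`, and the gene name being
    the part of the ID before the first '.'.
    """
    header = None
    id_idx = test_idx = -1
    seen, with_marker = set(), set()
    for line in handle:
        # Skip blank lines and '##' comment lines.
        if not line.strip() or line.startswith("##"):
            continue
        fields = line.split()
        if header is None:
            # First non-comment line is treated as the header.
            header = fields
            id_idx = header.index(id_col)
            test_idx = header.index(test_col)
            continue
        if len(fields) <= max(id_idx, test_idx):
            continue  # malformed or truncated row
        gene = fields[id_idx].split(".", 1)[0]
        seen.add(gene)
        if marker in fields[test_idx]:
            with_marker.add(gene)
    return sorted(seen - with_marker)


# Made-up toy rows for illustration only; these are not real regenie output.
toy = io.StringIO(
    "ID TEST LOG10P\n"
    "SPTLC1.M1.0.01 ADD 1.2\n"
    "SPTLC1 ADD-GENE_P 1.5\n"
    "SUSD3.M1.0.01 ADD 0.4\n"
)
print(genes_missing_gene_p(toy))
```

With a real file, pass an open file handle instead of the toy `io.StringIO` object. The resulting gene list can then be compared against variant counts or mask definitions to look for a pattern.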