natsuhiko/rasqual

Expected distribution of Pi


Hello!

In the supplement of Kumasaka et al. 2016, you have figures depicting the expected and estimated distributions of RASQUAL parameters from simulations (Supplementary Figures 8-10).

I am really curious about the expected distribution of Pi (the "genetic effect"). If I understand Pi correctly, and assuming negligible mapping error and reference bias, I would have expected the distribution of Pi to be symmetric around 0.5. In other words, I would expect the alternative allele at a common variant to be equally likely to increase (Pi>0.5) or decrease (Pi<0.5) expression/accessibility. But in the figures, it looks like there are more features where Pi<0.5 than Pi>0.5.

Do you think this is due to some underlying biology (for example, alternative alleles tend to be minor alleles, and alleles that are less common in the population tend to be associated with less expression/accessibility than their higher-frequency counterparts)? Or is the excess of Pi<0.5 (relative to Pi>0.5) driven by reference or mapping bias? Or is there another explanation I'm missing?

Thanks for your thoughts!
Cassie


Hi Cassie,

Unfortunately, it is likely due to reference bias. Pi<0.5 means more reads mapped to the reference allele than to the alternative allele.

Although RASQUAL tries to adjust for reference mapping bias, the adjustment is not always perfect.
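
To illustrate the mechanism, here is a minimal simulation sketch. It is not RASQUAL's model or its bias adjustment; the function names, the `alt_map_prob` parameter, the uniform range of true allelic ratios, and the read depth are all hypothetical choices made for illustration. It draws true allelic ratios symmetrically around 0.5, then drops a fraction of alternative-allele reads to mimic reference mapping bias, and compares how many features end up with a naive alternative-allele fraction below 0.5.

```python
import random


def simulate_alt_fractions(n_features=2000, depth=200, alt_map_prob=1.0, seed=0):
    """Return naive alt-allele read fractions for simulated heterozygous features.

    alt_map_prob is a hypothetical knob: the probability that a read carrying
    the alternative allele maps successfully (1.0 means no reference bias).
    """
    rng = random.Random(seed)
    fractions = []
    for _ in range(n_features):
        # Assumption: the true allelic ratio is symmetric around 0.5.
        true_ratio = rng.uniform(0.3, 0.7)
        ref_reads = 0
        alt_reads = 0
        for _ in range(depth):
            is_alt = rng.random() < true_ratio
            maps_ok = rng.random() < alt_map_prob
            if is_alt:
                # Alt reads are lost when they fail to map.
                if maps_ok:
                    alt_reads += 1
            else:
                # Reference reads always map in this toy model.
                ref_reads += 1
        total = ref_reads + alt_reads
        if total > 0:
            fractions.append(alt_reads / total)
    return fractions


def share_below_half(fractions):
    """Fraction of features whose naive alt-allele fraction is below 0.5."""
    return sum(f < 0.5 for f in fractions) / len(fractions)


unbiased = simulate_alt_fractions(alt_map_prob=1.0)
biased = simulate_alt_fractions(alt_map_prob=0.9)

print("share below 0.5, no bias:       ", round(share_below_half(unbiased), 3))
print("share below 0.5, 10% alt loss:  ", round(share_below_half(biased), 3))
```

Because both runs use the same seed and consume random numbers in the same order, the biased run sees exactly the same reads minus the alternative-allele reads that fail to map, so any shift toward values below 0.5 comes from the simulated bias alone.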

Best regards,
Natsuhiko

Ok, I see. Thanks very much for the reply! We will keep this in mind when interpreting output.