natsuhiko/rasqual

Direction of fragment count and allelic imbalance

kwcurrin opened this issue · 6 comments

Hello,

Does RASQUAL expect the haplotype with the higher between-sample fragment count to also be the haplotype with more reads in the allelic imbalance test?

Thanks!

Kevin

Hi Kevin,

Yes, we expect the high-expression haplotype to carry more reads at each feature SNP.

Best regards,
Natsuhiko

Thanks!

I wonder if TF footprints can violate the assumption that the fSNP allelic imbalance shows the same direction as the rSNP fragment count. In the original DNase QTL paper from Degner et al.:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3501342/

they show how the genotype with increased fragment count can actually have a dip in accessibility right at the SNP because of footprinting:
"One intuitive mechanism for dsQTLs is that these may be caused by variants that strengthen or weaken individual transcription factor binding sites, thereby changing transcription factor affinity and local nucleosome occupancy (20-22) and hence DNaseI cut rates. Consistent with this model, an aggregated plot of DNaseI sensitivity at dsQTLs shows a distinct drop in chromatin accessibility around putatively causal SNPs that is reminiscent of transcription factor binding footprints, especially in the genotypes associated with high sensitivity (15-17)."

Does RASQUAL penalize an association if the rSNP and fSNP show different directions?

This seems like a unique feature of chromatin accessibility data. For gene expression and ChIP-seq data, it would make sense for the direction of between-sample fragment count and fSNP allelic imbalance to be the same.

Hi Kevin,

I think you are mixing up the DNaseI "cut site" with the actual sequenced read. Although DNaseI cut frequency is depleted at the footprint, the footprint is usually very short. When you use 50-75 bp sequenced reads, you actually sequence through the footprint and the dsQTL SNP location. Therefore you still see the allelic imbalance at the dsQTL location. This is why we can use RASQUAL to map chromatin accessibility QTLs using ATAC-seq in the paper.
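To make the cut-site versus read-coverage distinction concrete, here is a minimal toy sketch in Python. It is not part of RASQUAL; the function name, the region size, the footprint coordinates, and the simplification that every read extends 50 bp downstream of its cut site on one strand are all assumptions made for illustration.

```python
def cut_and_coverage(region_len, footprint, read_len, snp):
    """Toy model: one read starts at every position outside the footprint.

    footprint is a half-open (start, end) interval with no cuts (assumption).
    Returns (number of cuts exactly at the SNP, number of reads covering the SNP).
    """
    fp_start, fp_end = footprint
    # Cut sites (read 5' ends) are allowed everywhere except inside the footprint.
    starts = [p for p in range(region_len) if not (fp_start <= p < fp_end)]
    cuts_at_snp = sum(1 for s in starts if s == snp)
    # A read starting at s covers positions s .. s + read_len - 1.
    coverage_at_snp = sum(1 for s in starts if s <= snp < s + read_len)
    return cuts_at_snp, coverage_at_snp


# Hypothetical numbers: 200 bp region, 11 bp footprint centred on a SNP at 100.
cuts, coverage = cut_and_coverage(region_len=200, footprint=(95, 106),
                                  read_len=50, snp=100)
print("cuts at SNP:", cuts)
print("reads covering SNP:", coverage)
```

In this toy setup no cut falls on the SNP, yet most reads that start upstream of the footprint still run across it, so allele-specific counts at the SNP remain available.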

Best regards,
Natsuhiko

Hi Natsuhiko,

That is a good point. However, I do worry about particularly strong footprints dampening allelic imbalance, or possibly reversing its direction in extreme cases. I could imagine a case where the footprint is 15 or 20 nt long and the dip in signal is large compared to the other allele. If the read length is 50 nt, this 15-20 nt footprint is a substantial fraction of the read coverage.
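A rough back-of-the-envelope sketch of this concern, in Python. Everything here is a hypothetical toy model and not how RASQUAL computes anything: the 2-fold accessibility difference, the footprint being centred on the SNP and present only on the high allele, and the single-strand 50 nt reads are all assumptions.

```python
def snp_coverage(rate, read_len, snp, footprint=None):
    """Expected single-strand read coverage at the SNP in a toy model.

    rate: expected cuts per allowed start position (toy accessibility level).
    footprint: half-open (start, end) interval with no cuts, or None.
    """
    total = 0.0
    # Only reads starting in this window can cover the SNP.
    for start in range(snp - read_len + 1, snp + 1):
        blocked = footprint is not None and footprint[0] <= start < footprint[1]
        if not blocked:
            total += rate
    return total


read_len, snp = 50, 100
low = snp_coverage(1.0, read_len, snp)  # low allele: no footprint (assumption)
for width in (15, 20):
    half = width // 2
    fp = (snp - half, snp - half + width)  # footprint centred on the SNP
    high = snp_coverage(2.0, read_len, snp, fp)  # high allele: 2x rate, footprint
    print(width, high, low, round(high / (high + low), 3))
```

Under these assumptions the high allele's share of reads at the SNP drops from about 0.67 without a footprint to roughly 0.61-0.63, so the imbalance is dampened but keeps its direction. In this model a reversal would need the footprint to block more than half of the start positions that cover the SNP.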

However, I do admit that this is probably not common. Most TFs don't even show footprints:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5530758/

I also worry about cases where read coverage may not be very uniform outside of the motif, so that the footprint is not very well offset by the increase in flanking accessibility.

Kevin

Hi Natsuhiko,

I spoke with a collaborator who extensively studies footprinting and he agrees with you on this. So I will close the issue.

Thanks!

Kevin