alek0991/iSAFE

Multiple iSAFE peaks

psytky03 opened this issue · 3 comments

Dear Ali,
May I ask how I should interpret the results when a locus contains multiple peaks? I expected to see a signal close to the iHS peak (there was only one iHS peak), but several iSAFE peaks appeared in the flanking upstream/downstream regions. Does this indicate that the iHS signal is a false positive?

Thanks
Xiaoxi

Dear Xiaoxi,
The short answer is that iSAFE has higher resolution than iHS, and multiple peaks can be due to multiple selective sweeps.

iHS is a test for detecting a region under selection (it does not pinpoint the favored mutation), and it does a great job of detecting ongoing selective sweeps. iSAFE, in contrast, is designed to tackle this issue and pinpoint the favored mutation. That said, iSAFE has also been shown to be very powerful as a test of selection (i.e., for finding the region under selection). Please also check #6 (comment), which is relevant to this topic.

Please let me know if you have any other questions.
Best,
Ali

Hi, Ali,
Thanks for the kind explanation; I understand better now. For my analysis, I first ran a genome-wide iHS scan using microarray data. Then, for the significant loci, I attempted to use iSAFE on an independent WGS dataset to further dissect the signal and pinpoint the favored variants. Would it be a good idea to limit the iSAFE input to variants in LD with the lead iHS variant (e.g., R2 >= 0.7)? I want to know which variant on the given haplotype is responsible for the selection. My other question is how I should define the range for iSAFE. Is there a recommendation, for example, 500 kb flanking the variant with the highest iHS signal?

Best,
Xiaoxi

Dear Xiaoxi,
My apologies for the late response.

To separate multiple selection events at the same locus, I don't recommend removing variants in LD with the top variant, because it can affect the pattern of haplotype homozygosity in unpredictable ways. However, there are some ways to deal with multiple selection events at the same locus, and you can take a look at Chapter 5 of:
Akbari, A. (2018). Looking into the past: Identifying genetic mutations and introgression events that shaped human adaptation

In case you don't have access, you can download it from here. Figures 5.4-5.6 go through a case study of multiple selection events at the same locus, which you might find useful for your case.

A 500 kbp flanking region is reasonable. However, if you are dealing with a high-LD region, like the LCT locus in Europeans, I recommend a larger window size, such as 1-3 Mbp.
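
As a rough illustration of the window choice above, the following Python sketch turns a lead iHS variant position into a region centered on it. The function name `isafe_region`, the example position, and the `chrom:start-end` string format are assumptions made for illustration, not part of iSAFE itself; check the iSAFE README for the region format it actually expects. The sketch also treats the size as a per-side flank, which is only one reading of "window size".

```python
def isafe_region(chrom, lead_pos, flank_bp=500_000, chrom_length=None):
    """Return (start, end, "chrom:start-end") for a window around lead_pos.

    Hypothetical helper: flank_bp is applied on each side of the lead
    variant, and coordinates are 1-based and inclusive.
    """
    if lead_pos < 1:
        raise ValueError("lead_pos must be a 1-based position")
    if flank_bp < 0:
        raise ValueError("flank_bp must be non-negative")

    # Clamp the window so it never starts before the first base.
    start = max(1, lead_pos - flank_bp)
    end = lead_pos + flank_bp
    # Clamp to the chromosome end only if the length is known.
    if chrom_length is not None:
        end = min(end, chrom_length)
    return start, end, f"{chrom}:{start}-{end}"


# Placeholder lead position on chromosome 2 (not a real variant coordinate).
lead = 136_600_000

# Default: 500 kbp on each side of the lead variant.
print(isafe_region("2", lead)[2])

# Wider window for a high-LD locus: 1.5 Mbp per side, 3 Mbp in total.
print(isafe_region("2", lead, flank_bp=1_500_000)[2])
```

Keeping the flank as a parameter makes it easy to rerun the same locus with a wider window and compare where the peaks fall.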

Please let me know if you have any questions,
Best,
Ali