Why is the site coverage of pileup different from that observed by igv？

Question

Why is the site coverage of pileup different from that observed by igv？

haogecat opened this issue a year ago · 2 comments

I used pysam to output the SNPS at locus 22218574 of the reference genome, but the output reads were fewer than what I visualized with IGV. The results of pysam were many fewer reads. Why?
Here is the code:

import pysam

bf = pysam.AlignmentFile("sled.bam", "rb" )
for pileupcolumn in bf.pileup():
    if pileupcolumn.reference_pos+1==22218574:
        for pileup in pileupcolumn.pileups:
            print ("site:%d\tbase in read %s = %s" % (pileupcolumn.pos+1,pileup.alignment.query_name, pileup.alignment.query_sequence[pileup.query_position]))
bf.close()

Below is a visualization of igv at site 22218574。

Sincerely request you, thanks!

Answer 1 · 2023-12-12T03:01:03.000Z

You don't say how many reads were indicated by either pysam or IGV.

Pysam's pileup() uses the same machinery as samtools. (So you may also wish to experiment with samtools and its similar parameters.) Usually lower depth than expected is caused by things like max_depth and min_base_quality, so you should start by adjusting those arguments.

Answer 2 · 2023-12-20T10:00:13.000Z

You don't say how many reads were indicated by either pysam or IGV.

Pysam's pileup() uses the same machinery as samtools. (So you may also wish to experiment with samtools and its similar parameters.) Usually lower depth than expected is caused by things like max_depth and min_base_quality, so you should start by adjusting those arguments.

Thank you very much for your answer, I will try