yukiteruono/pbsim3

Introducing faux heterozygosity

casparbein opened this issue · 1 comments

Hi,

I have been using pbsim3 to simulate HiFi read data and reassemble it, to get acquainted with long-read assembly. One issue I encountered is that the simulated data is relatively clean. I used the error model, which, as I understand it, introduces random errors into simulated reads based on real PacBio reads. Is there also a way to simulate a given degree of heterozygosity with pbsim3? Purely homozygous simulated reads are rather easy to assemble, so simulated reads with a certain degree of faux heterozygosity might be closer to real data.

Thanks in advance!

Thank you for using PBSIM.
PBSIM3 cannot simulate reads from a polyploid genome directly.
To simulate heterozygosity in a diploid, first introduce mutations into the reference genome sequence to generate two haploid genomes. Then generate reads from each haploid genome separately using PBSIM3.
I use an in-house tool to introduce mutations into the genome, either randomly or from a set of known variants.
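
For readers without such a tool, the first step could be sketched roughly as below. This is not the maintainer's tool, only a minimal illustration that assumes substitutions (SNPs) alone are enough; indels and structural variants are not modelled. The function names, the default rate of 0.001, and the `_hap2` suffix are all hypothetical choices made for this example.

```python
import random


def read_fasta(path):
    """Return a list of (header, sequence) tuples from a FASTA file."""
    records, header, chunks = [], None, []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def introduce_snps(seq, rate, rng):
    """Replace each A/C/G/T base by a different base with probability `rate`.

    Other symbols (e.g. N) are left untouched. Substituted bases are written
    in upper case, so soft-masking is lost at mutated positions.
    """
    bases = "ACGT"
    out = list(seq)
    for i, base in enumerate(out):
        upper = base.upper()
        if upper in bases and rng.random() < rate:
            out[i] = rng.choice([b for b in bases if b != upper])
    return "".join(out)


def write_fasta(records, path, width=60):
    """Write (header, sequence) tuples as FASTA, wrapped at `width` columns."""
    with open(path, "w") as fh:
        for header, seq in records:
            fh.write(f">{header}\n")
            for i in range(0, len(seq), width):
                fh.write(seq[i:i + width] + "\n")


def make_haplotype(ref_path, out_path, rate=0.001, seed=1):
    """Write a mutated copy of the reference to serve as the second haplotype."""
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    mutated = [
        (header.split()[0] + "_hap2", introduce_snps(seq, rate, rng))
        for header, seq in read_fasta(ref_path)
    ]
    write_fasta(mutated, out_path)
```

Calling `make_haplotype("ref.fa", "hap2.fa", rate=0.001)` would give a second haplotype that differs from the reference at roughly 0.1% of positions. The unmodified reference can then serve as the first haplotype. PBSIM3 is run once per haplotype, presumably at half the intended total depth each, and the two read sets are combined before assembly.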