Results among similar samples are very different.
Closed this issue · 3 comments
shenzhenzth commented
tianxiongbb commented
The final results include many chimeric artifacts that need further
filtering, i.e. insertions that are supported by only one read pair. What
do the results look like for germline insertions? Depending on your dataset,
you can set a threshold for germline insertions. For example, in 30x data I
would treat insertions with more than 5 supporting reads as germline
insertions.
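A minimal sketch of that kind of filter, assuming the insertion.bed file is tab-separated and that you know which column holds the supporting-read count. The function name `filter_by_support` and the `support_col` index are hypothetical; check the header of your own TEMP2 output before picking the column.

```python
def filter_by_support(lines, support_col, min_support=5):
    """Keep BED lines whose supporting-read count is greater than min_support.

    support_col is the 0-based index of the supporting-read column.
    This index is an assumption: verify it against your own output header.
    """
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        try:
            support = float(fields[support_col])
        except (IndexError, ValueError):
            continue  # skip header rows or malformed lines
        if support > min_support:
            kept.append(line)
    return kept
```

The default of 5 follows the 30x example above; the threshold should be scaled to the sequencing depth of your own data.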
…On Sat, Jan 7, 2023, 4:52 AM shenzhenzth ***@***.***> wrote:
Dear weng-lab,
When I use TEMP2,
I get the final insertion.bed file, but there are huge differences
between similar samples,
like this:
[image: 4b1179bec3227eab9b6e1cd3bcd75d8]
<https://user-images.githubusercontent.com/109035781/211144520-cf0d0fb5-b7a1-4638-8c28-87fe55554990.png>
Is this right, or normal?
Am I using it correctly?
Best wishes
shenzhenzth commented
Thanks for your reply to this question!
Thanks a lot!
After filtering as you advised, the insertion counts look normal!
But I have another question for you:
how can I combine all the samples into one file, e.g. in VCF format?
Can I describe this question again in Chinese? 🤣
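Thank you very much for your reply,
and I apologize for troubling you again.
After filtering as you suggested last time, the differences between samples are much smaller. Thank you, this really helped.
But I have a new question for you 😢:
how can I merge the BED files called for each sample into a single VCF-format file for the next step of analysis? 🥺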
tianxiongbb commented
You might need to write a script to convert the TEMP2 output to VCF (please note that TEMP2 results use 0-based coordinates like BED format, while VCF uses 1-based coordinates). That said, I am planning to add a script for this in the next version of TEMP2.
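Until such a script exists, a rough sketch of the conversion could look like the following. It is not part of TEMP2, and everything in it is an assumption for illustration: the function name `bed_to_vcf`, the `te_col` index for the transposon name, the `TE` INFO tag, merging only insertions with identical chromosome, start and transposon name, and writing `1` for "called in this sample" and `.` for "not called" rather than a real genotype. The only point taken from the comment above is the coordinate shift from 0-based BED start to 1-based VCF POS.

```python
def bed_to_vcf(samples, te_col=3):
    """Merge per-sample BED lines into minimal VCF lines.

    samples: dict mapping sample name -> list of tab-separated BED lines.
    te_col:  0-based index of the transposon-name column (assumed).
    """
    names = sorted(samples)
    sites = {}  # (chrom, 1-based pos, te name) -> set of samples with a call
    for name in names:
        for line in samples[name]:
            f = line.rstrip("\n").split("\t")
            if len(f) <= te_col or not f[1].isdigit():
                continue  # skip header rows or malformed lines
            pos = int(f[1]) + 1  # BED start is 0-based, VCF POS is 1-based
            sites.setdefault((f[0], pos, f[te_col]), set()).add(name)

    out = [
        "##fileformat=VCFv4.2",
        '##ALT=<ID=INS,Description="Insertion">',
        '##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">',
        '##INFO=<ID=TE,Number=1,Type=String,Description="Transposon name (illustrative tag)">',
        '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
        "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL",
                   "FILTER", "INFO", "FORMAT"] + names),
    ]
    for key in sorted(sites):
        chrom, pos, te = key
        # "1" = insertion called in this sample, "." = no call (placeholder)
        calls = ["1" if n in sites[key] else "." for n in names]
        out.append("\t".join([chrom, str(pos), ".", "N", "<INS>", ".", ".",
                              f"SVTYPE=INS;TE={te}", "GT"] + calls))
    return out
```

In real data the same insertion is often called at slightly different breakpoints in different samples, so an exact-position merge like this will likely split some shared insertions; clustering nearby calls before writing the VCF would be a reasonable next step.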