Run yhaplo on b38 VCF files
I have a .vcf file of Y SNPs aligned to the b38 reference genome. However, I notice that several files in the input directory are based on b37 coordinates. Any advice on how to deal with b38 Y SNPs? Should I lift over from b38 to b37 first and then run yhaplo, or is there another workflow?
Thanks!
I imagine a b38→b37 LiftOver should do the trick.
Alternatively, it looks like ISOGG lists b38 coordinates for all of these SNPs on the spreadsheet linked from this page: https://isogg.org/tree/ISOGG_YDNA_SNP_Index.html
So you could read in the mapping and replace the b37 coordinate values in your local version of yhaplo's input/isogg.* files. One caveat is that input/isogg.2016.01.04.txt has some formatting issues, as it was copied directly from the ISOGG website at the time. That might make it hard to edit. When yhaplo runs, it cleans and processes this file, so the output file output/isogg.snps.unique.2016.01.04.txt may make for a better starting point.
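For illustration, here is a minimal sketch of that coordinate swap. It is not part of yhaplo: the function name replace_positions, the tab-delimited layout, the column indices, and the toy SNP names and positions are all assumptions made up for the example, so check the real column order of your cleaned file and of the ISOGG spreadsheet before adapting it.

```python
import csv
import io


def replace_positions(rows, b38_by_snp, name_col=0, pos_col=1):
    """Swap each row's position for its b38 value, keyed by SNP name.

    Hypothetical helper: rows are lists of strings, b38_by_snp maps
    SNP name -> b38 position. Rows with no b38 entry are dropped and
    their names are returned so they can be reviewed by hand.
    """
    updated, missing = [], []
    for row in rows:
        name = row[name_col]
        if name in b38_by_snp:
            new_row = list(row)  # copy so the input rows are not mutated
            new_row[pos_col] = str(b38_by_snp[name])
            updated.append(new_row)
        else:
            missing.append(name)
    return updated, missing


# Toy data only: invented names and positions, assumed column order
# (name, haplogroup, position, mutation). Not real ISOGG content.
toy_b37 = "snpA\thapX\t100\tA->G\nsnpB\thapY\t150\tC->T\n"
toy_b38 = {"snpA": 200}

rows = list(csv.reader(io.StringIO(toy_b37), delimiter="\t"))
updated, missing = replace_positions(rows, toy_b38, name_col=0, pos_col=2)

print(updated)  # rows that received a b38 position
print(missing)  # SNP names with no b38 position in the mapping
```

Dropping unmapped SNPs, rather than keeping their b37 positions, avoids mixing coordinates from two builds in one file. The returned list of missing names shows how much of the SNP set was lost.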
LiftOver is probably easier, if that works :)