vibansal/HapCUT2

How to process the HapCutToVcf result


Hi,

I'm trying to use Hi-C, PacBio and 10X data to phase one chromosome that is highly chimeric in my genome.
The steps are as follows:
##10x##

extractHAIRS --10X 1 --bam ./hic_separated/10x.REF_chr7.bam --VCF chr7.vcf --out 10x_unlinked_chr7_file
python3 HapCUT2-1.3.1/utilities/LinkFragments.py --bam ./hic_separated/10x.chr7.bam --VCF chr7.vcf --fragments 10x_unlinked_chr7_file --out 10x_linked_chr7_file

##hic##
extractHAIRS --HiC 1 --bam ./hic_separated/hic.REF_chr7.bam --VCF ./chr7.vcf --out hic_chr7_file

##pacbio##
extractHAIRS --pacbio 1 --new_format 1 --ref ../../00.data/phyllodactylus_wirshingi.ZW.fasta --bam hic_separated/pacbio.REF_chr7.bam --VCF chr7.vcf --out pacbio_chr7_file

##merge##
cat 10x_linked_chr7_file hic_chr7_file pacbio_chr7_file > all.chr7_file
HAPCUT2 --fragments all.chr7_file --VCF chr7.vcf --output chr7.haplotype.hap --hic 1 --htrans_data_outfile chr7.haplotype.htrans_model

Is there a problem with the above steps?

##get fasta##
java -jar fgbio-1.3.0.jar HapCutToVcf -v chr7.haplotype.hap.phased.VCF -i chr7.haplotype.hap -o chr7.final.vcf

Now I have obtained chr7.final.vcf from the block file and the phased.VCF file, and I want to phase chr7 into two FASTA sequences.
Should I use the GT value (the GT field in the last column of the VCF) to extract the two haplotypes, or something else? (The "HapCUT2-1.3.1/outputformat.md" file says that "0" represents the reference allele and "1" the variant allele, that column 2 is the allele on haploid chromosome copy A, and that column 3 is the allele on haploid chromosome copy B.)
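
To make the question concrete, here is a minimal sketch of what I mean by extracting with GT. It assumes that a phased genotype such as `0|1` means the allele before the `|` belongs to copy A and the one after it to copy B, that only single-base substitutions are applied, and that the sample of interest is the first sample column. The function names `phased_alleles` and `apply_snvs` are hypothetical helpers of my own, not part of HapCUT2 or fgbio. As far as I understand, the A/B orientation is only consistent within one phase block, so this would probably have to be done per block rather than across the whole chromosome.

```python
def phased_alleles(vcf_line, sample_index=0):
    """Return (allele_A, allele_B) for a phased genotype, or None if unphased or missing."""
    fields = vcf_line.rstrip("\n").split("\t")
    alleles = [fields[3]] + fields[4].split(",")   # index 0 = REF, 1.. = ALT alleles
    fmt_keys = fields[8].split(":")
    sample = fields[9 + sample_index].split(":")
    gt = sample[fmt_keys.index("GT")]
    if "|" not in gt:                              # "0/1" style genotypes are unphased
        return None
    a, b = gt.split("|")
    if "." in (a, b):                              # missing allele call
        return None
    return alleles[int(a)], alleles[int(b)]


def apply_snvs(ref_seq, vcf_lines):
    """Build two haplotype sequences from ref_seq using phased single-base variants only."""
    hap_a, hap_b = list(ref_seq), list(ref_seq)
    for line in vcf_lines:
        if line.startswith("#"):                   # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        pair = phased_alleles(line)
        # Assumption: indels and unphased sites are skipped and left as reference.
        if pair is None or len(fields[3]) != 1 or any(len(x) != 1 for x in pair):
            continue
        pos = int(fields[1]) - 1                   # VCF positions are 1-based
        hap_a[pos], hap_b[pos] = pair
    return "".join(hap_a), "".join(hap_b)
```

For example, calling `apply_snvs(chr7_sequence, open("chr7.final.vcf"))` would return the two sequences as strings, where `chr7_sequence` is the chr7 reference sequence loaded separately.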

Thank you in advance,

MengMeng