Use of Purge Haplotigs with FALCON-Phase
Juke34 opened this issue · 1 comments
Here is a summary of my assembly sizes after each step:
assembly | primary size (bp) | haplotig size (bp) | total (bp) |
---|---|---|---|
falcon unzip | 879494072 | 161125037 | 1040619109 |
falcon unzip post purge Haplotigs | 615743546 | 420594391 | 1036337937 |
falcon phase round1 | 679010065 | 280054206 | 959064271 |
scaffolding with allHic | 679064365 | 280054206 | 959118571 |
falcon phase round2 | 703831469 | 252736878 | 956568347 |
The first round of FALCON-Phase seems to move back into the primary assembly part of the sequence that Purge Haplotigs had filtered out of it.
Is this expected? I am wondering whether this result is normal or whether I should try different FALCON-Phase parameters.
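To make the shifts between steps easier to see, here is a minimal Python sketch that recomputes the totals and the step-to-step changes from the table above. The names `STEPS` and `size_changes` are illustrative only and are not part of any of the tools.

```python
# Sizes in bp, copied from the table above: (step, primary, haplotig).
STEPS = [
    ("falcon unzip", 879494072, 161125037),
    ("falcon unzip post purge Haplotigs", 615743546, 420594391),
    ("falcon phase round1", 679010065, 280054206),
    ("scaffolding with allHic", 679064365, 280054206),
    ("falcon phase round2", 703831469, 252736878),
]


def size_changes(steps):
    """Return one dict per step with its total and the change since the previous step."""
    rows = []
    previous = None
    for name, primary, haplotig in steps:
        row = {
            "step": name,
            "total": primary + haplotig,
            # The first step has no predecessor, so its deltas are None.
            "d_primary": None if previous is None else primary - previous[0],
            "d_haplotig": None if previous is None else haplotig - previous[1],
        }
        rows.append(row)
        previous = (primary, haplotig)
    return rows


if __name__ == "__main__":
    for row in size_changes(STEPS):
        print(row["step"], row["total"], row["d_primary"], row["d_haplotig"], sep="\t")
```

With these numbers, Purge Haplotigs moves about 264 Mb out of the primaries, and the first FALCON-Phase round adds about 63 Mb back to the primaries while the haplotigs shrink by about 141 Mb.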
Hi,
In general, yes, FALCON-Phase will move sequence from the haplotigs back into the primary contigs (and vice versa). What it is doing is looking at the Hi-C data to see if it can determine heterozygous sequences that likely originated from the same molecule in the nucleus. The long reads alone can often be insufficient information for correct phasing.
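As a rough illustration of the idea only, and not FALCON-Phase's actual algorithm or code, the toy Python sketch below decides whether neighbouring phase blocks should have their two copies swapped, based on which pairing has more Hi-C links. The function `phase_blocks`, the `links` layout, and the counts are all hypothetical.

```python
def phase_blocks(n_blocks, links):
    """Toy phasing: return, per block, whether its two copies should be swapped.

    links maps (block_i, copy_i, block_j, copy_j) to a Hi-C link count,
    where copy 0 is the current primary copy and copy 1 the current haplotig.
    This greedy neighbour-by-neighbour rule is an assumption made for illustration.
    """
    swapped = [False]  # the first block fixes the reference orientation
    for j in range(1, n_blocks):
        i = j - 1
        # Links supporting the current assignment (same-numbered copies together).
        cis = links.get((i, 0, j, 0), 0) + links.get((i, 1, j, 1), 0)
        # Links supporting the opposite assignment.
        trans = links.get((i, 0, j, 1), 0) + links.get((i, 1, j, 0), 0)
        flip_relative_to_previous = trans > cis
        # Carry the previous block's state forward, flipping it if needed.
        swapped.append(swapped[i] != flip_relative_to_previous)
    return swapped


# Hypothetical link counts for three blocks.
example_links = {
    (0, 0, 1, 0): 2, (0, 1, 1, 1): 1,   # weak support for the current pairing
    (0, 0, 1, 1): 10, (0, 1, 1, 0): 9,  # strong support for swapping block 1
    (1, 0, 2, 0): 8, (1, 1, 2, 1): 7,   # blocks 1 and 2 agree with each other
    (1, 0, 2, 1): 1,
}

if __name__ == "__main__":
    print(phase_blocks(3, example_links))
```

In this toy case blocks 1 and 2 are both swapped relative to block 0, which is the kind of exchange between primary and haplotig sequence that changes the sizes of the two sets.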
Thanks,
Shawn