--haploid setting less sensitive than default
NatPRoach opened this issue · 3 comments
Hello,
I'm not sure if this is intended behavior, but the --haploid flag appears to be significantly less sensitive at detecting variants than the default setting. Looking at the code in call_var.py, this appears to be caused by the following snippet:
    if output_config.is_haploid_mode_enabled:
        if (
            is_hetero_SNP or is_hetero_ACGT_Ins or is_hetero_InsIns or
            is_hetero_ACGT_Del or is_hetero_DelDel or is_insertion_and_deletion
        ):
            return (
                (True, False, False, False, False, False, False, False, False, False),
                (reference_base_ACGT, reference_base_ACGT)
            )
Based on the later return statement in that function,
    return (
        (
            is_reference, is_homo_SNP, is_hetero_SNP,
            is_homo_insertion, is_hetero_ACGT_Ins, is_hetero_InsIns,
            is_homo_deletion, is_hetero_ACGT_Del, is_hetero_DelDel,
            is_insertion_and_deletion
        ),
        (reference_base, alternate_base)
    )
this seems to mean that whenever a heterozygous variant is detected in --haploid mode, the call defaults to reporting the reference.
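To make the effect concrete, here is a minimal, self-contained sketch of my reading of that logic. It is not the actual call_var.py code: the function name apply_haploid_override and the constants FLAG_NAMES and HETERO_LIKE are hypothetical, and only the flag order and the override condition are taken from the snippets above.

```python
# Flag order copied from the final return statement quoted above.
FLAG_NAMES = (
    "is_reference", "is_homo_SNP", "is_hetero_SNP",
    "is_homo_insertion", "is_hetero_ACGT_Ins", "is_hetero_InsIns",
    "is_homo_deletion", "is_hetero_ACGT_Del", "is_hetero_DelDel",
    "is_insertion_and_deletion",
)

# Flags tested by the haploid branch quoted above.
HETERO_LIKE = {
    "is_hetero_SNP", "is_hetero_ACGT_Ins", "is_hetero_InsIns",
    "is_hetero_ACGT_Del", "is_hetero_DelDel", "is_insertion_and_deletion",
}


def apply_haploid_override(flags, reference_base, alternate_base, haploid):
    """Hypothetical stand-in for the end of the calling function."""
    named = dict(zip(FLAG_NAMES, flags))
    if haploid and any(named[name] for name in HETERO_LIKE):
        # Mirrors the quoted branch: only is_reference is True and both
        # reported bases are the reference base.
        reference_only = tuple(name == "is_reference" for name in FLAG_NAMES)
        return reference_only, (reference_base, reference_base)
    return tuple(flags), (reference_base, alternate_base)


# A site classified as a heterozygous SNP (A -> G).
hetero_snp = tuple(name == "is_hetero_SNP" for name in FLAG_NAMES)

default_call = apply_haploid_override(hetero_snp, "A", "G", haploid=False)
haploid_call = apply_haploid_override(hetero_snp, "A", "G", haploid=True)
```

In this sketch, default_call keeps the heterozygous SNP with bases ("A", "G"), while haploid_call collapses to the reference with bases ("A", "A"), which is the loss of sensitivity I'm describing.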
My expectation was that --haploid mode would be more sensitive at detecting low-frequency variants rather than defaulting to the reference more frequently. If this is intended behavior, it may be worth clarifying in the --help text what --haploid mode is doing behind the scenes and what assumptions it makes.
Thanks!
Thanks for the suggestion. Maybe I should provide two modes, --haploid_sensitive and --haploid_accurate.
Two new modes added. --haploid_precision will treat heterozygous-like positions as non-variant. --haploid_sensitive will treat heterozygous-like positions as variant.
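As a rough illustration of the difference between the two modes, the sketch below assumes each site has already been classified as homozygous-variant and/or heterozygous-like. The function is_reported_as_variant and the mode strings "precision" and "sensitive" are hypothetical stand-ins for the two flags, not names from call_var.py.

```python
def is_reported_as_variant(is_homo_variant, is_hetero_like, mode):
    """Decide whether a site is reported as a variant under a haploid mode.

    mode is "precision" (stand-in for --haploid_precision) or
    "sensitive" (stand-in for --haploid_sensitive).
    """
    if mode == "precision":
        # Heterozygous-like positions are treated as non-variant.
        return is_homo_variant
    if mode == "sensitive":
        # Heterozygous-like positions are treated as variant.
        return is_homo_variant or is_hetero_like
    raise ValueError(f"unknown haploid mode: {mode!r}")


# A heterozygous-like site is dropped in one mode and kept in the other.
precision_result = is_reported_as_variant(False, True, "precision")
sensitive_result = is_reported_as_variant(False, True, "sensitive")
```

Homozygous variants are reported in both modes in this sketch; only heterozygous-like sites are handled differently.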
Awesome, thanks for the quick turnaround on this!