Question about the training data label generation code - get_candidates.py

Dear ClairS Team,

While reviewing the code at the following link:

Line 348 in c464c98

    
           normal_af, vt, max_af = find_candidate_match(alt_info_dict=alt_dict[pos].alt_dict, ref_base=ref_base,

I noticed a potential issue in this call to the find_candidate_match function.
It appears that the call passes the tumor alt_dict instead of the normal alt_dict.
Could you please take a moment to verify this?

Thank you for your time and support!

Best regards,

Hi, @quito418,

Many thanks for reporting this! It should be the normal alt_dict (paired_alt_dict). I expect the bug only excluded a small proportion of candidates from training. We will try adding those candidates to training to further verify the results.
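
To make the difference concrete, here is a toy sketch of why the choice of dictionary matters. It is not the ClairS code: `toy_alt_fraction`, the dictionary layout, and the read counts are all hypothetical and made up for illustration, and the real find_candidate_match signature is not reproduced here.

```python
# Toy illustration only: every name and number below is hypothetical,
# not taken from ClairS internals.

def toy_alt_fraction(alt_counts, depth, alt_base):
    """Return the fraction of reads supporting alt_base (0.0 if depth is 0)."""
    if depth == 0:
        return 0.0
    return alt_counts.get(alt_base, 0) / depth

# Per-position alt counts for the tumor and the paired normal sample.
tumor_alt = {100: {"counts": {"T": 12}, "depth": 40}}
normal_alt = {100: {"counts": {"T": 1}, "depth": 50}}

pos, alt_base = 100, "T"

# Looking up the "normal" allele fraction in the tumor dict yields the tumor value.
af_from_tumor = toy_alt_fraction(tumor_alt[pos]["counts"], tumor_alt[pos]["depth"], alt_base)

# Looking it up in the paired normal dict yields the intended value.
af_from_normal = toy_alt_fraction(normal_alt[pos]["counts"], normal_alt[pos]["depth"], alt_base)

print(af_from_tumor, af_from_normal)
```

In this toy case the two lookups disagree by an order of magnitude, so any filter applied to the normal allele fraction could drop a candidate that would otherwise have been kept.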

Zhenxian

Thanks, @zhenxian! Looking forward to the release of ClairS.