FowlerLab/Enrich2

Feature request: call MNPs

ijhoskins opened this issue · 2 comments

Hello,

I am hoping to compare variant calls using Enrich2 for variants generated with NNK codon mutagenesis. Based on the code in variants.py, it seems multi-nucleotide polymorphisms (MNPs) are not called:

    traceback = self.aligner.align(self.wt.dna_seq, variant_dna)
    for x, y, cat, length in traceback:
        if cat == "match":
            continue
        elif cat == "mismatch":
            mut = "{pre}>{post}".format(pre=self.wt.dna_seq[x], post=variant_dna[y])

Would it be too difficult to support this functionality?

Sorry, I will have to test the code to be sure, but it looks like haplotypes may be called via variants.count_variants()?

    for pos, change in mutations:
        ref_dna_pos = pos + self.wt.dna_offset + 1
        ref_pro_pos = pos / 3 + self.wt.protein_offset + 1
        mut = "c.{pos}{change}".format(pos=ref_dna_pos, change=change)

Thanks for checking out Enrich2! The software does indeed support NNK codon mutagenesis and will call variants with multiple mutations.

One caveat is that Enrich2 filters based on the number of nucleotide changes in a variant, not the number of affected codons. Make sure you set the maximum mutations in your configuration to be high enough that your multi-mutants are not filtered out.

You may also need to do some post-analysis filtering if you want to limit your analysis to the variants that have mutations in only one codon.
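
As a rough illustration of that post-analysis filtering, here is a minimal Python sketch. It is not part of Enrich2: the helper names `codons_hit` and `is_single_codon` are hypothetical. It assumes variant strings contain substitutions in the `c.<pos><ref>><alt>` form produced by the format string quoted above, and that the wild-type sequence starts in frame, so that coding position 1 is the first base of a codon. If your experiment uses a nonzero DNA offset, pass it as `dna_offset`.

```python
import re

# Matches substitutions such as "c.4A>G". The pattern is an assumption based on
# the "c.{pos}{change}" format string quoted earlier in this thread.
_DNA_CHANGE = re.compile(r"c\.(\d+)([ACGTN])>([ACGTN])")


def codons_hit(variant, dna_offset=0):
    """Return the set of 0-based codon indices touched by a variant string."""
    codons = set()
    for match in _DNA_CHANGE.finditer(variant):
        # Undo the reporting offset to get a 1-based position in the
        # wild-type coding sequence, then map it to a codon index.
        pos = int(match.group(1)) - dna_offset
        codons.add((pos - 1) // 3)
    return codons


def is_single_codon(variant, dna_offset=0):
    """True if every nucleotide change in the variant falls in one codon."""
    return len(codons_hit(variant, dna_offset)) == 1


# Made-up example variants, for illustration only.
variants = [
    "c.4A>G, c.5C>T, c.6G>T",  # three nucleotide changes, one codon
    "c.4A>G, c.7C>T",          # two nucleotide changes, two codons
    "c.2T>C",                  # one nucleotide change, one codon
]
single = [v for v in variants if is_single_codon(v)]
```

Here the first variant carries three nucleotide changes, so it would need a maximum-mutations setting of at least 3 to survive Enrich2's own filter, yet it affects only one codon and is kept by `is_single_codon`. Strings with no matching substitution yield an empty set and are excluded.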