ArtPoon/gotoh2

Strange bit matrix behaviour with local alignment

If we run:

        self.g2.set_model('HYPHY_NUC')
        self.g2.gap_open_penalty = 5
        self.g2.gap_extend_penalty = 1
        self.g2.is_global = False
        result = self.g2.align('AT', 'ATTTTT')
        expected = ('AT----', 'ATTTTT', 10)
        self.assertEqual(expected, result)

then the result is fine. If we append a T to the second sequence (ATTTTTT), an exception is raised:

traceback failed: i=2 j=2 bit=64
RuntimeError: Traceback failed, try local alignment

where the first line is extra debugging output that I added.

These are the bit matrices after edge assignment:

ATTTTT: works
 0 80 80  80  0 80 16 0 
80  4 74 106 98 46  8 0 
64 96  4  66 98 98 38 0 
 0  0  0   0  0  0  0 4 

ATTTTTT: fails
80 80 80 80 80  0 80 16 0 
80 80 56 40 40 32 44  8 0 
64 96 64 32 32 32 32  4 0 
 0  0  0  0  0  0  0  0 4 

Bit matrices before edge assignment:

ATTTTT:
80 80 80 80 80 80 16 0 
80 84 58 42 42 46 12 0 
64 97 68 38 38 38  6 0 
 0  0  0  0  0  0  0 4 

ATTTTTT:
80 80 80 80 80 80 80 16 0 
80 84 58 42 42 46 44 12 0 
64 97 68 38 38 38 38  4 0 
 0  0  0  0  0  0  0  0 4 

Simpler case: aligning A against ATTTTT fails, whereas A against ATTTT works okay.
The cost matrix is:

     *   A   T   T   T   T   T
*    0   0   0   0   0   0   0
A    0  -5   1   2   3   4   4

Something interesting happens at the highlighted entry -- the Q value 4 equals the mismatch penalty 4. Right of this, Q exceeds this value. It looks like this results in a dropped b bit at the lower-right corner, which causes the entire row to get zeroed out at edge assignment (step 8).

Tentatively closing...

Found an MRE (minimal reproducible example); see the issue16 branch.

art@Kestrel:~/git/gotoh2$ python3 tests/test.py TestIssues.test_issue16a
traceback failed: i=1 j=1 bit=0
E
======================================================================
ERROR: test_issue16a (__main__.TestIssues)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/test.py", line 284, in test_issue16a
    self.g2.align(ref, query)
  File "/usr/local/lib/python3.5/dist-packages/gotoh2-0.1-py3.5-linux-x86_64.egg/gotoh2.py", line 99, in align
    self.matrix
RuntimeError: Traceback failed, try local alignment

----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (errors=1)

The test sequences are ACGTACG and TC.
The cost matrix (R) is:

0 0 0 
0 4 4 
0 4 -1 
0 4 8 
0 -5 6 
0 4 -1 
0 4 -1 
0 4 8 

but the positive entries (>0, high cost) are being zeroed out for local alignment:

0 0 0 
0 0 0 
0 0 -5 
0 0 0 
0 -5 0 
0 0 -1 
0 0 -5 
0 0 0 

Actually I just noticed something wrong here -- zeroing out entries in the R matrix while looping through rows and columns screws up the end result.
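
To see why the order matters, here is a minimal Python sketch. It is not gotoh2's code: it assumes a simplified recurrence with a flat per-gap cost of 11 (a hypothetical value chosen only because it reproduces the R matrix quoted above) instead of the affine gap model, and the names `fill` and `clamp_after` are made up for illustration. It compares zeroing positive entries inside the fill loop with zeroing them after the matrix is complete.

```python
def fill(ref, query, clamp_in_loop, match=-5, mismatch=4, gap=11):
    """Fill a cost matrix R (lower is better) with free leading gaps.

    gap=11 is a hypothetical flat per-gap cost, not gotoh2's affine model.
    """
    R = [[0] * (len(query) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        for j in range(1, len(query) + 1):
            s = match if ref[i - 1] == query[j - 1] else mismatch
            R[i][j] = min(R[i - 1][j - 1] + s,   # (mis)match
                          R[i - 1][j] + gap,     # gap in query
                          R[i][j - 1] + gap)     # gap in ref
            if clamp_in_loop and R[i][j] > 0:
                R[i][j] = 0  # zeroed value is then read by later cells
    return R


def clamp_after(R):
    """Zero positive entries only once the whole matrix is filled."""
    return [[min(v, 0) for v in row] for row in R]


raw = fill("ACGTACG", "TC", clamp_in_loop=False)
in_loop = fill("ACGTACG", "TC", clamp_in_loop=True)
after = clamp_after(raw)

for name, mat in (("raw", raw), ("in_loop", in_loop), ("after", after)):
    print(name, mat)

# Cells (row, column) where the two zeroing strategies disagree.
diff = [(i, j) for i, row in enumerate(in_loop)
        for j, v in enumerate(row) if v != after[i][j]]
print(diff)  # [(2, 2), (6, 2)]
```

Under these assumptions, `raw` matches the R matrix above and `in_loop` matches the zeroed matrix. The two strategies disagree at (2, 2) and (6, 2): a diagonal neighbour of 4 that was already zeroed turns the match score from -1 into -5, so the zeroing changes cells downstream of the one being zeroed.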