ArtPoon/gotoh2

Refactor broke traceback

Closed this issue · 6 comments

Original motivation for issue #1.
I've substituted a C struct for the alignment matrices being passed around among the core functions, which tidies things up a fair amount. For the first unit test, the R matrix is computed correctly:

0 6 7 8 
6 -5 1 2 
7 1 -10 -4 
8 2 -4 -6 
9 3 -3 -9 

but the traceback is wrong:

alen i j type
0 4 3 Vertical
1 3 3 Diagonal
2 2 2 D
3 1 1 D

P matrix:

2147483647 2147483647 2147483647 2147483647 
6 12 13 14 
7 1 7 8 
8 2 -4 2 
9 3 -3 0

Q matrix:

2147483647 6 7 8 
2147483647 12 1 2 
2147483647 13 7 -4 
2147483647 14 8 2 
2147483647 15 9 3 

These seem to be correct.

bit matrix:

0 16 16 16 
64 84 50 18 
64 73 84 18 
64 73 73 20 
64 65 65 4 

This also matches what I'm getting from the master branch, so the problem is not in the matrix computation.
Checking edge assignment next.

bit matrix after edge assignment:

0 23 23 7 
64 84 50 18 
64 79 84 7 
64 79 73 20 
64 7 65 7 

Aha! This is totally different!

I suspect this is an edge-effect problem. The edge assignment procedure iterates through the bit matrix starting from the lower-right cell. For each cell it looks up the cells to the right, below, and diagonally down-right. At the starting cell, these neighbours are unassigned, so they are read as random noise that can still take part in bitwise operations.
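One way to rule out reads of unassigned cells is to route every neighbour lookup through a bounds check. The following is only a sketch: it assumes the bit matrix is stored as a flat row-major `int` array and that the unassigned neighbours are the ones past the last row or column. The names `get_cell` and `read_neighbours` are hypothetical and not taken from the gotoh2 source, and using 0 as the fallback value is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: read cell (i, j) of a row-major nrows x ncols
 * matrix, returning `fallback` instead of touching memory outside it. */
int get_cell(const int *m, int nrows, int ncols, int i, int j, int fallback)
{
    assert(m != NULL);
    if (i < 0 || i >= nrows || j < 0 || j >= ncols) {
        return fallback;
    }
    return m[i * ncols + j];
}

/* Collect the three neighbours that edge assignment looks at:
 * out[0] = right, out[1] = below, out[2] = diagonal (down-right).
 * Off-matrix neighbours come back as 0 (no bits set). */
void read_neighbours(const int *bits, int nrows, int ncols,
                     int i, int j, int out[3])
{
    out[0] = get_cell(bits, nrows, ncols, i, j + 1, 0);
    out[1] = get_cell(bits, nrows, ncols, i + 1, j, 0);
    out[2] = get_cell(bits, nrows, ncols, i + 1, j + 1, 0);
}
```

With this in place, the lower-right cell sees three zeroed neighbours rather than whatever happens to sit in memory after the array.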

As an experiment, I started the calculation at the cell to the upper-left of the original starting cell and got this bit matrix after edge assignment:

0 16 16 16 
64 84 55 18 
64 79 84 18 
64 79 73 20 
64 65 65 4 

This is actually closer to what I have been getting in the master branch (note that since the master branch uses the same edge assignment code, it shouldn't be relied on either!). I need to work out by hand what the bit matrices are supposed to look like :-/

Bit matrix after edge assignment as of commit 950a81c:

0 16 16 16 
64 4 66 50 
64 17 4 18 
64 25 1 20 
64 73 73 4 

Binary conversion to paths:

none   e   e   e
   g   c  bg bcg
   g  ae   c  be
   g ade   a  ce
   g adg adg   c
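The conversion above can be automated rather than done by hand. This sketch assumes the letters map to bits in order (a = 1, b = 2, c = 4, d = 8, e = 16, f = 32, g = 64), which is read off entries such as 17 → ae and 73 → adg and is not confirmed against the gotoh2 source. The function name `decode_bits` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write the path letters for a bit-matrix value into `out`.
 * `out` must hold at least 8 bytes (7 letters plus the terminator).
 * Assumed mapping: bit 0 -> 'a', bit 1 -> 'b', ..., bit 6 -> 'g'. */
void decode_bits(int value, char *out)
{
    size_t n = 0;
    int bit;

    assert(out != NULL);
    assert(value >= 0 && value < 128);

    if (value == 0) {
        strcpy(out, "none");
        return;
    }
    for (bit = 0; bit < 7; bit++) {
        if (value & (1 << bit)) {
            out[n++] = (char)('a' + bit);
        }
    }
    out[n] = '\0';
}
```

Calling `decode_bits(25, buf)` with a `char buf[8]` fills `buf` with `ade`, matching the table entry for 25.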