marbl/MetagenomeScope

Draw arrowheads in both directions for self-implying edges


For example, in the Flye Yeast .gv file included in AGB's repository, there exists an edge

"54" -> "55" [label = "id 50\l1.8k 202x", color = "goldenrod" , penwidth = 3, dir = both] ;

If we visualize this graph with dot, then it draws this edge with arrowheads in both directions due to the dir = both property:

[image: dot rendering of this edge, with an arrowhead at each end]

The AGB visualization of this graph (see component 2) seems to ignore this property: as far as I can tell, it only draws an arrowhead from 54 → 55, not the other way around. (This edge is present in component 2 in the default mode.)
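For what it's worth, picking the dir attribute out of an edge line like the one above doesn't take much. Here is a rough sketch using regular expressions; parse_edge and both patterns are hypothetical names made up for this example (not Flye, AGB, or MetagenomeScope code), and a real DOT parser would be more robust than this:

```python
import re

# Hypothetical patterns for a single DOT edge line of the form
#   "src" -> "tgt" [key = value, key = "quoted value", ...] ;
EDGE_RE = re.compile(r'"(?P<src>[^"]+)"\s*->\s*"(?P<tgt>[^"]+)"\s*\[(?P<attrs>.*)\]')
# A value is either a double-quoted string (backslash escapes allowed)
# or a bare token that runs up to the next comma, space, or "]".
ATTR_RE = re.compile(r'(\w+)\s*=\s*("(?:[^"\\]|\\.)*"|[^,\s\]]+)')


def parse_edge(line):
    """Return (src, tgt, attrs) for an edge line, or None if it isn't one."""
    match = EDGE_RE.search(line)
    if match is None:
        return None
    attrs = {key: value.strip('"') for key, value in ATTR_RE.findall(match.group("attrs"))}
    return match.group("src"), match.group("tgt"), attrs


line = r'"54" -> "55" [label = "id 50\l1.8k 202x", color = "goldenrod" , penwidth = 3, dir = both] ;'
src, tgt, attrs = parse_edge(line)
# An edge with no dir attribute at all is treated as single-headed here.
draw_both_arrowheads = attrs.get("dir") == "both"
```

Note that all attribute values come back as strings here (so penwidth is "3", not 3).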

From looking at Flye's code (see these lines), it looks like the dir = both property is given to repetitive contigs that are "self complements."

I don't know enough about the Flye codebase to say exactly what this means, but my guess is that this is similar to (or the same situation as?) another rare case we can encounter: an edge that "implies itself" in a graph. Consider this GFA file, in which an edge exists from +2 → -2. The complement of an edge A → B is -B → -A, so the complement of this edge is -(-2) → -(+2) = +2 → -2, meaning that this edge and its complement are identical.
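To make the "implies itself" check concrete, here is a small sketch. It assumes oriented node names are strings with a leading "+" or "-" (as in the +2 → -2 example); negate, complement_edge, and is_self_implying are hypothetical helpers for illustration, not functions from the existing parsers:

```python
def negate(node_id):
    """Flip the orientation of a node name: "+2" <-> "-2"."""
    sign, name = node_id[0], node_id[1:]
    if sign not in "+-":
        raise ValueError(f"expected a leading + or -, got {node_id!r}")
    return ("-" if sign == "+" else "+") + name


def complement_edge(src, tgt):
    """The complement of src -> tgt is -(tgt) -> -(src)."""
    return (negate(tgt), negate(src))


def is_self_implying(src, tgt):
    """True if the edge is identical to its own complement."""
    return complement_edge(src, tgt) == (src, tgt)


# The edge from the GFA example is its own complement...
loop_is_self_implying = is_self_implying("+2", "-2")
# ...while an ordinary edge has a distinct complement.
ordinary_complement = complement_edge("+1", "+2")
```

In other words, an edge src → tgt implies itself exactly when tgt is the negation of src.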

Until now, I have just been detecting these cases and saying "okay, let's just draw one copy of this edge." But it would be nice to draw these edges with arrowheads in both directions: that way there's still just one edge, but its "implies itself" nature is clear.

So, this is low-priority, but it would be nice to draw two arrowheads, both for dir = both edges in Flye output and for self-implying edges in other graphs. Probably the best way to do this is to also pass along a dir property for edges in non-Flye output (only if any of the edges are self-implying; if not, we can leave it out). Then the visualization can see this property and adjust how these edges are drawn accordingly. (Ideally we'd also add some help text somewhere in the interface explaining this.)
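One way this could look, as a sketch only: tag each self-implying edge with dir = "both" and leave the property off every other edge, so that a graph without any such edges carries no dir property at all. annotate_dir and negate are hypothetical names, node names are assumed to carry a leading "+" or "-", and whether the other edges should get an explicit dir value is an open choice that this sketch doesn't settle:

```python
def negate(node_id):
    """Flip the orientation of a node name: "+2" <-> "-2"."""
    return ("-" if node_id[0] == "+" else "+") + node_id[1:]


def annotate_dir(edges):
    """Attach dir="both" to self-implying edges.

    edges: list of (src, tgt) tuples of oriented node names.
    Returns a list of (src, tgt, attrs) tuples, where attrs is a dict
    that is empty for ordinary edges.
    """
    annotated = []
    for src, tgt in edges:
        attrs = {}
        # An edge implies itself if its complement, -(tgt) -> -(src),
        # is the same edge.
        if (negate(tgt), negate(src)) == (src, tgt):
            attrs["dir"] = "both"
        annotated.append((src, tgt, attrs))
    return annotated


annotated = annotate_dir([("+1", "+2"), ("+2", "-2")])
# The visualization could then check attrs.get("dir") == "both" when
# deciding whether to draw a second arrowhead.
```

This mirrors what the Flye DOT edge shown earlier does with dir = both, so the visualization would only need one code path for both kinds of input.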

Besides dir = both cases in Flye DOT files, I think this can only occur in LastGraph and GFA files, at least right now. This is because for other filetypes (GML and FASTG) we don't "complement" the graph.

Where this happens in the LastGraph parser:

# Only add implied edge if the edge does not imply itself
# (e.g. "ABC" -> "-ABC" or "-ABC" -> "ABC")
if not (id1 == nid2 and id2 == nid1):
    digraph.add_edge(nid2, nid1, multiplicity=multiplicity)

Where this happens in the GFA parser:

# Don't add an edge twice if its complement is itself (as in the
# loop.gfa test case)
if complement_tuple != edge_tuple:
    digraph.add_edge(*complement_tuple)