PoonLab/covizu

Tips in timetree are being rendered outside of panel

Closed this issue · 12 comments

Looks like this problem is at the level of generating the timetree (visualizing timetree.nwk below):
Screenshot 2024-02-20 at 10 41 42 AM

@GopiGugan I'm running some tests on Paphlagon

I'm using the following script to extract the by_lineage.json file from the latest provision:

import json  # needed for json.dump() at the end of the script
import os

import covizu
from covizu.utils import gisaid_utils
from covizu.utils.progress_utils import Callback
from covizu.utils.batch_utils import *

cb = Callback()

infile = "data/provision.2024-02-17T00:00:21.json.xz"
ref_file = os.path.join(covizu.__path__[0], "data/NC_045512.fa")
vcf_file = os.path.join(
    covizu.__path__[0],
    "data/ProblematicSites_SARS-CoV2/problematic_sites_sarsCov2.vcf")

loader = gisaid_utils.load_gisaid(infile, minlen=29000, mindate="2019-12-01",
                                  debug=None) #100)
batcher = gisaid_utils.batch_fasta(loader, size=2000)
aligned = gisaid_utils.extract_features(
    batcher, ref_file=ref_file, binpath="minimap2", nthread=16, minlen=29000)
filtered = gisaid_utils.filter_problematic(
    aligned, vcf_file=vcf_file, cutoff=0.001, callback=cb.callback)
by_lineage = gisaid_utils.sort_by_lineage(filtered, callback=cb.callback)

with open("iss512.json", 'w') as outfile:
    json.dump(by_lineage, outfile)

Screenshot from 2024-02-22 14-01-45
Reproduced the tree. Looks like this is not a backend problem.

I'll take a look at the front end

I believe the issue is that we aren't scaling the nwk tree itself. Branch lengths are being assigned with the values in the nwk file:

covizu/server/phylo.js

Lines 59 to 71 in 10a4017

if (nodeinfo.length==1) {
  if (token.startsWith(':')) {
    curnode.label = "";
    curnode.branchLength = parseFloat(nodeinfo[0]);
  } else {
    curnode.label = nodeinfo[0];
    curnode.branchLength = null;
  }
}
else if (nodeinfo.length==2) {
  curnode.label = nodeinfo[0];
  curnode.branchLength = parseFloat(nodeinfo[1]);
}

I don't think this is the problem. The branch lengths in this tree are scaled in units of time (years), which is what we want to display. I think the problem is that we are not correctly calculating the maximum tip-to-root distance in this tree, which is needed to scale the tree to the horizontal dimension of this panel.
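For reference, a minimal sketch of how the maximum root-to-tip distance could be computed. It assumes each node has a `children` array, which is a hypothetical field; only `label` and `branchLength` appear in the phylo.js excerpt above, and the toy tree is made up for illustration.

```javascript
// Hypothetical node shape: { label, branchLength, children: [...] }.
// `children` is an assumption; the phylo.js excerpt only shows label and branchLength.
function maxRootToTip(root) {
  let maxDist = 0;
  // Iterative depth-first traversal, so very deep trees do not hit recursion limits.
  const stack = [{ node: root, dist: 0 }];
  while (stack.length > 0) {
    const { node, dist } = stack.pop();
    const children = node.children || [];
    if (children.length === 0) {
      // Reached a tip: keep the largest cumulative distance seen so far.
      maxDist = Math.max(maxDist, dist);
    }
    for (const child of children) {
      // A null branch length (none given in the Newick string) is treated as zero.
      stack.push({ node: child, dist: dist + (child.branchLength || 0) });
    }
  }
  return maxDist;
}

// Toy tree with made-up branch lengths (in years).
const tree = {
  label: "", branchLength: null,
  children: [
    { label: "A", branchLength: 0.5, children: [] },
    { label: "", branchLength: 0.25, children: [
      { label: "B", branchLength: 1.0, children: [] },
      { label: "C", branchLength: 0.5, children: [] },
    ]},
  ],
};
console.log(maxRootToTip(tree)); // 1.25
```

The returned value would be the upper bound needed for the horizontal scale, so that the most distant tip lands at the right edge of the panel.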

I suspect the problem is somewhere here:

covizu/js/drawtree.js

Lines 122 to 138 in 10a4017

// adjust d3 scales to data frame
if(!redraw) {
  switch($("#display-tree").val()) {
    case "Other Recombinants":
      xScale.domain([
        axis_padding,
        date_to_xaxis(d3.max(recombinant_tips, function(d) {return d.last_date}))
      ]);
      break;
    default:
      xScale.domain([
        d3.min(org_df, xValue)-axis_padding_trees,
        date_to_xaxis(d3.max(org_df, function(d) {return d.last_date}))
      ]);
      break;
  }
}

More problems:

  • The timetree axis of sampling dates is not updating with the slider for specifying the minimum date for rendering lineages! (Retracted: sorry, I forgot this was not part of the implementation.)
  • The box for a given lineage has incorrect horizontal coordinates relative to the timetree axis.

I suspect the problem is somewhere here:

covizu/js/drawtree.js

Lines 122 to 138 in 10a4017

Yes, it looks like this was the issue. The domain was not set correctly. I'm working on a fix for this.
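To illustrate why a mis-set domain pushes tips outside the panel, here is a small sketch with a plain-JavaScript stand-in for an unclamped linear scale (d3.scaleLinear does not clamp by default). The panel width and all values are hypothetical; this is not the actual covizu fix.

```javascript
// Minimal stand-in for an unclamped linear scale.
function linearScale(domain, range) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return (x) => r0 + ((x - d0) / (d1 - d0)) * (r1 - r0);
}

const panelWidth = 400;          // hypothetical panel width in pixels
const tipX = [0.0, 2.0, 4.0];    // hypothetical tip positions (years from root)
const otherX = [0.0, 1.0, 3.0];  // hypothetical values from a different data frame

// Domain derived from the wrong data: the latest tip overshoots the panel.
const wrongScale = linearScale([0, Math.max(...otherX)], [0, panelWidth]);
// Domain derived from the tips themselves: every tip stays inside the panel.
const rightScale = linearScale([0, Math.max(...tipX)], [0, panelWidth]);

const overshoot = tipX.filter((x) => wrongScale(x) > panelWidth);
const inside = tipX.every((x) => rightScale(x) <= panelWidth);
console.log(overshoot.length, inside); // 1 true
```

Whenever the domain maximum is smaller than the largest x value being drawn, the scale maps that value past the end of the range, which matches the symptom in the issue title.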

  • The box for a given lineage has incorrect horizontal coordinates relative to the timetree axis.

In this case the tree is being drawn correctly. However, the earliest collection date that we have is 2020-11-14, which is the earliest collection date among the variants that are being displayed.

An easy fix would be to change the label in the tooltip to read "Collection dates (displayed):", which would imply that we are showing the range of collection dates for the subset of variants in the beadplot.
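A rough sketch of what that label could look like, assuming the tooltip is built from the list of variants currently shown in the beadplot. The function name and the `coldate` field are hypothetical, and all dates except 2020-11-14 are made up for illustration.

```javascript
// Hypothetical helper: build the tooltip line from the displayed variants only.
// `coldate` is an assumed field holding an ISO 8601 date string (YYYY-MM-DD).
function displayedDateRange(variants) {
  // ISO 8601 dates sort correctly as plain strings.
  const dates = variants.map((v) => v.coldate).sort();
  if (dates.length === 0) return null;
  return `Collection dates (displayed): ${dates[0]} / ${dates[dates.length - 1]}`;
}

const shown = [
  { coldate: "2021-03-02" },
  { coldate: "2020-11-14" },
  { coldate: "2021-01-20" },
];
console.log(displayedDateRange(shown));
// Collection dates (displayed): 2020-11-14 / 2021-03-02
```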