konradjk/loftee

LofTee releases - grch37 vs. grch38


Hi (cc @vladsaveliev),

From what I can see, there are different branches for grch37 (master) and grch38. Yet the README of the grch38 branch contains several instructions on how to add grch37 tracks (conservation etc.). I am slightly confused as to which branch is preferred and most up to date. Is there a recommendation here that I have missed?

kind regards,
Sigve

Ah sorry for the confusion, some of that is legacy text. If you're using GRCh38, definitely use that branch. Otherwise use the master branch!
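The rule of thumb above can be written down as a small helper, for example in a pipeline that checks out LOFTEE automatically. This is only a sketch: the function name `loftee_branch` is hypothetical, and accepting the UCSC-style aliases `hg19` and `hg38` is an assumption that is not stated in this thread.

```python
def loftee_branch(genome_build: str) -> str:
    """Return the LOFTEE git branch recommended in this thread for a genome build."""
    build = genome_build.strip().lower()
    # GRCh38 users should use the dedicated grch38 branch.
    # ("hg38" is accepted as an alias; that is an assumption of this sketch.)
    if build in {"grch38", "hg38"}:
        return "grch38"
    # Everyone else (GRCh37) stays on master.
    # ("hg19" is accepted as an alias; again an assumption of this sketch.)
    if build in {"grch37", "hg19"}:
        return "master"
    raise ValueError(f"Unrecognised genome build: {genome_build!r}")
```

Raising an error for unknown builds, rather than silently defaulting to master, avoids running the wrong branch against a mislabelled reference.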

@konradjk, could you please clarify the difference between the branches a bit more? From a quick glance at the difference in commits between master and grch38, I can't spot anything build-specific (though I might be wrong). The main change is that grch38 uses a BigWig file instead of the GERP scores file and the PhyloCSF db, so the inputs are a bit different. As a consequence, I'm getting this error: WARNING: Failed to compile plugin LoF: Can't locate Bio/DB/BigWig.pm in @INC. I guess the VEP installation script didn't install this lib, likely because it uses the main branch of LOFTEE. I'm thinking about how to install it in the cleanest way, ideally with the VEP installation script or with conda. Finally, human_ancestor.fa.rz is disabled by default in grch38, which explains why I didn't get an error when I initially left this file out. Is there anything else I'm missing?

Yes, it's mostly about the inputs. In order to simplify things for the GRCh38 version, I switched to BigWig (which is now a dependency that I thought was installed by VEP, but I guess not? I can add that to the dependencies). There is no human_ancestor.fa.rz for GRCh38 (the .rz extension is no longer really used by samtools, which now allows .gz), so we've switched that over as well.

You're right that nothing in the codebase itself hard-codes anything about the reference, but I decided to create the branch as a clean version where I can make changes going forward that would be breaking for GRCh37, while keeping the old branch so people can still use it.
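For anyone hitting the Bio/DB/BigWig.pm error mentioned above, a quick way to see whether the module is visible to Perl before running VEP is to ask Perl to load it. The sketch below does that from Python. The helper name `perl_module_available` is hypothetical, and it assumes that the `perl` found on PATH (with the current PERL5LIB) is the same interpreter VEP will use, which may not hold for conda or containerised installs.

```python
import subprocess


def perl_module_available(module: str, perl: str = "perl") -> bool:
    """Return True if `perl -M<module> -e 1` succeeds, i.e. the module loads."""
    try:
        result = subprocess.run(
            [perl, f"-M{module}", "-e", "1"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # The perl executable itself was not found.
        return False
    # Perl exits non-zero when the module cannot be located in @INC.
    return result.returncode == 0


if __name__ == "__main__":
    module = "Bio::DB::BigWig"
    status = "found" if perl_module_available(module) else "NOT found"
    print(f"{module}: {status}")
```

If the module is reported as not found, it needs to be installed into the same Perl environment that VEP runs in before the grch38 branch of the plugin will compile.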

Cool, thanks for the clarification!