Annotation files not set after fetching
Problem
When using genomepy.install_genome(..., annotation=True), the genomepy.genome.Genome attributes .annotation_bed_file and .annotation_gtf_file are set to the paths of the BED and GTF files only when the annotations are already available (i.e., they were fetched before).
However, they are set to None when the annotations are actually fetched during the call (i.e., they were not previously available). This behavior is surprising and requires clients to run further checks to determine the file paths in one scenario but not the other.
To reproduce
First call (genome & annotation not yet available)
import genomepy
genome = genomepy.install_genome("R64-1-1", provider="Ensembl", annotation=True)
# genome & annotation being downloaded
print(genome.annotation_bed_file) # None
print(genome.annotation_gtf_file) # None
print(genome.genome_file) # /path/to/genome/dir/R64-1-1/R64-1-1.fa
Second call (genome & annotation not fetched, because already available)
genome = genomepy.install_genome("R64-1-1", provider="Ensembl", annotation=True)
# genome & annotation *NOT* being downloaded
print(genome.annotation_bed_file) # /path/to/genome/dir/R64-1-1/R64-1-1.annotation.bed
print(genome.annotation_gtf_file) # /path/to/genome/dir/R64-1-1/R64-1-1.annotation.gtf
print(genome.genome_file) # /path/to/genome/dir/R64-1-1/R64-1-1.fa
Expected result
The annotation file paths should be set in the corresponding genomepy.genome.Genome properties regardless of whether the annotation is fetched or already available.
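The expectation can be written as a small check that behaves the same on the first and second call. This is only a sketch: missing_annotation_attrs is a hypothetical helper name, not part of genomepy, and it relies only on the two attribute names mentioned in this report.

```python
def missing_annotation_attrs(genome):
    """Return the names of annotation attributes that are None (or absent).

    `genome` is any object exposing the attributes named in this report;
    an empty list means both annotation paths are set, as expected.
    """
    names = ("annotation_bed_file", "annotation_gtf_file")
    return [name for name in names if getattr(genome, name, None) is None]
```

With the behavior described above, the list would contain both names after the first call and be empty after the second; the expected result is an empty list in both cases.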
Workaround
Calling the private method genomepy.genome.Genome._check_annotation_file() (available on objects returned by genomepy.install_genome()) with the argument "bed" or "gtf" for the parameter ext returns the absolute path of the BED or GTF annotation file, respectively, if available.
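For client code that has to work on affected versions, the workaround can be wrapped in a small helper. This is a sketch under assumptions, not genomepy API: annotation_path is a hypothetical name, and it assumes _check_annotation_file() accepts ext as its first positional argument, as described above. Because that method is private, it may change or disappear between releases.

```python
from typing import Optional


def annotation_path(genome, ext: str) -> Optional[str]:
    """Return the BED or GTF annotation path for a Genome-like object.

    Prefers the public attribute and falls back to the private
    _check_annotation_file() method only when the attribute is None.
    """
    if ext not in ("bed", "gtf"):
        raise ValueError("ext must be 'bed' or 'gtf'")

    # Public attribute: annotation_bed_file or annotation_gtf_file.
    path = getattr(genome, f"annotation_{ext}_file", None)
    if path is None:
        # Assumption: private method as reported for genomepy 0.15.0.
        check = getattr(genome, "_check_annotation_file", None)
        if callable(check):
            path = check(ext)
    return path
```

Called as annotation_path(genome, "bed") on the object returned by genomepy.install_genome(), this should give the same result on the first and on later calls, provided the private method behaves as described.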
System info
OS version: Ubuntu 20.04.5 LTS
Python version: 3.7.12
genomepy version: 0.15.0
Thank you for this wonderful issue report ❤️
The fix was tiny, so I pushed it directly to develop. Try it with
pip install git+https://github.com/vanheeringen-lab/genomepy.git@develop
I'll surely try it out in our next release, thanks a lot :)