vanheeringen-lab/genomepy

Annotation files not set after fetching

Closed this issue · 2 comments

Problem

When using genomepy.install_genome(..., annotation=True), the genomepy.genome.Genome attributes .annotation_bed_file and .annotation_gtf_file are set to the paths of the BED and GTF files only when the annotations are already available (i.e., they have been fetched before).

However, they are left as None when the annotations are actually fetched during the call (i.e., they were not previously available). This behavior is surprising and requires clients to run further checks to determine the file paths in one scenario but not the other.

To reproduce

First call (genome & annotation not yet available)

import genomepy

genome = genomepy.install_genome("R64-1-1", provider="Ensembl", annotation=True)
# genome & annotation being downloaded
print(genome.annotation_bed_file)  # None
print(genome.annotation_gtf_file)  # None
print(genome.genome_file)  # /path/to/genome/dir/R64-1-1/R64-1-1.fa

Second call (genome & annotation not fetched, because already available)

genome = genomepy.install_genome("R64-1-1", provider="Ensembl", annotation=True)
# genome & annotation *NOT* being downloaded
print(genome.annotation_bed_file)  # /path/to/genome/dir/R64-1-1/R64-1-1.annotation.bed
print(genome.annotation_gtf_file)  # /path/to/genome/dir/R64-1-1/R64-1-1.annotation.gtf
print(genome.genome_file)  # /path/to/genome/dir/R64-1-1/R64-1-1.fa

Expected result

The annotation file paths should be set in the corresponding genomepy.genome.Genome properties regardless of whether the annotation is fetched or already available.

Workaround

Calling the private method genomepy.genome.Genome._check_annotation_file() (available on objects returned by genomepy.install_genome()) with "bed" or "gtf" as the ext argument returns the absolute path of the BED or GTF annotation file, respectively, if available.
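A minimal sketch of how a client could wrap this workaround. The helper name resolve_annotation_path is hypothetical and not part of genomepy. It assumes only what is described above: the attributes .annotation_bed_file and .annotation_gtf_file, and a private _check_annotation_file(ext) method that returns a path or None. Because that method is private, it may change or disappear in other genomepy versions.

```python
def resolve_annotation_path(genome, ext):
    """Return the BED or GTF annotation path of a Genome-like object, or None.

    Hypothetical helper, not part of genomepy.
    """
    if ext not in ("bed", "gtf"):
        raise ValueError("ext must be 'bed' or 'gtf', got {!r}".format(ext))

    # Prefer the public attribute; it is set when the annotation
    # was already available before the install call.
    path = getattr(genome, "annotation_{}_file".format(ext), None)
    if path is not None:
        return path

    # Fall back to the private method described in this report.
    # Assumption: it takes the extension and returns a path or None.
    check = getattr(genome, "_check_annotation_file", None)
    if callable(check):
        return check(ext)
    return None
```

With the genome object from the reproduction steps, resolve_annotation_path(genome, "gtf") would be expected to return the GTF path after both the first and the second call, provided the private method behaves as described.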

System info

OS version: Ubuntu 20.04.5 LTS
Python version: 3.7.12
genomepy version: 0.15.0

Thank you for this wonderful issue report ❤️

The fix was tiny, so I pushed it directly to develop. Try it with

pip install git+https://github.com/vanheeringen-lab/genomepy.git@develop

I'll surely try it out in our next release, thanks a lot :)