bad filing on Ensembl
siebrenf opened this issue · 1 comments
Baker's yeast is one of those model organisms you would expect to be used a lot.
Unfortunately, the filing system for this genome, and for at least several other fungi, is inconsistent.
Provider: Ensembl
example genome: ASM280432v1
expected: ftp://ftp.ensemblgenomes.org/pub/fungi/release-48/fasta/saccharomyces_cerevisiae/dna/Saccharomyces_cerevisiae.dna_sm.toplevel.fa.gz
genomepy: ftp://ftp.ensemblgenomes.org/pub/fungi/release-48/fasta/saccharomyces_cerevisiae_gca_002804325/dna/Saccharomyces_cerevisiae_gca_002804325.ASM280432v1.dna_sm.toplevel.fa.gz
real: ftp://ftp.ensemblgenomes.org/pub/fungi/release-48/fasta/saccharomyces_cerevisiae/dna/Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz
The error lies in `url_name`, which is given by Ensembl. We could change this to look for a partial match instead of an exact match.
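A rough sketch of what a partial match could look like, assuming we already have the list of file names in the species' `dna/` directory. The helper name `find_toplevel_fasta` and its parameters are hypothetical and not part of genomepy. How the directory listing is obtained is left out here.

```python
import re


def find_toplevel_fasta(filenames, species, masking="dna_sm"):
    """Return file names of the form <species>[.<assembly>].<masking>.toplevel.fa.gz.

    Hypothetical helper: the assembly part is optional and may be anything,
    so the name does not have to match the expected one exactly.
    """
    pattern = re.compile(
        rf"^{re.escape(species)}\.(?:.+\.)?{re.escape(masking)}\.toplevel\.fa\.gz$",
        re.IGNORECASE,
    )
    return [name for name in filenames if pattern.match(name)]


# Example listing, modelled on the "real" file name above (other entries are made up).
listing = [
    "Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz",
    "Saccharomyces_cerevisiae.R64-1-1.dna_rm.toplevel.fa.gz",
    "Saccharomyces_cerevisiae.R64-1-1.dna.toplevel.fa.gz",
    "README",
]
matches = find_toplevel_fasta(listing, "Saccharomyces_cerevisiae")
print(matches)
```

With this pattern both the "expected" name (no assembly part) and the "real" name (with `R64-1-1`) would be accepted. If more than one file matches, the caller would still have to decide which one to use.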
Add this to a FAQ / known issues. Don't spend a lot of time on fixing this. Genomepy also won't work well for bacteria.