UCSC MySQL mirror selection
Hi,
I am having issues running genomepy install hg38 via the API within a GitLab CI/CD environment on our internal GitLab instance, although it works fine locally.
I am wondering if there is a way to specify a MySQL mirror for UCSC genome downloads, for example an option to use genome-euro-mysql.soe.ucsc.edu here:
genomepy/genomepy/providers/ucsc.py
Line 523 in d641c7a
since it is the alternative MySQL server:
https://genome.ucsc.edu/goldenPath/help/mysql.html
Or maybe the code can try the Europe server if the US server fails. Let me know what you think.
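To make the fallback idea concrete, here is a minimal sketch, not genomepy's actual code. The helper name connect_with_fallback and the injected connect callable are hypothetical; only the two host names come from the UCSC documentation linked above. The caller supplies whatever function opens the MySQL connection, and the helper tries each mirror in order.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")

# US server first, then the European mirror (host names from the UCSC MySQL help page).
UCSC_MYSQL_HOSTS = (
    "genome-mysql.soe.ucsc.edu",
    "genome-euro-mysql.soe.ucsc.edu",
)


def connect_with_fallback(
    connect: Callable[[str], T],
    hosts: Sequence[str] = UCSC_MYSQL_HOSTS,
) -> T:
    """Return the first successful connect(host); raise if every host fails.

    `connect` is a hypothetical callable supplied by the caller, e.g. a thin
    wrapper around whichever MySQL client library is in use.
    """
    errors = []
    for host in hosts:
        try:
            return connect(host)
        except Exception as exc:  # remember the failure and try the next mirror
            errors.append(f"{host}: {exc}")
    raise ConnectionError("all UCSC MySQL mirrors failed: " + "; ".join(errors))
```

Passing the connection function in keeps the sketch independent of any particular MySQL client and makes the fallback easy to test without network access.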
Hey @gokceneraslan,
this could be a nice feature, but rather situational. The difference between your local machine and the CI server is most likely the firewall. Changing the mirror probably won't help in that case :(
It could still be nice to switch to the EU mirror in case the US mirror is down. This could be done fairly easily for installing data, but for downloading the UCSC assembly metadata it would be less easy. I'll think a bit about this...
Thank you!
Another related question: is it normal to see the same error even when we call genomepy.install_genome(annotation=False)? What we want is just the genome sequence, not the annotations.
I've made it into a config option in #242. Not super elegant, but it works :)
Try it with
pip install git+https://github.com/vanheeringen-lab/genomepy.git@ucsc
genomepy config generate
and edit the ucsc_mirror option in the config file.
And to your other question: the MySQL server is always queried to populate the UCSC metadata. So if you still get the error with this PR, it's very likely a firewall issue.
genomepy has used MySQL for UCSC since version 0.10, so you could also try an older version.
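To check whether a firewall is the culprit, a quick test from inside the CI job could look like the sketch below. It only tests whether a TCP connection to the MySQL port can be opened; it does not log in or run a query. The function name can_reach is made up for illustration, and port 3306 is assumed to be the port the UCSC public MySQL servers listen on, as described on the UCSC help page.

```python
import socket


def can_reach(host: str, port: int = 3306, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only probes network reachability (e.g. firewall rules);
    it performs no MySQL handshake or authentication.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and DNS failures
        return False


# Example usage inside the CI job:
# for host in ("genome-mysql.soe.ucsc.edu", "genome-euro-mysql.soe.ucsc.edu"):
#     print(host, "reachable" if can_reach(host) else "blocked or down")
```

If both hosts come back unreachable in CI but reachable locally, outbound traffic on that port is most likely blocked, and switching mirrors will not help.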
This feature is now available in genomepy 0.16!