RNAcentral/rnacentral-sequence-search

Switch to nfs to manage fasta database



Currently the cloud-nfs branch contains the NFS server and clients and is deployed on the test environment. Now we need to:

  • deploy this branch on the production infrastructure and, if successful, merge it into master
  • develop a mechanism for uploading new FASTA files without recreating the infrastructure

About the mechanism for uploading new FASTA files, we could:

1- Use rsync on FTP and producer servers
2- Create an MD5 file from the sequence-database.fa.tar.gz archive and write a script that periodically checks its content and updates the database if necessary
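
To make option 2 more concrete, here is a minimal Python sketch of the checksum comparison. It is only an illustration, not something that exists in the repository: the function names (`md5_of_file`, `needs_update`, `record_checksum`) and the idea of keeping the last-seen checksum in a separate `.md5` file next to the archive are assumptions.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so a large archive is never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_update(archive_path, md5_path):
    """Return True when the archive's checksum differs from the recorded one."""
    md5_file = Path(md5_path)
    if not md5_file.exists():
        # No checksum recorded yet, so treat the archive as new.
        return True
    # Accept both a bare hash and the "hash  filename" format written by md5sum.
    parts = md5_file.read_text().split()
    recorded = parts[0] if parts else ""
    return recorded != md5_of_file(archive_path)


def record_checksum(archive_path, md5_path):
    """Store the archive's current checksum (hypothetical bookkeeping step)."""
    Path(md5_path).write_text(md5_of_file(archive_path) + "\n")
```

A periodic job (cron, for example) could call `needs_update(...)`, unpack the archive into the NFS share when it returns `True`, and then call `record_checksum(...)`. How the database is actually refreshed is left out here on purpose.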

I don't think we need to recreate the infrastructure; currently we just need to run an Ansible command.

Yep, this could work!

We'd also need to think about how to have separate sequence databases for the production and test environments. For example, if we are testing a new release, we don't want to roll it out accidentally in production.

We could have two FTP folders, one associated with what is now called production and the other with what is now called test, and then periodically rsync as you suggested.
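
A rough sketch of how the per-environment folders could be wired up, in Python. Everything specific here is a placeholder: the host name, the folder paths and the `build_rsync_command` helper are hypothetical, and it assumes the FTP host is reachable by rsync (over SSH or an rsync daemon), which still needs to be confirmed.

```python
# Hypothetical mapping from environment name to its source folder.
# Host and paths are placeholders, not the real FTP layout.
SOURCE_FOLDERS = {
    "production": "ftp-host.example.org:/sequence-search/production/",
    "test": "ftp-host.example.org:/sequence-search/test/",
}


def build_rsync_command(environment, destination):
    """Build the rsync argument list for one environment without running it."""
    if environment not in SOURCE_FOLDERS:
        raise ValueError(f"unknown environment: {environment!r}")
    return [
        "rsync",
        "--archive",   # preserve permissions and timestamps
        "--compress",  # compress data during transfer
        SOURCE_FOLDERS[environment],
        destination,
    ]
```

Each deployment would only ever be configured with its own environment name, so a test release placed in the test folder could not be pulled into production by accident. The returned list can be passed to `subprocess.run(cmd, check=True)` from a periodic job.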