Request details on programmatic database setup for confindr
Closed this issue · 5 comments
Hi,
I used the below command:
confindr_database_setup -s key_secret.txt -o confindr_database/
It produced databases for only three genera, as shown below:
confindr_database$ ls
Escherichia_db_cgderived.fasta Salmonella_db_cgderived.fasta gene_allele.txt rMLST_combined.fasta
Listeria_db_cgderived.fasta download_date.txt profiles.txt refseq.msh
However, I also need the db_cgderived.fasta files for the Yersinia and Campylobacter genera.
How can I obtain those programmatically?
Best Regards,
Bala
Hi Bala,
Since you have the rMLST database, you don't need the CGE-derived files. Just run ConFindr in rMLST mode (use the --rmlst flag), and it should be able to process any bacterial genus.
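As a quick check on what the setup step produced, here is a minimal sketch that summarizes a database folder. It assumes the file names from the listing above (`*_db_cgderived.fasta` and `rMLST_combined.fasta`); the function name `summarize_database` is hypothetical and not part of ConFindr.

```python
from pathlib import Path


def summarize_database(db_dir):
    """Report which genus-specific files exist and whether the rMLST file is present.

    Assumes the naming seen in the listing above; not a ConFindr API.
    """
    db_path = Path(db_dir)
    suffix = "_db_cgderived.fasta"
    # Genus name is whatever precedes the shared suffix, e.g. "Escherichia".
    genera = sorted(p.name[: -len(suffix)] for p in db_path.glob("*" + suffix))
    has_rmlst = (db_path / "rMLST_combined.fasta").is_file()
    return {"cgderived_genera": genera, "has_rmlst": has_rmlst}
```

If `has_rmlst` is true, a genus missing from `cgderived_genera` (such as Yersinia or Campylobacter) should still be usable through the --rmlst flag described above.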
A
Given that the Escherichia samples show 38310 bases examined, it looks like you're still not using --rmlst mode. Could you please include the command line call to ConFindr that you used?
The bases examined are the total number of bases present in the sequence files containing the alleles returned by the KMA screen (this can be printed to the screen using the --verbosity debug argument). This sequence file can be inspected if you use the -k argument to keep the files. It is named as follows: sample_name_alleles.fasta, e.g. FIAR-847_S5_1_trim_alleles.fasta.
If you are using CGE-derived databases, the alleles in the FASTA file should have names like b0436_1, while if you are using the rMLST database, the alleles should have names like BACT000001_10671.
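To check which database was actually used, a kept alleles file can be inspected along these lines. This is a sketch only: the function name `inspect_alleles` is hypothetical, and the rMLST name pattern (`BACT`, digits, underscore, digits) is an assumption extrapolated from the single example BACT000001_10671. Any other name, such as b0436_1, is counted as "other".

```python
import re
from collections import Counter

# Assumed pattern, based only on the example name BACT000001_10671.
RMLST_NAME = re.compile(r"^BACT\d+_\d+$")


def inspect_alleles(fasta_text):
    """Count allele names by style and sum the sequence bases in FASTA text."""
    counts = Counter()
    total_bases = 0
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The allele name is the first whitespace-delimited token of the header.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            counts["rmlst" if RMLST_NAME.match(name) else "other"] += 1
        else:
            total_bases += len(line)
    return dict(counts), total_bases
```

For a file kept with -k, something like `inspect_alleles(open("FIAR-847_S5_1_trim_alleles.fasta").read())` would apply, with the path adjusted to wherever the file was written. If the description above holds, the base total should correspond to the bases examined, and a file dominated by "other" names suggests that --rmlst mode was not in effect.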
A
I'll close this issue in 30 days if there are no further updates!
Closed due to stale issue