Formatted NR database
stephen-14 opened this issue · 2 comments
Hello everyone, sorry for the basic question, but could anyone help me get a Diamond makedb database?
I'm trying to use diamond to search against the NR database to get taxonomic information; specifically, I want TaxIDs. However, my PC's capacity is small (256GB) while the database is large, and makedb is always interrupted with "no space left on device". I've tried extending the swapfile, but it doesn't seem to work.
I use this command:
"diamond makedb --in nr.gz --db nr_diamond.dmnd --taxonmap prot.accession2taxid.gz --taxonnodes nodes.dmp --taxonnames names.dmp"
Could you share the formatted NR database, or tell me where I could download it?
Many thanks!
It's not available for download anywhere. You need a system with a larger hard drive.
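Before starting a long makedb run, it can help to check how much free space the target volume has. Note that swap extends memory rather than disk capacity, so a larger swapfile does not address a "no space left on device" error. The following is only a minimal sketch and not part of Diamond: the helper names (`free_kib`, `total_kib`, `enough_space`, `check_makedb_space`) and the safety factor are assumptions chosen for illustration, and nothing in this thread states how much space makedb actually needs for NR.

```shell
#!/bin/sh
# Sketch only: the helper names and the safety factor are hypothetical,
# not part of Diamond. Variables in check_makedb_space are global (POSIX sh).

# Print the free space, in KiB, on the filesystem that holds directory $1.
free_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Print the combined size, in KiB, of the files given as arguments.
total_kib() {
    du -ck "$@" | awk '{ last = $1 } END { print last }'
}

# Succeed if free space ($1) is at least factor ($3) times the input size ($2).
enough_space() {
    [ "$1" -ge $(( $2 * $3 )) ]
}

# Report whether directory $1 has room for factor $2 times the size of
# the remaining arguments (the input files).
check_makedb_space() {
    dir=$1; factor=$2; shift 2
    free=$(free_kib "$dir")
    need=$(total_kib "$@")
    if enough_space "$free" "$need" "$factor"; then
        echo "ok: ${free} KiB free, inputs total ${need} KiB"
    else
        echo "low: ${free} KiB free, inputs total ${need} KiB (factor ${factor})"
        return 1
    fi
}
```

For example, `check_makedb_space . 3 nr.gz prot.accession2taxid.gz nodes.dmp names.dmp` would compare the free space in the current directory against three times the combined size of those inputs. The factor of 3 is an arbitrary placeholder; because nr.gz is compressed, the real requirement may be considerably higher.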
Hi @bbuchfink, I am updating my diamond database with the current NCBI viral protein RefSeq. I have only created the database once, and I was wondering if this is the correct way to do it to get the full taxonomic lineage (including viral family). I know you added taxonomic family to the current version of Diamond; I was just wondering whether this is included in names.dmp or in other .dmp files?
Here is how I am creating my database:
diamond makedb --in /work/kvigil/diamond/ -d viralprotein.050124 --taxonmap /work/kvigil/diamond/prot.accession2taxid.FULL.gz --taxonnames /work/kvigil/diamond/names.dmp --taxonnodes /work/kvigil/diamond/nodes.dmp
I am not sure whether I need fullnamelineage.dmp. Thanks for your help, and thank you for adding the family taxonomy!