bio-miga/miga

Create database

gwL955 opened this issue · 6 comments

I have a collection of fn files for bacteria. How do I create a database from my reference genomes?

Hello @gewenlong365
Do you mean you want to create a new database containing these genomes? (E.g., to make it publicly available, or to compare genomes from other projects against this database)
Or, do you mean you want to obtain a reference database to compare your genomes against? (E.g., to classify your genomes taxonomically, or to find the closest relatives published elsewhere).

Also, do you want to use the CLI locally or do you prefer using web-based interfaces?

Best wishes,
Miguel

@lmrodriguezr Thanks for the reply.
Yes, I want to obtain a reference database to compare unknown genomes against for classification.
How can I do that?
I tried to use miga add, but it seems to have failed.
Thanks again~

@lmrodriguezr The process is as follows:

miga new -P ./te_14 -t genomes
miga add -P ./te_14/ -t genome -i assembly ../test/db/genome/*
miga classify_wf -l ./te_14/ -o out_test ./assembly.fasta

@lmrodriguezr Hello, MiGA team~
I think I've found a good enough example to do what I need:
https://miga.gitbook.io/main-1/10WEtzu06EHwzW762VwX/miga-aws-cli-projects/classify-a-genome/using-your-own-project
But when I run the command miga ls -P . -p, everything appears to be done, yet there are still .pid files:

ls daemon
d  daemon.json  maintenance  MiGA:pseudo.output  MiGA:pseudo.pid  p  status.json

miga ls -P . -p
         name  raw_reads  trimmed_reads  read_quality  trimmed_fasta  assembly  cds   essential_genes  mytaxa  mytaxa_scan  taxonomy  distances  ssu   stats
         ----  ---------  -------------  ------------  -------------  --------  ---   ---------------  ------  -----------  --------  ---------  ---   -----
P_alcaligenes  -          -              -             -              done      done  done             done    done         done      done       done  done 
P_fluorescens  -          -              -             -              done      done  done             done    done         done      done       done  done 
  P_mendocina  -          -              -             -              done      done  done             done    done         done      done       done  done 
     P_putida  -          -              -             -              done      done  done             done    done         done      done       done  done 
   P_stutzeri  -          -              -             -              done      done  done             done    done         done      done       done  done 
   P_syringae  -          -              -             -              done      done  done             done    done         done      done       done  done
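
One way to confirm that no step is still pending is to scan a table like the one above for any cell other than `done` or `-`. The following is a minimal sketch, not part of MiGA: the function name `all_steps_done` is hypothetical, and it assumes the layout shown here (a header row, a dashed separator row, then one row per dataset with the name in the first column) and knows only the two cell values visible in this output.

```shell
#!/bin/sh
# Hypothetical helper (not a MiGA command): reads a `miga ls -P . -p` style
# table on stdin and succeeds only if every status cell is "done" or "-".
# Assumption: row 1 is the header, row 2 is the dashed separator, and the
# first column of every later row is the dataset name (without spaces).
all_steps_done() {
  awk '
    NR <= 2 { next }                 # skip the header and separator rows
    NF == 0 { next }                 # skip blank lines
    {
      for (i = 2; i <= NF; i++) {
        if ($i != "done" && $i != "-") {
          # report the dataset, the column index, and the unexpected value
          printf "%s: column %d is %s\n", $1, i, $i
          pending = 1
        }
      }
    }
    END { exit (pending ? 1 : 0) }
  '
}

# Usage, assuming the current directory is a MiGA project:
#   miga ls -P . -p | all_steps_done && echo "all datasets finished"
```

Any value the helper does not recognize is reported rather than interpreted, so it errs on the side of saying "not finished yet".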

OK, sorry.
After waiting some time, it finished.
If I understand correctly, this project can't create its own reference genome library, right?
It would be great if it had that feature.
Thanks!

Hello @gewenlong365,

If you want a pre-indexed general-purpose reference database for classification, you can use TypeMat_Lite. To download TypeMat_Lite, you can use:

miga download -v

That database includes all available genome representatives for validly published species. If you prefer the GTDB taxonomy instead, you can also obtain the GTDB_Anchor_Lite with the same action:

miga download -n GTDB_Anchor_Lite -v

Or, you can see all the available databases:

miga download --list