Create database

Question

gwL955 opened this issue 2 years ago · 6 comments

I have a collection of fn files for bacteria. How do I turn my reference genomes into a database?

Answer 1 · 2023-06-15T08:36:35.000Z

Hello @gewenlong365
Do you mean you want to create a new database containing these genomes? (E.g., to make it publicly available, or to compare genomes from other projects against this database)
Or, do you mean you want to obtain a reference database to compare your genomes against? (E.g., to classify your genomes taxonomically, or to find the closest relatives published elsewhere).

Also, do you want to use the CLI locally or do you prefer using web-based interfaces?

Best wishes,
Miguel

Answer 2 · 2023-06-15T08:47:02.000Z

@lmrodriguezr Thanks for the reply.
Yes, I want to obtain a reference database to compare unknown genomes against for classification.
How can I do that?
I tried to use miga add, but it seems to have failed.
Thanks again~

Answer 3 · 2023-06-15T08:50:18.000Z

@lmrodriguezr The process is as follows:

miga new -P ./te_14 -t genomes
miga add -P ./te_14/ -t genome -i assembly ../test/db/genome/*
miga classify_wf -l ./te_14/ -o out_test ./assembly.fasta

Answer 4 · 2023-06-16T06:04:18.000Z

@lmrodriguezr Hello, MiGA team~
I think I've found a good enough example to do what I need:
https://miga.gitbook.io/main-1/10WEtzu06EHwzW762VwX/miga-aws-cli-projects/classify-a-genome/using-your-own-project
However, when I run the command miga ls -P . -p,
everything appears to be done, yet there are still .pid files:

ls daemon
d  daemon.json  maintenance  MiGA:pseudo.output  MiGA:pseudo.pid  p  status.json

miga ls -P . -p
         name  raw_reads  trimmed_reads  read_quality  trimmed_fasta  assembly  cds   essential_genes  mytaxa  mytaxa_scan  taxonomy  distances  ssu   stats
         ----  ---------  -------------  ------------  -------------  --------  ---   ---------------  ------  -----------  --------  ---------  ---   -----
P_alcaligenes  -          -              -             -              done      done  done             done    done         done      done       done  done 
P_fluorescens  -          -              -             -              done      done  done             done    done         done      done       done  done 
  P_mendocina  -          -              -             -              done      done  done             done    done         done      done       done  done 
     P_putida  -          -              -             -              done      done  done             done    done         done      done       done  done 
   P_stutzeri  -          -              -             -              done      done  done             done    done         done      done       done  done 
   P_syringae  -          -              -             -              done      done  done             done    done         done      done       done  done
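
A progress table like the one above can also be checked mechanically rather than by eye. The following is a minimal sketch, not part of MiGA: `pending_steps` is a hypothetical helper name, and it assumes that the first line is the header, the second line is the dashed rule, and that `-` marks steps that do not apply (as with the read columns here). Any other value is reported as pending.

```shell
#!/bin/sh
# pending_steps (hypothetical helper, not a MiGA command):
# read a `miga ls -P . -p` table on stdin and print one line per dataset
# that has at least one step whose value is neither "done" nor "-".
# Assumption: line 1 is the header and line 2 is the dashed separator.
pending_steps() {
  awk '
    NR == 1 { for (i = 1; i <= NF; i++) col[i] = $i; next }  # remember column names
    NR == 2 { next }                                         # skip the dashed rule
    NF == 0 { next }                                         # skip blank lines
    {
      missing = ""
      for (i = 2; i <= NF; i++) {
        if ($i != "done" && $i != "-") {
          sep = (missing == "") ? "" : ","
          missing = missing sep col[i]
        }
      }
      if (missing != "") print $1 ": " missing
    }
  '
}
```

Piping the listing into it, for example `miga ls -P . -p | pending_steps`, would print nothing for the table above, since every applicable step is already `done`.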

Answer 5 · 2023-06-16T06:30:49.000Z

OK, sorry.
After waiting some time, it finished.
If I understand correctly, this project can't create its own reference genome library, right?
It would be great if it had that feature.
Thanks!

Answer 6 · 2023-06-20T13:56:19.000Z

Hello @gewenlong365,

If you want a pre-indexed general-purpose reference database for classification, you can use TypeMat_Lite. To download TypeMat_Lite, you can use:

miga download -v

That database includes all available genome representatives for validly published species. If you prefer the GTDB taxonomy instead, you can also obtain the GTDB_Anchor_Lite with the same action:

miga download -n GTDB_Anchor_Lite -v

Or, you can see all the available databases:

miga download --list