micronota is an open-source, BSD-licensed package to annotate microbial genomes and metagenomes.
As Python 3 matures and the majority of Python packages support it, the scientific Python community favors dropping Python 2 compatibility. Thus, micronota supports only Python 3. This allows micronota to have fewer dependencies and avoids the maintenance of Python 2 legacy code.
micronota can annotate multiple features, including coding genes, prophages, CRISPRs, tRNAs, rRNAs, and other ncRNAs. It has a customizable framework to integrate additional tools and databases. Generally, annotation falls into two categories: structural and functional. Structural annotation identifies the genetic elements on the sequence, and functional annotation assigns functions to those elements.
To install the latest release of micronota:
conda install micronota
Or you can install it through pip:
pip install micronota
To install the latest development version:
pip install git+https://github.com/biocore/micronota.git
To prepare the TIGRFAM database, that is, to download its files and format them so that micronota can read them:
micronota database prepare tigrfam --cache_dir ~/database
By default, micronota reads `~/.micronota.conf`, if this config file exists, to set up the environment or tune the parameters.
For example, the default directory to store the database files is `~/micronota_db`, but you can override it to `/home/username/db` by setting this in `~/.micronota.conf`:
    [GENERAL]
    db_path = /home/username/db
micronota will look for the key `db_path` in the section `GENERAL` to update the database path.
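The lookup described above can be sketched with Python's standard `configparser` module. This is only an illustration under stated assumptions: the helper name `resolve_db_path` is hypothetical and is not part of micronota's API, and whether micronota reads the file this way internally is not confirmed here. The only details taken from this README are the `GENERAL` section, the `db_path` key, and the two default paths.

```python
import configparser
import os


def resolve_db_path(config_file="~/.micronota.conf", default="~/micronota_db"):
    """Return the database directory, honoring db_path in [GENERAL] if set.

    Hypothetical helper for illustration; not micronota's actual code.
    """
    parser = configparser.ConfigParser()
    # read() silently skips files that do not exist, which mirrors the
    # "if this config file exists" behavior described above.
    parser.read(os.path.expanduser(config_file))
    # fallback covers both a missing [GENERAL] section and a missing key.
    db_path = parser.get("GENERAL", "db_path", fallback=default)
    return os.path.expanduser(db_path)
```

Calling `resolve_db_path()` with no arguments would return the expanded form of `~/micronota_db` unless `~/.micronota.conf` sets `db_path`.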
Besides setting up the environment, you can also specify which tools to run and in what order. Here is an example:
    [GENERAL]
    # overwrite the default setting
    db_path = db

    [FEATURE]
    # run prodigal first
    prodigal = 1
    # don't run infernal
    infernal = 0

    # next to annotate CDS
    [CDS]
    # run diamond together with uniref database
    diamond = uniref
    # skip running hmmer
    hmmer = 0
The config file uses the INI-style format that is widely used across OS platforms. `0` / `1`, `no` / `yes`, `false` / `true`, and `off` / `on` can all be used to turn each tool off or on. If a tool needs a database file to run, specify the database instead of the on/off indicator.
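The on/off rule above can be sketched with `configparser`, whose `getboolean` method accepts exactly these spellings (`0`/`1`, `no`/`yes`, `false`/`true`, `off`/`on`). The function `plan_tools` below is hypothetical, and it is an assumption that micronota interprets the file in this way. Any value that is not a boolean spelling is treated as a database name, and the file order of sections and keys is preserved.

```python
import configparser

EXAMPLE = """
[GENERAL]
db_path = db

[FEATURE]
prodigal = 1
infernal = 0

[CDS]
diamond = uniref
hmmer = off
"""


def plan_tools(text):
    """Return (section, tool, database) tuples for enabled tools, in file order.

    Hypothetical illustration; not micronota's actual implementation.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    plan = []
    for section in parser.sections():
        if section == "GENERAL":  # environment settings, not tools
            continue
        for tool in parser[section]:
            try:
                enabled = parser.getboolean(section, tool)
                database = None
            except ValueError:
                # Not a boolean spelling, so treat the value as a database name.
                enabled = True
                database = parser.get(section, tool)
            if enabled:
                plan.append((section, tool, database))
    return plan


print(plan_tools(EXAMPLE))
# [('FEATURE', 'prodigal', None), ('CDS', 'diamond', 'uniref')]
```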
You can specify parameters for each individual tool. For example, if you want to run Prodigal with genetic translation table 1 instead of the default translation table, you can create a file param.cfg:
    [prodigal]
    # set translation table to 1
    -t = 1
Here, Prodigal has an option `-t` to specify the translation table, so you set `-t` to `1`. Any option of any supported tool should be configurable this way.
After creating the config file, you can run:
micronota annotate -i input.fa -o output_dir --param param.cfg
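To illustrate how a section such as `[prodigal]` in param.cfg could map onto a tool's command line, here is a minimal sketch using `configparser`. The function `tool_arguments` is hypothetical, and how micronota actually forwards these options to the external tools is an assumption. One detail worth noting is that `configparser` lowercases option names by default, so the sketch disables that to keep case-sensitive flags intact.

```python
import configparser

PARAM_CFG = """
[prodigal]
# set translation table to 1
-t = 1
"""


def tool_arguments(text, tool):
    """Flatten one tool's section into a list of command-line arguments.

    Hypothetical illustration; not micronota's actual implementation.
    """
    # interpolation=None keeps values containing '%' literal.
    parser = configparser.ConfigParser(interpolation=None)
    # Keep option names exactly as written instead of lowercasing them.
    parser.optionxform = str
    parser.read_string(text)
    if not parser.has_section(tool):
        return []
    args = []
    for option, value in parser.items(tool):
        args.extend([option, value])
    return args


print(tool_arguments(PARAM_CFG, "prodigal"))  # ['-t', '1']
```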
To check the micronota setup, you can run:
micronota info
It will print out the system info, databases available, external tools, and other configuration info.
Features | Supported | Tools |
---|---|---|
coding gene | yes | Prodigal |
tRNA | ongoing | Aragorn |
ncRNA | yes | Infernal |
CRISPR | ongoing | MinCED |
ribosomal binding sites | ongoing | RBSFinder |
prophage | ongoing | PHAST |
replication origin | todo | Ori-Finder 1 (bacteria) & Ori-Finder 2 (archaea) |
microsatellites | todo | nhmmer? |
signal peptide | ongoing | SignalP |
transmembrane proteins | ongoing | TMHMM |
Databases | Supported |
---|---|
TIGRFAM | yes |
UniRef | yes |
Rfam | ongoing |
To get help with micronota, post your question on Biostars with the `micronota` tag. The developers regularly monitor this tag.
If you’re interested in getting involved in micronota development, see CONTRIBUTING.md.
See the list of micronota’s contributors.
micronota is available under the new BSD license. See COPYING.txt for micronota’s license, and the licenses directory for the licenses of third-party software and databases that are (either partially or entirely) distributed with micronota.