A much-reduced version of the eggNOG database for testing purposes.
To reduce its size, the following procedure was used:
- Choose one of the smallest taxonomically restricted databases, thaNOG.
- Arbitrarily select an orthologous group (OG) that is represented in thaNOG. The OG 1CB2I was chosen.
- Using SQL, reduce the `event`, `member`, and `og` tables in the `eggnog.db` database to only a few entries each (see this script).
- Find a FASTA sequence represented in the OG (Nlim_1033 was chosen, with the sequence downloadable here). Shorten the sequence for querying.
- Make a DIAMOND database from the single FASTA sequence using `diamond makedb`.
- Create a custom hmmpressed database from the HMM model for the OG 1CB2I at this link
- In OG_fasta, remove most of the FASTA databases associated with OGs. Only keep 1CB2[A-Z].
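
The table reduction and the sequence shortening steps above can be sketched in Python. This is only an illustration, not the script linked in the list: the table names come from the text, but the function names, the "keep the first few rows by rowid" rule, and the default limits are assumptions. The original script may select rows differently, for example by keeping only entries tied to 1CB2I.

```python
import sqlite3

# Table names taken from the procedure above.
TABLES = ("event", "member", "og")


def reduce_tables(conn, keep=5, tables=TABLES):
    """Delete all but the first `keep` rows (ordered by rowid) from each table.

    Assumption: which rows to keep is arbitrary here.
    """
    for table in tables:
        # Table names cannot be bound as SQL parameters, so they are
        # interpolated; they come from the fixed tuple above.
        conn.execute(
            f'DELETE FROM "{table}" WHERE rowid NOT IN '
            f'(SELECT rowid FROM "{table}" ORDER BY rowid LIMIT ?)',
            (keep,),
        )
    conn.commit()
    # Deleting rows does not shrink the file on disk; VACUUM rebuilds it.
    conn.execute("VACUUM")


def row_counts(conn, tables=TABLES):
    """Return {table_name: number_of_rows} for a quick sanity check."""
    return {
        t: conn.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0]
        for t in tables
    }


def shorten_fasta(text, max_len=60):
    """Truncate the sequence of a single-record FASTA string to `max_len` residues."""
    lines = text.strip().splitlines()
    header, sequence = lines[0], "".join(lines[1:])
    return f"{header}\n{sequence[:max_len]}\n"
```

To try it on a copy of the real database, open it with `sqlite3.connect("eggnog.db")`, call `reduce_tables(conn)`, and check the result with `row_counts(conn)`.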
To use this database, either use this functionality (also here) in eggnog-mapper to specify a data directory independently of the install directory, or copy `tinyNOG/data` into `eggnog-mapper/data`. For the first option, add `--data_dir ~/path_to/tinyNOG/data` to the command-line invocation of `emapper.py`.
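
When a test harness launches emapper.py, the `--data_dir` option can be added programmatically. The sketch below only builds the argument list. The helper name `emapper_command`, the query and output names, and the `-i`/`-o` arguments are illustrative assumptions; only `--data_dir` and the example path come from the text above. The `~` is expanded explicitly because a command run from an argument list does not pass through a shell.

```python
import os
import shlex


def emapper_command(data_dir, extra_args=()):
    """Return the argument list for emapper.py with an explicit data directory."""
    return ["emapper.py", "--data_dir", os.path.expanduser(data_dir), *extra_args]


# Hypothetical query and output names, for illustration only.
cmd = emapper_command(
    "~/path_to/tinyNOG/data",
    ["-i", "query.fasta", "-o", "tinynog_test"],
)

if __name__ == "__main__":
    # Print a shell-quoted version of the command for inspection.
    print(shlex.join(cmd))
```

The list can then be passed to `subprocess.run(cmd, check=True)` on a machine where eggnog-mapper is installed.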