A novel architecture for metagenomic classification, the phylogenetic convolutional neural network, as presented in the paper Phylogenetic Convolutional Neural Networks in Metagenomics.
These instructions will get you a copy of the project up and running on your local machine.
git clone https://gitlab.fbk.eu/MPBA/phylogenetic-cnn.git
The DAP (Data Analysis Protocol) Project is included in this repo as an external reference (i.e. Git Submodule).
Therefore, the first time this repo is cloned, the Git submodule must be initialised. After the clone command, you should see a dap directory in your cloned copy, which is empty.
Thus:
cd dap
git submodule init
git submodule update
Alternatively, the clone and the submodule initialisation can be done in a single command:
git clone --recursive https://gitlab.fbk.eu/MPBA/phylogenetic-cnn.git
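Whichever route is used, a quick way to confirm that the submodule was fetched is to check that the dap directory is no longer empty. The following is a minimal sketch of such a check; the helper name submodule_is_populated is hypothetical and not part of the repository, and the script is assumed to be run from the root of the cloned copy.

```python
from pathlib import Path


def submodule_is_populated(repo_root: str, name: str = "dap") -> bool:
    """Return True if the submodule directory exists and has at least one entry.

    Hypothetical helper for illustration only; not part of the repository.
    """
    submodule_dir = Path(repo_root) / name
    # An uninitialised submodule shows up as an existing but empty directory.
    return submodule_dir.is_dir() and any(submodule_dir.iterdir())


if __name__ == "__main__":
    # Assumption: the current working directory is the root of the clone.
    if submodule_is_populated("."):
        print("dap submodule is populated")
    else:
        print("dap is missing or empty: run 'git submodule init' "
              "and 'git submodule update'")
```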
A complete conda environment is provided as a .yml file in the folder envs.
Additionally, the mlpy library must be installed. Instructions for installing the MLPY 3.5.0 Python package are given in the README.md file in the envs/deps folder.
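Since mlpy is installed separately from the conda environment, it can be worth checking that it is importable before launching anything. The snippet below is a small sketch using only the standard library; the function name missing_packages is an assumption made for this example, and it only checks that the package can be found, not that its version is 3.5.0.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import.

    Hypothetical helper for illustration only; not part of the repository.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "mlpy" is the import name of the MLPY package mentioned above.
    missing = missing_packages(["mlpy"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("mlpy is importable")
```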
The algorithm to use (SVM, random forest, MLP, Ph-CNN) is selected simply by choosing which runner to execute:
- multilayerperceptron_runner.py: Multi-Layer Perceptron
- phylocnn_runner.py: Phylogenetic Convolutional Neural Network
- randomforest_runner.py: Random forest
- svm_runner.py: Support Vector Machine
- transfer_learning_runner.py: Phylogenetic Convolutional Neural Network used for transfer learning. It assumes that pre-trained network weights are provided (see the weights folder).
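The correspondence between algorithms and runner scripts listed above can be sketched as a small lookup. This is only an illustration: the short keys ("mlp", "phcnn", and so on) and the function runner_command are hypothetical, and the sketch assumes that each runner is launched without command-line arguments, which the repository may or may not require.

```python
import sys

# Runner scripts as listed above. The short keys are hypothetical labels
# chosen for this example; they are not defined by the repository.
RUNNERS = {
    "mlp": "multilayerperceptron_runner.py",
    "phcnn": "phylocnn_runner.py",
    "rf": "randomforest_runner.py",
    "svm": "svm_runner.py",
    "transfer": "transfer_learning_runner.py",
}


def runner_command(algorithm: str) -> list:
    """Build the command line that would launch the runner for `algorithm`."""
    try:
        script = RUNNERS[algorithm]
    except KeyError:
        raise ValueError(
            f"unknown algorithm {algorithm!r}; choose one of {sorted(RUNNERS)}"
        ) from None
    # Assumption: the runner takes no command-line arguments.
    return [sys.executable, script]


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(" ".join(runner_command("svm")))
```

The returned list can be passed to subprocess.run from the root of the repository if launching from Python is preferred over calling the script directly.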
To configure how the program runs, modify the following files:
- settings.py - where you choose which type of data to load, where the data are located, where to write the output, etc.
- dap/settings.py - where you set how the DAP is supposed to operate. More information is available in the README in the dap folder and in the paper.
- dap/deep_learning_settings.py - where all the settings specific to deep learning are set.
- PhyloConv1D: In this notebook we report code examples and explanations on how to use the new PhyloConv1D Keras layer, using experimental data as examples.
- Embedding - ICDf data: In this notebook we report results and plots of the embeddings of the Phylo-Convolutional layers calculated on the ICDf disease data included in the IBD dataset, as reported in the paper.
This project is licensed under the GNU General Public License v3.0 (GNU GPLv3); see the LICENSE.txt file for details.