Bayesian-Classifier

An application of Naive Bayes to predict whether a gene is essential for a bacterium's life.

Primary language: Python. License: MIT.

Genes Classification

Introduction

This project uses the Bernoulli Naive Bayes algorithm to predict whether a gene is essential for a bacterium's life.
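
For reference, the standard Bernoulli Naive Bayes model can be written as follows. The symbols are notation introduced here, not taken from the project: $x_i \in \{0, 1\}$ is the $i$-th discretized feature of a gene, $y$ is its class (essential or not), and $p_{iy} = P(x_i = 1 \mid y)$.

```latex
P(x \mid y) = \prod_{i=1}^{n} p_{iy}^{x_i} (1 - p_{iy})^{1 - x_i},
\qquad
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} p_{iy}^{x_i} (1 - p_{iy})^{1 - x_i}
```

The product form is the "naive" assumption: features are treated as conditionally independent given the class.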

Run

  1. Download the repository

  2. Run main.py

  3. The script prints the classification results for the S.Mikatae dataset and plots the ROC curve of its predictions on the S.Cerevisiae dataset.

Implementation

The datasets are read with Pandas, converted to Numpy arrays, and then discretized. Some features in S.Mikatae contained unknown values, which were replaced with '0'.
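
The preprocessing described above might look roughly like the sketch below. It is not the code in main.py: the column names, the "?" marker for unknown values, the inline toy data, and the choice to discretize by thresholding at 0.5 are all assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical toy data standing in for a real dataset file.
# "?" is an assumed marker for unknown values.
RAW_CSV = """feature_a,feature_b,essential
0.9,?,1
0.1,0.4,0
0.7,0.8,1
?,0.2,0
"""


def load_and_discretize(source, threshold=0.5):
    """Read a CSV, replace unknown values with 0, and binarize the features."""
    # Read with pandas, treating the assumed "?" marker as missing.
    frame = pd.read_csv(source, na_values="?")
    # Replace unknown values with 0.
    frame = frame.fillna(0)
    # Split off the label column (hypothetical name) and convert to NumPy arrays.
    labels = frame.pop("essential").values.astype(int)
    features = frame.values.astype(float)
    # Discretize: 1 if the value exceeds the threshold, else 0 (an assumed rule).
    binary_features = (features > threshold).astype(int)
    return binary_features, labels


X, y = load_and_discretize(io.StringIO(RAW_CSV))
print(X)
print(y)
```

Binary features are what a Bernoulli model expects, which is why the sketch thresholds rather than bins into several levels.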

10-fold cross-validation was used to evaluate the classifier's accuracy.
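
A minimal sketch of 10-fold cross-validation with scikit-learn's BernoulliNB is shown below. The synthetic data, the stratified and shuffled folds, and the fixed random seed are assumptions for illustration; the project's actual fold construction may differ.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import BernoulliNB

# Synthetic binary data (an assumption, not the project's datasets):
# 200 samples, 5 binary features, feature 0 agrees with the label ~90% of the time.
rng = np.random.RandomState(0)
y = rng.randint(0, 2, size=200)
X = rng.randint(0, 2, size=(200, 5))
X[:, 0] = np.where(rng.rand(200) < 0.9, y, 1 - y)

# Ten folds; stratification keeps the class balance similar in every fold.
folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(BernoulliNB(), X, y, cv=folds, scoring="accuracy")

print("accuracy per fold:", np.round(scores, 3))
print("mean accuracy: %.3f" % scores.mean())
```

Reporting the mean over the ten folds gives a less optimistic estimate than scoring the classifier on the data it was trained on.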

Results

Both the classification results and the computed ROC curve appear consistent with the reference they were compared against.

[Figure: ROC curve]
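
For readers who want to reproduce a curve of this kind, the sketch below computes ROC points and the area under the curve with scikit-learn. It assumes the classifier is fitted on one dataset and scored on another; the synthetic data and the helper `make_toy_data` are hypothetical and stand in for the real datasets.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.naive_bayes import BernoulliNB

rng = np.random.RandomState(0)


def make_toy_data(n_samples):
    """Hypothetical helper: binary features where feature 0 tracks the label."""
    y = rng.randint(0, 2, size=n_samples)
    X = rng.randint(0, 2, size=(n_samples, 5))
    X[:, 0] = np.where(rng.rand(n_samples) < 0.9, y, 1 - y)
    return X, y


# One toy set to fit on, another to evaluate on (an assumed setup).
X_train, y_train = make_toy_data(300)
X_test, y_test = make_toy_data(200)

model = BernoulliNB().fit(X_train, y_train)
# Probability of the positive class (label 1) for each test sample.
scores = model.predict_proba(X_test)[:, 1]

# False positive rate and true positive rate at each score threshold.
fpr, tpr, thresholds = roc_curve(y_test, scores)
roc_auc = auc(fpr, tpr)
print("AUC: %.3f" % roc_auc)
```

Plotting `fpr` against `tpr` with Matplotlib gives the curve itself; the diagonal from (0, 0) to (1, 1) corresponds to random guessing.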

Requirements

Software                         Version              Required
Python                           >= 3                 Yes
Numpy (Python package)           Tested on v1.13.3    Yes
Scikit-learn (Python package)    Tested on v0.19.1    Yes
Pandas (Python package)          Tested on v0.21.1    Yes
Matplotlib                       Tested on v2.1.1     Yes