Single Sequence based Genome Level Unsupervised Genomic Island Prediction Algorithm

This repository contains the original implementation of SSG-LUGIA, an unsupervised-learning-based tool for predicting genomic islands.

Project Website

SSG-LUGIA Project Website



The code for SSG-LUGIA is written in Python and can be found here


Note: The implementation of EllipticEnvelope has changed in recent versions of scikit-learn, so please install the pinned versions listed below to obtain reproducible results.

  • numpy==1.17.0
  • biopython==1.70
  • tqdm==4.19.5
  • scikit-learn==0.19.1
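
Since reproducibility depends on these exact pins, it can help to compare the active environment against them before running the pipeline. The following is a minimal sketch and not part of the repository: the names `PINNED`, `installed_version`, and `find_mismatches` are hypothetical, and the lookup tries the standard-library `importlib.metadata` first, falling back to `pkg_resources` on older interpreters.

```python
# Pinned versions copied from the requirements list above.
PINNED = {
    "numpy": "1.17.0",
    "biopython": "1.70",
    "tqdm": "4.19.5",
    "scikit-learn": "0.19.1",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        from importlib import metadata  # available from Python 3.8
    except ImportError:
        metadata = None
    if metadata is not None:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            return None
    import pkg_resources  # fallback for older interpreters (part of setuptools)
    try:
        return pkg_resources.get_distribution(name).version
    except pkg_resources.DistributionNotFound:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose version differs from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means the environment matches the pins. The `lookup` argument exists only so the comparison can be exercised without touching the real environment.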


  1. Clone this repository

$ git clone

  2. Install the requirements

$ pip3 install -r requirements.txt

  3. Navigate to the /codes directory

  4. Launch the Python interpreter

$ python3

  5. Import the SSG-LUGIA pipeline

from main import SSG_LUGIA

  6. Execute it with a genome sequence FASTA file and one of the standard model names: SSG-LUGIA-F, SSG-LUGIA-R, or SSG-LUGIA-P

  7. Alternatively, omit the model name and set the parameters interactively

  8. Alternatively, pass a custom model as a dictionary

  9. Alternatively, create a model to suit your requirements, save it as a JSON file, and pass the path to that file

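The invocation options listed above can be sketched as follows. This is an illustration under stated assumptions rather than the repository's documented API: the keyword names `sequence_fasta_file_path` and `model_name`, the idea that one argument accepts a standard name, a dictionary, or a JSON path, and the parameter `window_size` are all hypothetical placeholders. Check the `main` module and SSG-LUGIA Model Parameters for the real names. The helpers around the call use only the standard library.

```python
import json
import os
import tempfile

# Standard model names listed in the steps above.
STANDARD_MODELS = ("SSG-LUGIA-F", "SSG-LUGIA-R", "SSG-LUGIA-P")


def check_model_name(name):
    """Return the name unchanged if it is a standard model, else raise ValueError."""
    if name not in STANDARD_MODELS:
        raise ValueError(f"unknown model {name!r}; expected one of {STANDARD_MODELS}")
    return name


def save_model(model, path):
    """Write a custom model dictionary to a JSON file and return the path."""
    with open(path, "w") as handle:
        json.dump(model, handle, indent=2)
    return path


def load_model(path):
    """Read a model dictionary back from a JSON file."""
    with open(path) as handle:
        return json.load(handle)


def run(fasta_path, model):
    """Call the pipeline with a standard name, a dictionary, or a JSON path.

    Assumes the working directory is /codes and that SSG_LUGIA accepts these
    keyword names; both are assumptions, not confirmed by this README.
    """
    from main import SSG_LUGIA  # imported lazily: only available inside /codes
    return SSG_LUGIA(sequence_fasta_file_path=fasta_path, model_name=model)


if __name__ == "__main__":
    custom = {"window_size": 10000}  # hypothetical parameter, for illustration only
    path = save_model(custom, os.path.join(tempfile.mkdtemp(), "custom_model.json"))
    print(load_model(path) == custom)
```

From the /codes directory, a call such as `run("genome.fasta", check_model_name("SSG-LUGIA-F"))` would then correspond to the standard-model step, with `genome.fasta` standing in for a real FASTA file.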

Model Parameters

SSG-LUGIA combines several sequence-based features to infer genomic islands (GIs) using an unsupervised anomaly detection pipeline. The model parameters are described in SSG-LUGIA Model Parameters. Users can build custom model variants by changing these parameters and can save a model as JSON for future use.
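
To make the idea of window-level features plus anomaly detection concrete, here is a deliberately simplified sketch. It is not SSG-LUGIA's method: it uses a single stand-in feature (GC content per window) and a plain z-score threshold in place of the tool's actual feature set and detector. The function names, window size, and threshold are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def gc_content(window):
    """Fraction of G or C bases in a window (case-insensitive)."""
    window = window.upper()
    return (window.count("G") + window.count("C")) / len(window)


def windowed_gc(sequence, window_size, step):
    """GC content of each full-length window, as (start, value) pairs."""
    return [
        (start, gc_content(sequence[start:start + window_size]))
        for start in range(0, len(sequence) - window_size + 1, step)
    ]


def flag_outliers(values, threshold=2.0):
    """Indices whose z-score magnitude exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all windows identical: nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


if __name__ == "__main__":
    # Toy genome: AT-rich background with one GC-rich insert.
    genome = "AT" * 500 + "GC" * 50 + "AT" * 500
    profile = windowed_gc(genome, window_size=100, step=100)
    hits = flag_outliers([gc for _, gc in profile])
    print([profile[i][0] for i in hits])  # start positions of flagged windows
```

In this toy input only the window covering the GC-rich insert deviates from the background, so it is the one flagged. Real genomes call for richer features and a more robust detector, which is what the model parameters control.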

Citation Request

If you use SSG-LUGIA in your project, please cite the following paper

  title={SSG-LUGIA: Single Sequence based Genome Level Unsupervised Genomic Island Prediction Algorithm},
  author={Ibtehaz, Nabil and Ahmed, Ishtiaque and Ahmed, Md Sabbir and Rahman, M Sohel and Azad, Rajeev K and Bayzid, Md Shamsuzzoha},
  journal={Briefings in Bioinformatics},