The Prokaryotic Genomics and Comparative Genomics Analysis Pipeline

PGCGAP - the Prokaryotic Genomics and Comparative Genomics Analysis Pipeline

	  ____       ____      ____     ____       _        ____    
	U|  _"\ u U /"___|u U /"___| U /"___|u U  /"\  u  U|  _"\ u 
	\| |_) |/ \| |  _ / \| | u   \| |  _ /  \/ _ \/   \| |_) |/ 
	 |  __/    | |_| |   | |/__   | |_| |   / ___ \    |  __/   
	 |_|        \____|    \____|   \____|  /_/   \_\   |_|      
	 ||>>_      _)(|_    _// \\    _)(|_    \\    >>   ||>>_    
	(__)__)    (__)__)  (__)(__)  (__)__)  (__)  (__) (__)__)   

Introduction

PGCGAP is a pipeline for prokaryotic comparative genomics analysis. It can take paired-end reads as input. In addition to genome assembly, gene prediction and annotation, it can produce common comparative genomics results with a single command line: phylogenetic trees of single-copy core proteins and core SNPs, pan-genome analysis, whole-genome Average Nucleotide Identity (ANI), orthogroups and orthologs, COG annotations, substitutions (SNPs) and insertions/deletions (indels), and mining of antimicrobial resistance and virulence genes.

Installation

The software has been tested successfully on Windows WSL, Linux x64 and macOS. Because it relies on a large number of other programs, installation with Bioconda is recommended.
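
Bioconda packages are resolved from the `bioconda` and `conda-forge` channels. If these channels have not been configured on the machine yet, a one-time setup along the following lines is usually needed. This is a sketch of the general Bioconda channel setup, not a PGCGAP-specific requirement, and the function name `setup_bioconda_channels` is hypothetical.

```shell
# Hypothetical helper: register the channels Bioconda packages are resolved from.
# Each --add puts the channel at the top of the list, so conda-forge
# ends up with higher priority than bioconda.
setup_bioconda_channels() {
    conda config --add channels bioconda &&
    conda config --add channels conda-forge &&
    conda config --set channel_priority strict
}

# Call it once before creating the environment:
#   setup_bioconda_channels
```

The settings are written to the user's conda configuration, so they do not need to be repeated for later installations.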

Step1: Install PGCGAP

$conda create -n pgcgap python=3
$conda activate pgcgap
$conda install pgcgap
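
To confirm that the installation put the `pgcgap` executable on the PATH of the activated environment, a generic check such as the following can be used. `check_tool` is a hypothetical helper written for this example; it is not part of PGCGAP.

```shell
# Hypothetical helper: report whether a command is reachable on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found on PATH"
        return 1
    fi
}

# Run this after "conda activate pgcgap".
check_tool pgcgap || echo "Activate the pgcgap environment and retry."
```

The same helper can be pointed at any other program the pipeline is expected to call.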

Notice: What can be done when the installation is slow? As more packages are added to conda and its index files grow, the solver has to search an ever larger space to find versions that satisfy every dependency in the environment, so the "Solving environment" step becomes slower and sometimes fails altogether. Rather than just waiting, try one of the following methods.

  • Method 1: use mamba to speed up the slow "Solving environment" step of conda.

      $conda activate pgcgap
      $conda install mamba -c conda-forge
      $mamba install pgcgap
      
  • Method 2: use the environment file we provide to get around the slow "Solving environment" step of conda. Run the following commands to download the latest environment file and install PGCGAP:

      # download pgcgap_latest_env.yml (use the raw file URL; the /blob/ page URL returns an HTML page)
      $wget https://raw.githubusercontent.com/liaochenlanruo/pgcgap/master/conda/pgcgap_latest_env.yml
      
      # create a conda environment named pgcgap and install the latest version of PGCGAP
      $conda env create -f pgcgap_latest_env.yml
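
When a file is fetched from GitHub through a page URL (one containing `/blob/`), the download is the HTML page that displays the file, not the file itself, and `conda env create` cannot parse it. The check below is a minimal sketch for catching that before the environment is created. `looks_like_html` is a hypothetical helper, and searching for HTML markers is a heuristic rather than a full validation of the YAML.

```shell
# Hypothetical helper: succeed if the file contains HTML page markers,
# which a conda environment file should never contain.
looks_like_html() {
    grep -iE '<!DOCTYPE html|<html' "$1" >/dev/null
}

env_file=pgcgap_latest_env.yml
if [ ! -f "$env_file" ]; then
    echo "$env_file not found; download it first."
elif looks_like_html "$env_file"; then
    echo "$env_file is an HTML page; download the raw file instead."
else
    echo "$env_file looks usable; run: conda env create -f $env_file"
fi
```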
      

Step2: Set up the COG database (run this once after the first installation of PGCGAP)

$conda activate pgcgap
$pgcgap --setup-COGdb
$conda deactivate
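
In scripts, `conda activate` may be unavailable because the shell has not been initialised for conda. In that case `conda run` can execute a command inside a named environment. The sketch below is an assumed equivalent of the three commands above, using the environment name `pgcgap` from Step1; `setup_cogdb` is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: run the COG database setup inside the pgcgap
# environment without activating it in the current shell.
# The first argument optionally overrides the environment name.
setup_cogdb() {
    conda run -n "${1:-pgcgap}" pgcgap --setup-COGdb
}

# Usage (the environment name defaults to pgcgap):
#   setup_cogdb
#   setup_cogdb my_env
```

Depending on the conda version, `conda run` may hold back the command's output until it finishes; recent versions accept `--no-capture-output` to stream it instead.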

Users who have Docker installed can alternatively pull the PGCGAP container image.

$docker pull quay.io/biocontainers/pgcgap:<tag>

(see pgcgap/tags for valid values for <tag>)
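
Once the image has been pulled, the pipeline can be run in a container with the working directory mounted, so that inputs and outputs stay on the host. The following is a sketch under assumptions: the mount point `/data` and the wrapper name `pgcgap_docker` are hypothetical, the `pgcgap` executable is assumed to be on the image's PATH, and `PGCGAP_TAG` must be set to a valid tag by the user.

```shell
# Hypothetical wrapper: run pgcgap from the BioContainers image with the
# current directory mounted at /data (an arbitrary mount point).
# PGCGAP_TAG must hold a valid image tag; no default is assumed here.
pgcgap_docker() {
    if [ -z "${PGCGAP_TAG:-}" ]; then
        echo "Set PGCGAP_TAG to a valid image tag first." >&2
        return 1
    fi
    docker run --rm -v "$PWD":/data -w /data \
        "quay.io/biocontainers/pgcgap:${PGCGAP_TAG}" pgcgap "$@"
}

# Usage:
#   PGCGAP_TAG=<tag> pgcgap_docker --help
```

Because the container is started with `--rm`, it is removed when the command exits; only files written under the mounted directory persist.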

Required dependencies

License

PGCGAP is free software, licensed under GPLv3.

Feedback and Issues

Please report any issues to the issues page or email us at liaochenlanruo@webmail.hzau.edu.cn.

Citation

If you use this software, please cite: Liu H, Xin B, Zheng J, Zhong H, Yu Y, Peng D, Sun M. Build a bioinformatics analysis platform and apply it to routine analysis of microbial genomics and comparative genomics. Protocol Exchange, 2021. DOI: 10.21203/rs.2.21224/v5

Usage

For more detailed information, please visit the PGCGAP webpage and the wiki.