CandiHap: a haplotype analysis toolkit for natural variation study.

CandiHap is user-friendly local software that rapidly preselects candidate causal SNPs from Sanger or next-generation sequencing data and reports the results as tables and vector graphics within a minute. Investigators can use CandiHap to specify a gene or linkage sites based on GWAS results and to explore favourable haplotypes of candidate genes for target traits. CandiHap runs on Windows, Mac OS X, and Linux, through either a graphical user interface or the command line, and can be applied to any plant, animal, or microbial species. CandiHap is publicly available as open-source software at https://github.com/xukaili/CandiHap or https://bigd.big.ac.cn/biocode/tools/BT007080. CandiHap supports the following analyses:

    1). Convert the VCF file to the hapmap format for CandiHap (vcf2hmp);
    2). Haplotype analysis for a gene (CandiHap);
    3). Haplotype analysis for all genes in the LD regions of a significant SNP one by one (GWAS_LD2haplotypes);
    4). Haplotype analysis for Sanger sequencing data of population variation (sanger_CandiHap.sh).
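
The idea behind step 2 (haplotype analysis for a gene) can be illustrated with a minimal Python sketch. This is not CandiHap's code and does not reproduce its input format or statistics: the function name `group_haplotypes`, the sample names, the genotype calls, and the phenotype values are all made up for illustration. The sketch only groups samples that share the same genotype calls across a gene's SNPs and summarizes the phenotype of each group; any test for significant differences between haplotypes is left out.

```python
from collections import defaultdict
from statistics import mean


def group_haplotypes(genotypes, phenotypes):
    """Group samples by their combined genotype calls across a gene's SNPs.

    genotypes:  sample name -> tuple of genotype calls, one per SNP, in order
    phenotypes: sample name -> numeric trait value
    Returns a list of (haplotype, sample count, mean phenotype),
    sorted by descending sample count, then by haplotype string.
    """
    groups = defaultdict(list)
    for sample, calls in genotypes.items():
        if sample not in phenotypes:
            continue  # skip samples without a phenotype record
        groups["|".join(calls)].append(sample)

    summary = []
    for hap, samples in groups.items():
        values = [phenotypes[s] for s in samples]
        summary.append((hap, len(samples), mean(values)))
    summary.sort(key=lambda row: (-row[1], row[0]))
    return summary


# Hypothetical data: five samples, three SNPs in one gene.
genotypes = {
    "S1": ("AA", "GG", "TT"),
    "S2": ("AA", "GG", "TT"),
    "S3": ("AA", "CC", "TT"),
    "S4": ("GG", "CC", "TT"),
    "S5": ("AA", "CC", "TT"),
}
phenotypes = {"S1": 10.0, "S2": 12.0, "S3": 20.0, "S4": 30.0, "S5": 22.0}

summary = group_haplotypes(genotypes, phenotypes)
for hap, count, avg in summary:
    print(f"{hap}\tn={count}\tmean={avg:.1f}")
```

With the toy data above, three haplotypes are found, two of them carried by two samples each. Real data would also need handling of missing and heterozygous calls, which this sketch ignores.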

A new CandiHap V2 R package is also publicly available at https://github.com/guokai8/CandiHap

Download:

Download all files from BioCode: https://ngdc.cncb.ac.cn/biocode/tools/BT007080

License

Academic users may download and use the application free of charge according to the accompanying license.
Commercial users must obtain a commercial license from Xukai Li.
If you have used the program to obtain results, please cite the following paper:

Xukai Li☯* (李旭凯), Zhiyong Shi☯ (石志勇), Jianhua Gao (高建华), Xingchun Wang (王兴春), Kai Guo* (郭凯). CandiHap: a haplotype analysis toolkit for natural variation study. Molecular Breeding, 2023, 43:21. doi: https://doi.org/10.1007/s11032-023-01366-4
(☯ Equal contributors; * Correspondence)


Dependencies

Perl 5, R ≥ 3.2 (with the ggplot2, agricolae, pegas and sangerseqR packages), and Electron.
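
A quick pre-flight check for the dependencies listed above could look like the following Python sketch. It is not shipped with CandiHap. It assumes that the `perl` and `Rscript` executables are on the `PATH`, it does not verify versions (for example R ≥ 3.2), and it does not check Electron because how Electron is installed is not described here. The function names are illustrative only.

```python
import shutil
import subprocess

# R packages named in the dependency list above.
R_PACKAGES = ["ggplot2", "agricolae", "pegas", "sangerseqR"]


def r_package_installed(pkg):
    """Return True if Rscript is on PATH and R can load the package namespace."""
    rscript = shutil.which("Rscript")
    if rscript is None:
        return False
    # requireNamespace() returns TRUE/FALSE without attaching the package.
    expr = f'cat(requireNamespace("{pkg}", quietly = TRUE))'
    try:
        result = subprocess.run(
            [rscript, "-e", expr],
            capture_output=True, text=True, timeout=60,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.stdout.strip().endswith("TRUE")


def check_dependencies():
    """Map each dependency name to True (found) or False (not found)."""
    status = {
        "perl": shutil.which("perl") is not None,
        "Rscript": shutil.which("Rscript") is not None,
    }
    for pkg in R_PACKAGES:
        status[f"R:{pkg}"] = r_package_installed(pkg)
    return status


status = check_dependencies()
for name, found in status.items():
    print(f"{name:15s} {'ok' if found else 'MISSING'}")
```

Any entry reported as missing would need to be installed before running the corresponding analysis.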

Figures

Fig. 1 | Overview of the CandiHap process. a, A GWAS result. b, General scheme of the process. c, Histogram of the phenotype. d, Haplotype statistics; haplotypes with significant differences are highlighted by colored boxes. e, Gene structure and SNPs of a critical gene. f, Boxplot of the haplotypes of a critical gene.

Fig. 2 | Haplotype analysis of the ARE1 gene in rice compared with the results of Wang et al. 2018, Nat. Commun. 9, 735. a, Gene structure and SNPs of ARE1. b, Major haplotypes of SNPs in the ARE1 coding region of 2747 rice varieties. c, Haplotype results for the ARE1 coding region of 3023 rice varieties using CandiHap (SNP data were downloaded from RFGB). Major SNP haplotypes and causal variations in the encoded amino acid residues are shown. The five additional SNPs (highlighted by blue boxes) arise because our study used 276 more rice varieties; two errors are highlighted by red boxes.

Contact information

CandiHap will be updated regularly and extended with more functions and more user-friendly options.
For any questions, please contact xukai_li@sxau.edu.cn or xukai_li@qq.com