ABC for polyploids

ABC_WGD

Install

chmod +x install.sh  
  
./install.sh  

Model

ABC_WGD investigates various models of speciation between a diploid and a tetraploid species.
Species 1 is considered diploid and species 2 tetraploid.

![model]

Prior distributions are generated by priorgen and fed to the coalescent simulator msnsam. Priorgenwgd is a binary compiled from C++ code. It requires a bpfile and an argfile to work.
Values for all parameters are written to the file priorfile.txt, one line per multilocus simulation.

Summary statistics

Summary statistics are computed directly from the msnsam output. For each locus, mscalc computes:

| Statistic | Description |
|---|---|
| bialsites | number of SNPs in the alignment |
| sf XY | number of fixed differences between species X and Y / locus length |
| sx X | number of exclusively polymorphic positions in species X / locus length |
| ss XY | number of shared biallelic positions between species X and Y / locus length |
| pi X | Tajima's Theta within species X |
| theta X | Watterson's Theta within species X |
| pearson_r_pi XY | Pearson correlation coefficient for pi over orthologs between X and Y |
| pearson_r_theta XY | Pearson correlation coefficient for theta over orthologs between X and Y |
| Dtaj X | Tajima's D for species X |
| div XY | raw divergence Dxy measured between X and Y |
| netdiv XY | net divergence Da measured between X and Y |
| minDiv XY | smallest divergence measured between one individual from X and one from Y |
| maxDiv XY | highest divergence measured between one individual from X and one from Y |
| Gmin XY | minimum divergence between one sequence from X and one from Y (minDiv XY) divided by the average divergence (div XY) |
| FST XY | FST between X and Y, computed as 1 - (pi_X + pi_Y) / (2 * pi_XY) |
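To make a few of these definitions concrete, here is a minimal Python sketch that computes some of them for a single locus. It is not the mscalc code: the function names, the `sxX`/`sxY` keys and the toy haplotypes are hypothetical, haplotypes are assumed to be ms-style strings of `0`/`1` covering only the SNP columns, and all values are assumed to be expressed per site.

```python
from itertools import combinations, product


def pairwise_diffs(a, b):
    """Count positions at which two equal-length haplotypes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def pi_within(haps, length):
    """Mean number of pairwise differences within a sample, per site."""
    pairs = list(combinations(haps, 2))
    return sum(pairwise_diffs(a, b) for a, b in pairs) / len(pairs) / length


def theta_watterson(haps, length):
    """Segregating sites divided by the harmonic number a_n, per site."""
    n = len(haps)
    seg_sites = sum(1 for column in zip(*haps) if len(set(column)) > 1)
    a_n = sum(1.0 / i for i in range(1, n))
    return seg_sites / a_n / length


def site_classes(haps_x, haps_y, length):
    """Classify each SNP column as fixed, exclusive to X or Y, or shared."""
    counts = {"sf": 0, "sxX": 0, "sxY": 0, "ss": 0}
    for col_x, col_y in zip(zip(*haps_x), zip(*haps_y)):
        poly_x = len(set(col_x)) > 1
        poly_y = len(set(col_y)) > 1
        if poly_x and poly_y:
            counts["ss"] += 1
        elif poly_x:
            counts["sxX"] += 1
        elif poly_y:
            counts["sxY"] += 1
        elif col_x[0] != col_y[0]:
            counts["sf"] += 1
    return {name: count / length for name, count in counts.items()}


def divergence_stats(haps_x, haps_y, length):
    """Raw, net, minimum and maximum divergence between X and Y, plus Gmin."""
    dists = [pairwise_diffs(a, b) / length for a, b in product(haps_x, haps_y)]
    div = sum(dists) / len(dists)
    mean_pi = (pi_within(haps_x, length) + pi_within(haps_y, length)) / 2
    return {
        "div": div,
        "netdiv": div - mean_pi,
        "minDiv": min(dists),
        "maxDiv": max(dists),
        # Gmin is undefined when the two samples are identical
        "Gmin": min(dists) / div if div > 0 else float("nan"),
    }


# Toy data (hypothetical): 3 haplotypes per species, 4 SNPs, locus of 10 sites
species_x = ["1100", "1010", "1000"]
species_y = ["0010", "0001", "0000"]
locus_length = 10

if __name__ == "__main__":
    print(site_classes(species_x, species_y, locus_length))
    print(pi_within(species_x, locus_length))
    print(theta_watterson(species_x, locus_length))
    print(divergence_stats(species_x, species_y, locus_length))
```

In the toy data, each of the four SNP columns falls into a different class (fixed, exclusive to X, shared, exclusive to Y), which makes the classification easy to check by hand.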

For every multilocus simulation, an array containing the average of each statistic over loci and its standard deviation is written to the file ABCstat.txt.
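The averaging step can be sketched as follows. This is only an illustration: the function name `summarise_loci`, the `_avg`/`_std` suffixes and the use of the population standard deviation are assumptions, not details confirmed by the ABCstat.txt format.

```python
from statistics import mean, pstdev


def summarise_loci(per_locus):
    """Collapse per-locus statistics into one average and one standard deviation each.

    per_locus: list of dicts mapping a statistic name to its value at one locus.
    """
    summary = {}
    for name in per_locus[0]:
        values = [locus[name] for locus in per_locus]
        summary[name + "_avg"] = mean(values)
        # Population standard deviation; the actual convention is an assumption
        summary[name + "_std"] = pstdev(values)
    return summary


# Two hypothetical loci with a single statistic each
example = summarise_loci([{"pi_X": 0.01}, {"pi_X": 0.03}])
```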

Usage

To run the simulations, simply use the following command:

run_ABC_polyploid.py [model] [argfile] [bpfile]  
  
Ex: run_ABC_polyploid.py diso2_auto argfile_diso2_auto.txt bpfile  

This example runs 10,000 multilocus simulations with autopolyploidization and disomic inheritance.

run_ABC_polyploid.py simply executes the pipeline:

priorgen | msnsam | mscalc
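For readers unfamiliar with how such a chain can be driven from Python, the sketch below connects the standard output of each stage to the standard input of the next with `subprocess`. It is not taken from run_ABC_polyploid.py. The helper `run_pipeline` is hypothetical, and because the arguments expected by priorgen, msnsam and mscalc are not documented here, two stand-in Python one-liners are used as stages instead.

```python
import subprocess
import sys


def run_pipeline(commands):
    """Run commands like `a | b | c` and return the last stage's stdout as bytes."""
    procs = []
    prev_out = None
    for cmd in commands:
        proc = subprocess.Popen(cmd, stdin=prev_out, stdout=subprocess.PIPE)
        if prev_out is not None:
            # Close our copy so the upstream stage sees a broken pipe
            # if the downstream stage exits early
            prev_out.close()
        prev_out = proc.stdout
        procs.append(proc)
    output, _ = procs[-1].communicate()
    for proc in procs[:-1]:
        proc.wait()
    for cmd, proc in zip(commands, procs):
        if proc.returncode != 0:
            raise RuntimeError("stage failed: %r" % (cmd,))
    return output


# Stand-in stages (hypothetical): the first prints three numbers,
# the second sums whatever it reads on stdin
stage_generate = [sys.executable, "-c", "print('3\\n1\\n2')"]
stage_sum = [sys.executable, "-c",
             "import sys; print(sum(int(x) for x in sys.stdin))"]

if __name__ == "__main__":
    print(run_pipeline([stage_generate, stage_sum]).decode().strip())
```

Streaming through pipes avoids writing the intermediate simulator output to disk, which matters when thousands of multilocus simulations are run.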

The output files are ABCstat.txt (containing the computed summary statistics) and priorfile.txt (containing the parameter values used to simulate the data from which the summary statistics were calculated).

The [model] argument must be one of: diso1, diso2_auto, diso2_allo, tetra1, tetra2_auto, tetra2_allo, hetero1, hetero2_auto, hetero2_allo.
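As an illustration of how the three positional arguments and the list of accepted models fit together, here is a hypothetical argument parser. It is not the parser used by run_ABC_polyploid.py; `build_parser` and the help strings are assumptions made for this example.

```python
import argparse

# Model names as listed in this README
MODELS = [
    "diso1", "diso2_auto", "diso2_allo",
    "tetra1", "tetra2_auto", "tetra2_allo",
    "hetero1", "hetero2_auto", "hetero2_allo",
]


def build_parser():
    """Build a parser for: run_ABC_polyploid.py [model] [argfile] [bpfile]."""
    parser = argparse.ArgumentParser(prog="run_ABC_polyploid.py")
    parser.add_argument("model", choices=MODELS, help="model to simulate")
    parser.add_argument("argfile", help="path to the argfile")
    parser.add_argument("bpfile", help="path to the bpfile")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["diso2_auto", "argfile_diso2_auto.txt", "bpfile"]
    )
    print(args.model, args.argfile, args.bpfile)
```

With `choices`, an unknown model name is rejected before any simulation starts, and the error message lists the accepted values.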

This package requires:
pypy (must be linked in the user's bin directory)
numpy
msnsam

Statistical comparison between "observation" and "simulations" can be made using various R libraries (abc, abcrf).
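To give an idea of what such a comparison does, the following Python sketch implements a bare-bones rejection step: it keeps the parameter sets whose simulated summary statistics lie closest to the observed ones. It is a simplified illustration, not the algorithm implemented in abc or abcrf. The function name, the scaling by standard deviation and the toy values are assumptions, and reading ABCstat.txt and priorfile.txt is left out because their exact layout is not described here.

```python
import math
from statistics import pstdev


def rejection_abc(observed, sim_stats, sim_params, tolerance=0.1):
    """Return the parameter rows of the simulations closest to the observation.

    observed:   list of observed summary statistics
    sim_stats:  list of rows, one row of summary statistics per simulation
    sim_params: list of rows, one row of parameter values per simulation
    tolerance:  fraction of simulations to accept
    """
    n_stats = len(observed)
    # Scale each statistic by its standard deviation across simulations
    scales = []
    for j in range(n_stats):
        sd = pstdev([row[j] for row in sim_stats])
        scales.append(sd if sd > 0 else 1.0)
    # Scaled Euclidean distance between each simulation and the observation
    distances = []
    for i, row in enumerate(sim_stats):
        d = math.sqrt(sum(((row[j] - observed[j]) / scales[j]) ** 2
                          for j in range(n_stats)))
        distances.append((d, i))
    distances.sort()
    n_keep = max(1, int(round(tolerance * len(sim_stats))))
    return [sim_params[i] for _, i in distances[:n_keep]]


# Toy example (hypothetical values): 4 simulations, 2 statistics, 1 parameter
toy_stats = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0], [0.45, 0.55]]
toy_params = [[1], [2], [3], [4]]
accepted = rejection_abc([0.5, 0.5], toy_stats, toy_params, tolerance=0.5)
```

The accepted rows approximate a sample from the posterior distribution of the parameters; the R packages add refinements such as regression adjustment (abc) or random forests (abcrf).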