Nextflow pipeline that calculates genotype likelihoods with ANGSD from a list of BAM files, then plots admixture results from NGSadmix and PCAs from PCAngsd.
NOTE: please see the documentation for branch jn for further instructions.
- Install Nextflow (version >= 19.04)
- Install Conda (version >= 4.10)
- Download (git clone) this repository:
git clone https://github.com/FilipThorn/nf-GL_popstructure
- Download and install PCAngsd
- Run the Nextflow pipeline:
nextflow run GL_popstr.nf --bams /PATH/TO/BAMFILELIST/'*.list' --outdir /PATH/TO/RESULTS/ --chr_ref /PATH/TO/CHROMOSOMELIST
- BAM file list example:
/Absolute/PATH/IndvXXXX/IndvXXXX_sorted.bam
/Absolute/PATH/IndvXXXX/IndvXXXX_sorted.bam
/Absolute/PATH/IndvXXXX/IndvXXXX_sorted.bam
/Absolute/PATH/IndvXXXX/IndvXXXX_sorted.bam
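A list like the one above could be generated rather than typed by hand. The sketch below is only an illustration: the helper name make_bam_list is hypothetical, and it assumes every BAM file ends in _sorted.bam and sits somewhere under a single parent directory, as in the example paths.

```shell
# Hypothetical helper: collect all *_sorted.bam files under a directory
# and write their absolute paths, one per line, to a .list file.
make_bam_list() {
    bam_dir=$(cd "$1" && pwd)   # resolve the input directory to an absolute path
    find "$bam_dir" -type f -name '*_sorted.bam' | sort > "$2"
}

# Example call (placeholder paths):
# make_bam_list /Absolute/PATH popA.list
```

Because the search directory is made absolute first, every path written to the list is absolute as well.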
Labels in plots are based on the subdirectory name, e.g. Indv0001 for:
/results/Indv0001/Indv0001_sorted.bam
If you have a different file structure, you can run the pipeline with the flag --skip_plots true and create the plots yourself.
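To check in advance which label a given path would produce under this convention, the name of the parent directory can be extracted with standard tools. This is a minimal sketch with a hypothetical helper name (label_from_bam); the pipeline's own parsing may differ in detail.

```shell
# Hypothetical helper: print the name of the directory that directly
# contains the BAM file, which is what the plot labels are based on.
label_from_bam() {
    basename "$(dirname "$1")"
}

label_from_bam /results/Indv0001/Indv0001_sorted.bam   # prints Indv0001
```

Running it over every line of a BAM list is a quick way to spot paths whose parent directory is not the sample name, in which case --skip_plots true is probably the safer choice.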
- Chromosome reference file example:
chr1
chr2
chr3
chr4
chr5
Subset of scaffolds present in your BAM files.
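One way to find out which scaffold names are present in a BAM file is to read the @SQ lines of its header. The sketch below assumes samtools is installed for the commented example call; the helper name scaffolds_from_header and the output file name chr.list are hypothetical.

```shell
# Hypothetical helper: read a SAM/BAM header on stdin and print the
# sequence name (SN tag) of every @SQ line, one name per line.
scaffolds_from_header() {
    awk -F'\t' '$1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }'
}

# Example call, assuming samtools is available (placeholder path):
# samtools view -H /Absolute/PATH/IndvXXXX/IndvXXXX_sorted.bam | scaffolds_from_header > chr.list
```

The resulting file lists every scaffold in the header; delete the lines you do not want to analyse to obtain the subset.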
Use of an HPC cluster is recommended. Create a Nextflow config profile that matches your cluster set-up.
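As a rough sketch of what such a profile might look like, the snippet below writes a small Nextflow config file and shows how it could be passed to the run command. The file name my_cluster.config, the profile name mycluster, the SLURM executor, the queue name and the account option are all assumptions for illustration; replace them with whatever your cluster uses.

```shell
# Write a hypothetical cluster profile to my_cluster.config.
# All values are placeholders, not settings shipped with this pipeline.
cat > my_cluster.config <<'EOF'
profiles {
    mycluster {
        process.executor = 'slurm'                 // assumed scheduler
        process.queue = 'core'                     // assumed partition name
        process.clusterOptions = '-A MY_PROJECT'   // assumed account flag
    }
}
EOF

# Then pass the config file and profile name to the run command, for example:
# nextflow run GL_popstr.nf -c my_cluster.config -profile mycluster \
#     --bams /PATH/TO/BAMFILELIST/'*.list' --outdir /PATH/TO/RESULTS/ --chr_ref /PATH/TO/CHROMOSOMELIST
```

Here -c adds an extra configuration file and -profile selects one of the profiles defined in it; both are standard Nextflow command-line options.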