nf-GL_popstructure

Nextflow pipeline that calculates genotype likelihoods with ANGSD from a list of BAM files, then plots admixture proportions from NGSadmix and PCAs from PCAngsd.

NOTE: Please see the documentation for branch jn for further instructions.

Quick start

Install Nextflow (version >= 19.04)
Install Conda (version >= 4.10)

Download (git clone) this repository:

git clone https://github.com/FilipThorn/nf-GL_popstructure

Download and install PCAngsd

Run the Nextflow pipeline:

nextflow run GL_popstr.nf --bams /PATH/TO/BAMFILELIST/'*.list' --outdir /PATH/TO/RESULTS/ --chr_ref /PATH/TO/CHROMOSOMELIST

Input files

BAM file list example (one absolute path per line):

/Absolute/PATH/IndvXXXX/IndvXXXX_sorted.bam
/Absolute/PATH/IndvXXXX/IndvXXXX_sorted.bam
/Absolute/PATH/IndvXXXX/IndvXXXX_sorted.bam
/Absolute/PATH/IndvXXXX/IndvXXXX_sorted.bam
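
A list like the one above can be written by hand, but a small shell helper may be convenient. The sketch below assumes one subdirectory per individual and BAM files named `*_sorted.bam`, as in the example; the function name `make_bam_list` is hypothetical and not part of this repository.

```shell
# Hypothetical helper (not part of the pipeline): write the absolute paths of
# all *_sorted.bam files found one level below a project directory to a file.
make_bam_list() {
    project_dir=$1   # directory holding one subdirectory per individual
    out_list=$2      # output file; give it a .list suffix so '*.list' matches it

    # Resolve to an absolute path so the list is valid from any working directory
    abs_dir=$(cd "$project_dir" && pwd) || return 1

    # The glob expands in sorted order; the -e test skips an unmatched pattern
    for bam in "$abs_dir"/*/*_sorted.bam; do
        if [ -e "$bam" ]; then
            printf '%s\n' "$bam"
        fi
    done > "$out_list"
}

# Example call:
# make_bam_list /Absolute/PATH mypop.list
```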

Labels in plots are based on the subdirectory name:

/results/Indv0001/Indv0001_sorted.bam

If you have a different file structure, run the pipeline with the flag --skip_plots true and create the plots yourself.
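
To check in advance which labels a given list would produce, the paths can be inspected before running the pipeline. The sketch below assumes the label is the name of the directory that directly contains each BAM file, as in the example above; the pipeline's own parsing may differ, and `preview_labels` is a hypothetical name.

```shell
# Hypothetical helper: print "label<TAB>path" for every path in a BAM list,
# ASSUMING the label is the name of the BAM file's parent directory.
preview_labels() {
    while IFS= read -r bam_path; do
        [ -n "$bam_path" ] || continue                 # skip blank lines
        label=$(basename "$(dirname "$bam_path")")     # parent directory name
        printf '%s\t%s\n' "$label" "$bam_path"
    done < "$1"
}

# Example call:
# preview_labels mypop.list
```

If the first column does not show one distinct, meaningful name per individual, --skip_plots true is probably the safer choice.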

Chromosome reference file example:
```
chr1
chr2
chr3
chr4
chr5
```

List a subset of the scaffolds present in your BAM files, one name per line.
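
One way to get candidate scaffold names is to read them from a BAM header and then delete the ones you do not want. The sketch below parses the `@SQ` lines of a SAM header supplied on standard input; `header_to_chr_list` is a hypothetical helper, and the commented example assumes samtools is installed.

```shell
# Hypothetical helper: read a SAM header on stdin and print the sequence name
# (the SN: field of each @SQ line), one per line.
header_to_chr_list() {
    awk -F '\t' '
        $1 == "@SQ" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SN:/) print substr($i, 4)   # drop the "SN:" prefix
        }'
}

# Example (requires samtools; "samtools view -H" prints only the header):
# samtools view -H /Absolute/PATH/IndvXXXX/IndvXXXX_sorted.bam | header_to_chr_list > chr.list
```

The resulting file contains every sequence in the header, so edit it down to the scaffolds you want before passing it to --chr_ref.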

HPC environment

Use of an HPC cluster is recommended. Create a Nextflow config profile that matches your cluster set-up.
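
As a rough illustration of such a profile, the sketch below writes a small extra config file from the shell. It assumes a SLURM scheduler; the file name `cluster.config`, the profile name `mycluster`, the queue `core` and the account `my_project` are placeholders, not values taken from this repository.

```shell
# Write a minimal, hypothetical Nextflow profile for a SLURM cluster.
# Every value below is a placeholder; replace it with your cluster's settings.
cat > cluster.config <<'EOF'
profiles {
    mycluster {
        process.executor = 'slurm'                // scheduler used to submit jobs
        process.queue = 'core'                    // placeholder partition/queue name
        process.clusterOptions = '-A my_project'  // placeholder account option
    }
}
EOF
```

The profile could then be selected by adding `-c cluster.config -profile mycluster` to the `nextflow run GL_popstr.nf` command, together with the usual --bams, --outdir and --chr_ref arguments.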

