Scripts for tracing the trade with ddRADseq SNP data.
This GitHub is a work in progress (I apologise for the inconsistent numbering). All scripts excuted by Imogen Dumville, written by Imogen Dumville with aid of Jordi Salmona.
Scripts start on JSalmona Account for initial demultiplexing + alignment
syntenyanalysis.sh - finds which chromosome in reference is the X chromosome
Calls mummer.sh
script.2.demultiplex_data.sh - demultiplexing
Calls process_radtags.pg.sh
script.3.align_rad_data.sh - trimmming then aligns all the data to autosomes, X, mito and checks files
Calls trimmomatic_low_PE.sh index_genome.sh bwa_miseq2021.sh
script.3b.chromosomes.sh -indexes and blasts the mtDNA and Xchromosome as well as seperating the X from autosomes
Calls index_genome.sh blastmito.sh blastxchr.sh sepx_a_stats.sh
Scripts and workflow on IDumville account from samfiles
As follows;
01.alignmentstats_coverage.sh - getting coverage, extract statistics
Calls coverage.sh loci.sh (x)cov.plot.all.R loci.plot.R (via intemediate loci.sh, but no longer availble - I had an internet issue in my project so may have been lost here)
02.stacks_populations.sh - running gstacks (on seperate miseq/other datasets), populations (some seperation for lowreads / different localities, getting private alleles (PAs), unused script for error rate
Calls run.gstacks.sh run.population.sh (change Rs, minmac 2-3, hap exports, output file) run.indv.popn.sh run.indvseedpopn.sh
02b.ploidy.sh - getting ploidy from bams, check for contamination
Calls nQuire_ploidy.sh run_vanquish.sh (which calls vanquish_ploidy_plot.R)
03.ANGSD.sh - estimate GL and getting beagle, running ngsADMIX, producing likelihood file, plotting + geoplotting - IBD on this script obselete
Calls ANGSD.sh, ngsadmix.sh cuttingbeagle.sh ngsadmix_newbeagle.sh ngsadmix_plots.R ngsadmix_geoplots.R pcangsd_pango.sh angsd.he.indv.sh IBD_plotting.R
03b.IBD_script.sh - does IBD
Calls cuttingbeagle.sh; To run IBD, 1) remake list of samples to exclude 2) run beagle filtering 3) run pcagnsd with all 4) concat X + Auto 5) run IBD with all three datasets
04.locator.sh - runs locator, makes vcfs (indvidual and all lineages), plotting
Calls locator.sh iterateseedlocator.sh ALvcfmaking.sh indvlocvcfs.sh plot_locator_sbatch.sh zipping.sh
05.iterativeangsd.sh - to see if iterating angsd over minimal gmin calls changes he (it doesn't)
Calls angsd.he.indv.sh
script.iterate_p_alleles.sh - resampling PAs
Calls run.iterativepop.sh
08.assignmentmethods.sh - run BONE or rubias or assignPOP (tested but not in manuscript) plus make initial file from vcf
Calls run.assignment.sh