merge-replicates.pl
: Use if there are multiple replicates for a single sample and the genotype files need to be merged.
- Input: Genotype files (one per chromosome per replicates for a sample)
- Output: Merged genotype files (one per chromosome)
filter-snps.pl
:
- Input:
- Sample ID
- Genotype files (one per chromosome for a sample)
- Output:
- Boundaries file
- Filtered SNP files (one per chromosome)
detect-boundaries:filter-snps.pl
- IN:
- Genotyped files:
- OUT:
- Boundary files:
sample-file/RIL_300.boundaries
, etc. - Filtered SNP files:
RIL_300.A01.filtered.snps
, etc.
- Boundary files:
This script takes a collection of *.boundaries
files and merges all of the boundaries. The bins resulting from the merge are written to a file: bins.tsv
. Boundary and bin stats (count, min size, max size, and mean size) are printed to the screen.
Sample Usage:
./merge-boundaries.pl \
--bam_file ~/git.repos/sample-files/bam/IMB211.good.bam \
--chr_list A01,A02,A03,A04,A05,A06,A07,A08,A09,A10 \
sample-file/RIL_*.boundaries
- IN:
- Boundary files:
sample-file/RIL_300.boundaries
, etc. - Representative
.bam
file (for extracting chromosome lengths) - (Optional) Chromosome list (for limiting analysis to a subset of sequences)
- Boundary files:
- OUT:
- Comprehensive list of bins and their locations:
bins.tsv
- Comprehensive list of bins and their locations:
detect-boundaries:extract-snp-subset.R
- IN:
- VCF summaries:
/Users/mfc/git.repos/snps-from-rils/merged.EC50.minflter.vcf/summaries.het_ratio_0_1.alt_ratmin_0_05.filters/merged.*.EC50.minflter.vcf.summary
- Bins:
bins.tsv
- VCF summaries:
- OUT:
sample-file/bins-snp.mid
detect-boundaries:get-genotypes-for-snp-subset.pl
- IN:
sample-file/bins-snp.mid
- Genotyped files:
- polyDB...
- list of Sample IDs
- OUT:
bin-genotype
Version 0.6.0