SMA Carrier Detection
The scripts in this repository implement the method described in Larson et al. 2015 (https://www.ncbi.nlm.nih.gov/pubmed/26510457) for detecting spinal muscular atrophy (SMA) carriers. The technique combines carrier probabilities with coverage at the SMN1 loci to assess SMA carrier status. (in beta)
run_smn_doc.sh
Calculate coverage per gene and at the three SMN loci that distinguish SMN1 from SMN2
Input Files:
- bam_list # file with one line per sample (tab delimited: absolute bam path and whether ice/agilent was used)
- GATKjar # location of GATK jar installation
- reference # path to reference hg37
- sma_intervals # smn loci interval file
- output_dir # name of output directory
- picard # location of picard jar installation
- scripts_dir # location of sma scripts
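The bam_list format described above (one sample per line, tab delimited: absolute bam path, then capture kit) can be checked before starting a run. The following is a minimal sketch, not part of this repository: the function name `parse_bam_list` is hypothetical, and the exact spelling of the kit values (`ice`, `agilent`) is an assumption based on the description above.

```python
import posixpath
from typing import List, Tuple

# Assumption: the second column is spelled exactly "ice" or "agilent".
ALLOWED_KITS = {"ice", "agilent"}


def parse_bam_list(text: str) -> List[Tuple[str, str]]:
    """Parse bam_list contents into (bam_path, kit) pairs, validating each line."""
    samples = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 tab-separated fields, got {len(fields)}"
            )
        bam_path = fields[0].strip()
        kit = fields[1].strip().lower()
        if not posixpath.isabs(bam_path):
            raise ValueError(f"line {line_no}: bam path must be absolute: {bam_path}")
        if kit not in ALLOWED_KITS:
            raise ValueError(f"line {line_no}: unknown capture kit: {kit}")
        samples.append((bam_path, kit))
    return samples


if __name__ == "__main__":
    example = "/data/sample1.bam\tice\n/data/sample2.bam\tagilent\n"
    for bam_path, kit in parse_bam_list(example):
        print(bam_path, kit)
```

Running a check like this first surfaces a malformed line immediately instead of partway through the coverage step.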
merge_smn_doc.py
Merge SMN coverage results from all samples into one file
Input Files:
- bam_list # file with one line per sample (tab delimited: absolute bam path and whether ice/agilent was used)
- output_dir # name of output directory (keep consistent with previous scripts)
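To illustrate what merging per-sample results into one file can look like, here is a minimal sketch. It does not reproduce merge_smn_doc.py: the per-sample layout (`locus<TAB>coverage` per line), the locus labels, and the function names `read_sample_coverage` and `merge_coverage` are all hypothetical.

```python
import csv
import io
from typing import Dict, List


def read_sample_coverage(handle) -> Dict[str, float]:
    """Read one hypothetical per-sample table with 'locus<TAB>coverage' lines."""
    return {
        row[0]: float(row[1])
        for row in csv.reader(handle, delimiter="\t")
        if row
    }


def merge_coverage(per_sample: Dict[str, Dict[str, float]]) -> List[List[str]]:
    """Build one table: a header row, then one row per sample, one column per locus."""
    loci = sorted({locus for cov in per_sample.values() for locus in cov})
    rows = [["sample"] + loci]
    for sample in sorted(per_sample):
        cov = per_sample[sample]
        # Loci missing for a sample are written as "NA" rather than dropped.
        rows.append([sample] + [str(cov.get(locus, "NA")) for locus in loci])
    return rows


if __name__ == "__main__":
    per_sample = {
        "sample1": read_sample_coverage(io.StringIO("locus_a\t30\nlocus_b\t28\n")),
        "sample2": read_sample_coverage(io.StringIO("locus_a\t15\n")),
    }
    for row in merge_coverage(per_sample):
        print("\t".join(row))
```

Keeping one row per sample means a downstream step can read the merged table without reopening the per-sample outputs.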
calculate_coef_var.py
Calculate theta, di, ri, pi
Input Files:
- cov_directory # name of output directory (keep consistent with previous scripts)
- bam_files # file with one line per sample (tab delimited: absolute bam path and whether ice/agilent was used)
- interval_of_interest # whether to run on ice or agilent
- datamash # path to datamash
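The script name refers to a coefficient of variation, which is the standard deviation of a set of values divided by their mean. A minimal sketch of that statistic follows. It is not the repository's implementation: the function name is hypothetical, the use of the sample standard deviation is an assumption, and the sketch does not attempt to show how theta, di, ri and pi are derived.

```python
import statistics
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Return the sample standard deviation divided by the mean."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    # Assumption: sample (n - 1) standard deviation, not the population version.
    return statistics.stdev(values) / mean


if __name__ == "__main__":
    # Hypothetical per-gene coverage values for one sample.
    print(coefficient_of_variation([10, 20, 30]))
```

A low value indicates evenly distributed coverage, while a high value indicates noisy coverage that makes copy-number inference less reliable.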
calculate_carrier_probability.R
Calculate the carrier probability and plot credible intervals
Input Files:
- the ice/agilent_sma_sample_stat.txt file
- output directory
Requirements:
- GATK
- Picard
- datamash