Variable Non-Overlapping Window CBS and HMM Intersect (VNOWCHI): A Copy Number Variant Calling Pipeline
VNOWCHI is a copy number variant calling pipeline that uses the Circular Binary Segmentation (CBS) and Hidden Markov Model (HMM) algorithms to determine the ploidy and sex status of embryos.
Detailed information regarding VNOWCHI and the CBS and HMM algorithms can be found on the Wiki.
The scripts provided are tailored for use on OHSU's exacloud server using SLURM. For help with SLURM, please refer to ACC's tutorials.
To use the scripts without SLURM, comment out the srun and sbatch commands and uncomment the commands beneath them.
- Linux environment
- Java 8 (How to install Java) for Trimmomatic
- The following software tools installed:
- FastQC 0.10.1
- Trimmomatic 0.35
- FASTX-Toolkit 0.0.13
- BWA-MEM 0.7.9a-r786
- SAMtools 0.1.19-44428cd
- BEDtools 2.25.0
- FastUniq 1.1
- R version 3.5.0 with the following R packages installed:
- GenomicAlignments
- DNAcopy
- HMMcopy
- IRanges
- GenomicRanges
- dplyr
- ggplot2
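Before running the pipeline, it can help to confirm that the command-line tools listed above are reachable from your shell. The following is a minimal sketch and not part of VNOWCHI: the function name `missing_tools` is hypothetical, and the executable names in the usage note are assumptions about how the tools are installed on your system. It checks only that a command is on the PATH, not its version.

```shell
# Print the name of every command given as an argument that is not on PATH.
# Hypothetical helper, not part of the VNOWCHI scripts.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}
```

For example, `missing_tools java fastqc bwa samtools bedtools fastuniq Rscript` prints nothing when every command is found and otherwise prints one missing name per line. Trimmomatic is usually run as a Java jar rather than a standalone command, so it is left out of this example.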
Installation instructions can be found on the Wiki Installation page.
Detailed "How to Use" instructions are located on the Wiki How to Use page.
Fibroblast samples (5 scDNA-seq samples preferred):
/your/working/dir/CopyNumberPipeline/results/FIBROBLASTS/FASTQ
Single-end samples:
/your/working/dir/CopyNumberPipeline/results/fastq/SE
Paired-end samples:
/your/working/dir/CopyNumberPipeline/results/fastq/PE
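The input layout above can be created in one step. This is a minimal sketch, assuming the three directories listed are the only input locations you need to prepare. The function name `make_input_dirs` is hypothetical and not part of the pipeline.

```shell
# Create the FASTQ input directories listed above under a working directory.
# Hypothetical helper; pass your own working directory as the first argument.
make_input_dirs() {
    results="$1/CopyNumberPipeline/results"
    mkdir -p \
        "$results/FIBROBLASTS/FASTQ" \
        "$results/fastq/SE" \
        "$results/fastq/PE"
}
```

Call it as `make_input_dirs /your/working/dir`, then copy the FASTQ files into the matching directories.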
sbatch PIPELINE_bins.sh
sbatch PIPELINE_VNOWCHI.sh
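If you would rather queue both scripts at once, a SLURM job dependency can hold the second job until the first finishes. This is a sketch under the assumption that PIPELINE_VNOWCHI.sh should start only after PIPELINE_bins.sh completes successfully, which the order of the commands above suggests but this README does not state. The function name `submit_vnowchi` is hypothetical.

```shell
# Submit both pipeline scripts, holding the second until the first succeeds.
# Hypothetical helper; assumes it is run from the directory containing the scripts.
submit_vnowchi() {
    # --parsable makes sbatch print only the job ID (optionally "ID;cluster").
    bins_job="$(sbatch --parsable PIPELINE_bins.sh)" || return 1
    bins_job="${bins_job%%;*}"
    # afterok: start the second job only if the first exits with status 0.
    sbatch --dependency="afterok:${bins_job}" PIPELINE_VNOWCHI.sh
}
```

If the first job fails, the dependent job will not start and may need to be cancelled manually with `scancel`.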
- CNV plots for all samples, both across all chromosomes and per individual chromosome
- Mapping summary statistics in VNOWCHI_summary.txt
- Tabular summary for all samples classified by ploidy and sex status
- Tabular summary for all embryos classified by ploidy and sex status based on samples
- Mapping summary statistics
VNOWCHI_Summary.txt
- Tabular summary of all individual samples CNV calls with embryo, blastomere, ploidy and sex classifications
CNV_<SE|PE>_<bin>.sampleSummary.txt
- Tabular summary of all embryos classified by ploidy and sex status
CNV_<SE|PE>_<bin>.embryoSummary.txt
- CNV plots for all samples by chromosome or by all chromosomes
<sampleName>_<chromosome>.png
<sampleName>_<all>.png
- Please note that the R package dplyr will behave differently than intended if the R package plyr is loaded. More info regarding the issue can be found here and here. Here's a possible solution from Stack Overflow if you get any errors.
- You might need to modify step 3 in PIPELINE_bins.sh and PIPELINE_VNOWCHI.sh to ensure the scripts accept the file name pattern of the provided FASTQ files.
- Some R scripts have issues when run in RStudio. For example, get_copy_number.R runs without problems on the server but does not work in RStudio, which could be related to the R version.
- Melissa Yan - extended the VNOWC pipeline to include CHI, classify samples/embryos, accommodate different genomes, and run on SLURM
- Nathan Lazar - original author of Variable Non-Overlapping Window CBS (VNOWC)
- Kristof Torkenczy - original author of the CBS/HMM Intersect (CHI) pipeline
This project would not be possible without the support from the Chavez Lab, Carbone Lab, Adey Lab, and the Biostatistics & Bioinformatics Core:
- Chavez Lab:
- Brittany L. Daughtry
- Kelsey E. Brooks
- Jimi L. Rosenkrantz
- Shawn L. Chavez
- Carbone Lab:
- Nathan H. Lazar
- Brett Davis
- Lucia Carbone
- Adey Lab:
- Kristof A. Torkenczy
- Andrew Adey
- Biostatistics & Bioinformatics Core
- Suzi S. Fei
Daughtry, B. L., Rosenkrantz, J. L., Lazar, N. H., Fei, S. S., Redmayne, N., Torkenczy, K. A., Adey, A., Gao, L., Park, B., Nevonen, K. A., Carbone, L., Chavez, S. L. (2019). Single-cell sequencing of primate preimplantation embryos reveals chromosome elimination via cellular fragmentation and blastomere exclusion. Genome Research, 29(3), 367-382. doi:10.1101/gr.239830.118