/ZIKV_DMS_NS5_EvansLab

deep mutational scanning of ZIKV E protein +/- IFN with Matt Evans lab


Deep mutational scanning of ZIKV NS5 (RdRp) protein

Experiments were performed by Blake Richardson and Matt Evans. Analysis was performed by Jesse Bloom, David Bacsik, and Caroline Kikawa.

Overview

The Evans lab performed DMS on the ZIKV NS5 (RdRp) protein using a tiled subamplicon approach. Each mutation's effect on ZIKV growth was assessed by comparing its frequency in the passaged virus library to its frequency in the original plasmid stock.
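As a rough illustration of that comparison, the sketch below computes a log2 enrichment of a mutant relative to wildtype between the plasmid stock and the passaged virus. It is only a minimal sketch: the function name, the pseudocount, and the counts are hypothetical, and the statistic actually computed in the analysis notebook may differ.

```python
import math


def log2_enrichment(pre_mut, pre_wt, post_mut, post_wt, pseudocount=1.0):
    """Return log2[(post_mut / post_wt) / (pre_mut / pre_wt)].

    "pre" counts come from the plasmid stock and "post" counts from the
    passaged virus. A pseudocount (an assumption of this sketch) is added
    to every count so that zero counts do not cause division by zero.
    """
    pre_ratio = (pre_mut + pseudocount) / (pre_wt + pseudocount)
    post_ratio = (post_mut + pseudocount) / (post_wt + pseudocount)
    return math.log2(post_ratio / pre_ratio)


# Hypothetical counts at a single codon site (not real data).
depleted = log2_enrichment(pre_mut=99, pre_wt=9999, post_mut=9, post_wt=9999)
neutral = log2_enrichment(pre_mut=99, pre_wt=9999, post_mut=99, post_wt=9999)
print(f"depleted mutant: {depleted:.2f}, neutral mutant: {neutral:.2f}")
```

Negative values indicate mutations that were depleted during viral passage, and values near zero indicate mutations that grew about as well as wildtype.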

See results/summary/ for a markdown summary of the results for each tile over the genome (e.g., results/summary/dms_tile_1_analysis.md).

See this file for the results from all tiles in a single table. The mutational scan was performed in two cell lines: Huh-7.5 (human) and C6/36 (mosquito).

Running the analysis

Activate the ZIKV_DMS_NS5_EvansLab conda environment if it exists. Otherwise, create it from the environment.yml file:

conda env create -f environment.yml

Then activate the environment:

conda activate ZIKV_DMS_NS5_EvansLab

The analysis is run by the Snakemake pipeline defined in Snakefile. This pipeline runs the Jupyter notebook dms_tile_analysis.py.ipynb once per tile, using the information specified in the config.yml file, and generates a markdown summary for each tile.
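To make the one-summary-per-tile relationship concrete, here is a small sketch that maps tile names to the summary files one would expect the pipeline to produce. The path pattern follows the results/summary/dms_tile_1_analysis.md example in this README; the helper name and the number of tiles are hypothetical, and the real tile list lives in config.yml.

```python
def expected_summaries(tiles, summary_dir="results/summary"):
    """Return the markdown summary path expected for each tile name."""
    return [f"{summary_dir}/dms_{tile}_analysis.md" for tile in tiles]


# Hypothetical tile list; the actual tiles are defined in config.yml.
tiles = [f"tile_{i}" for i in range(1, 4)]
for path in expected_summaries(tiles):
    print(path)
```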

Run the pipeline with the following command:

snakemake

If you have access to the Hutch cluster, submit the bash script instead:

sbatch run_Snakemake.bash

Input data

The input data are in ./data/:

  • ./data/tile_*_amplicon.fasta: amplicons for each tile of the barcoded-subamplicon sequencing.

  • ./data/tile_*_subamplicon_alignspecs.txt: the alignment specs for the barcoded subamplicon sequencing for each amplicon.

  • ./data/tile_*_samplelist.csv: all the samples that we sequenced and the locations of the associated deep-sequencing data for each amplicon.

  • ./data/6WCZ.pdb: the 6WCZ PDB file of ZIKV NS5 bound to human STAT2.

  • ./data/NS5_STAT2_joined.pdb: a PDB file provided by Matt Evans that manually combines several other relevant PDB files.
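Before running the pipeline, it can help to confirm that every tile has all three of its per-tile input files. The sketch below is one way to check this using only the file-name patterns listed above. It assumes the `*` in each pattern is an integer tile number (as in tile_1); the function names are hypothetical and not part of the repository.

```python
import re
from pathlib import Path

# Per-tile file suffixes taken from the input-data list above.
PER_TILE_SUFFIXES = (
    "amplicon.fasta",
    "subamplicon_alignspecs.txt",
    "samplelist.csv",
)


def find_tiles(data_dir):
    """Return sorted tile numbers inferred from tile_*_amplicon.fasta files."""
    tiles = set()
    for path in Path(data_dir).glob("tile_*_amplicon.fasta"):
        # Assumption: the wildcard is an integer tile number.
        match = re.fullmatch(r"tile_(\d+)_amplicon\.fasta", path.name)
        if match:
            tiles.add(int(match.group(1)))
    return sorted(tiles)


def missing_inputs(data_dir, tile):
    """Return the names of per-tile input files that are absent for a tile."""
    data_dir = Path(data_dir)
    return [
        f"tile_{tile}_{suffix}"
        for suffix in PER_TILE_SUFFIXES
        if not (data_dir / f"tile_{tile}_{suffix}").exists()
    ]


if __name__ == "__main__":
    for tile in find_tiles("data"):
        missing = missing_inputs("data", tile)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"tile {tile}: {status}")
```

Run from the repository root, this prints one line per tile and names any input file that is absent.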