LVIS_pipeline

Compile lentiviral vector integration sites from sequencing pipeline

Identification of lentiviral integration sites and analysis

This repository contains four components for lentiviral integrome analysis:

  1. Sequencing pipeline for our qsLAM PCR assay
  2. Steps for profiling integration sites from scATAC-seq and scMultiome data
  3. Downstream analysis of vector integration sites
  4. A classifier of integration sites by integrome signatures

Sequencing pipeline for our qsLAM PCR assay

Prerequisites

  • [fastqc >0.11.5]
  • [cutadapt]
  • [samtools >1.10]
  • [bwa >0.7.17]
  • [bedtools >2.25.0]
  • [R >3.6.2]

Usage

Download the contents of the qsLAM folder into your working directory. Create a folder named rawdata and place the paired-end read files in it. Then follow the step-by-step instructions.
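Before starting, it can help to confirm that every sample in rawdata has both mates. The sketch below is a hypothetical helper, not part of this repository. It assumes files are named like <sample>_R1.fastq.gz and <sample>_R2.fastq.gz (optionally with a numeric suffix such as _001), a convention the instructions here do not specify, so adjust the pattern to match your own file names.

```python
import re
from pathlib import Path

# Assumed naming convention (not stated in this README):
#   <sample>_R1.fastq.gz / <sample>_R2.fastq.gz, optionally with a suffix like _001
READ_PATTERN = re.compile(
    r"^(?P<sample>.+?)_R(?P<mate>[12])(?:_\d+)?\.(?:fastq|fq)(?:\.gz)?$"
)


def find_read_pairs(rawdata_dir="rawdata"):
    """Group FASTQ files in rawdata_dir by sample.

    Returns (pairs, unpaired):
      pairs    maps sample name -> (R1 path, R2 path)
      unpaired lists samples for which only one mate was found
    """
    mates = {}
    for path in sorted(Path(rawdata_dir).iterdir()):
        match = READ_PATTERN.match(path.name)
        if match:  # files that do not look like FASTQ reads are ignored
            mates.setdefault(match["sample"], {})[match["mate"]] = path

    pairs = {s: (m["1"], m["2"]) for s, m in mates.items() if len(m) == 2}
    unpaired = sorted(s for s, m in mates.items() if len(m) < 2)
    return pairs, unpaired
```

Calling find_read_pairs("rawdata") from the working directory returns the complete pairs and the names of any samples missing a mate. If a sample is split across several files per mate (for example, multiple lanes), this sketch keeps only the last file it sees for each mate.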

Steps for profiling integration sites from scATAC-seq and scMultiome data

Prerequisites

  • [fastqc >0.11.5]
  • [samtools >1.10]
  • [bwa >0.7.17]

Usage

See the step-by-step instructions for identifying integration sites from ATAC-seq data.

Downstream analysis of vector integration sites

Prerequisites

  • [R >3.6.2]
  • [bedr 1.0.7]
  • [bedtools 2.29.0]

Usage

Helper functions are implemented in LVIS_functions.R. See the examples and step-by-step instructions for compiling vector integration sites (VISs) across samples and for other downstream analyses. Load the functions in an R session with:

> source("LVIS_functions.R")
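To make "compiling VISs across samples" concrete, here is a minimal sketch in Python. It does not use or reproduce the functions in LVIS_functions.R. The input format (a dictionary of per-sample lists of chromosome, position, strand tuples) and the function name compile_sites are assumptions made for illustration only. It treats sites as identical only when all three fields match exactly, which may differ from how the repository groups nearby positions.

```python
from collections import defaultdict


def compile_sites(samples):
    """Build a site-by-sample count table.

    samples: {sample_name: [(chrom, position, strand), ...]}  (hypothetical format)
    Returns (names, rows), where names is the sorted list of sample names and
    each row is (site, [count in each sample, in the order of names]).
    """
    table = defaultdict(dict)
    for sample, sites in samples.items():
        for site in sites:
            # Count how many times each exact site was observed in this sample
            table[site][sample] = table[site].get(sample, 0) + 1

    names = sorted(samples)
    rows = [
        (site, [table[site].get(name, 0) for name in names])
        for site in sorted(table)
    ]
    return names, rows
```

Each row then shows at a glance which samples share a site: a site with nonzero counts in several columns was recovered in several samples.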

A classifier of integration sites by integrome signatures

The code for building the CatBoost model that classifies high- versus low-abundance VISs is in no_P1_classification.ipynb.