
Nextflow pipeline for preprocessing of TnSeq data


TnPrep

Description

TnPrep is a Nextflow pipeline for Tn-seq data. It performs QC on Himar1 mariner transposon insertion sequencing reads, maps them to a supplied bacterial reference genome, and counts insertions at each position, following the schema outlined in .

The output of this pipeline is individual count matrices in .wig format containing insertion counts mapped to all TA sites found within the reference genome and QC information in the form of a MultiQC report. This .wig count file is compatible with TRANSIT or other tools for downstream Tn-seq data processing and analysis.
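To make the output format concrete, here is a minimal Python sketch that enumerates TA dinucleotide positions in a sequence and writes them in a variableStep .wig layout, with one "position count" line per TA site. It is an illustration only, not TnPrep's actual code: the function names find_ta_sites and format_wig, the toy sequence, the chromosome label, and the counts are all hypothetical, and the exact header that TnPrep writes may differ.

```python
def find_ta_sites(sequence):
    """Return 1-based positions of every 'TA' dinucleotide in the sequence."""
    seq = sequence.upper()
    sites = []
    start = seq.find("TA")
    while start != -1:
        sites.append(start + 1)  # .wig coordinates are 1-based
        start = seq.find("TA", start + 1)
    return sites


def format_wig(chrom, sites, counts):
    """Build variableStep .wig text with one 'position count' line per TA site.

    Sites with no observed insertions get a count of 0, so every TA site
    in the reference appears in the output.
    """
    lines = [f"variableStep chrom={chrom}"]
    for pos in sites:
        lines.append(f"{pos} {counts.get(pos, 0)}")
    return "\n".join(lines) + "\n"


# Toy example: a hypothetical 16 bp "genome" and made-up insertion counts.
genome = "GGTACCTAAGTTACGT"
sites = find_ta_sites(genome)
wig_text = format_wig("toy_chrom", sites, {3: 12, 12: 4})
print(wig_text)
```

In this toy case the TA sites fall at positions 3, 7, and 12, and position 7 is written with a count of 0 because no insertions were assigned to it.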

Requirements

1. A POSIX-compatible system (Linux, OS X, WSL (tested on Ubuntu), etc.)

2. Java 11 or later (up to 18)

3. Nextflow (>=22.10.7). Older versions may work but are untested. (This tutorial can help with setting up an environment to run Nextflow on Windows; just skip the dev tool installations.)

4. Any one of Docker, Podman, or Singularity (a tutorial can be found here)

5. Tn-seq FASTA files in gzip-compressed .fa.gz format

Running TnPrep

TnPrep can be fetched or updated directly with the following command:

nextflow pull MDHowe4/TnPrep

Alternatively, running the pipeline directly on a directory of Tn-seq data in a compatible format fetches it automatically. Running TnPrep requires an input directory, an output directory, and a reference genome in FASTA format:

nextflow run MDHowe4/TnPrep -profile <docker|singularity|podman> \
                            --input </path/to/input_file_directory> \
                            --genome </path/to/fasta_DNA_reference> \
                            --output </path/to/output_directory>

Parameters:

--input: Path to the input files directory

--genome: Absolute path to the DNA reference file in FASTA format

--output: Path to the output file directory
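When launching TnPrep from a script, the parameters above can be assembled programmatically. The following is a minimal Python sketch, assuming only the flags documented here; the helper name build_tnprep_command and the example paths are hypothetical. It picks exactly one container profile and resolves the genome path to an absolute path, as --genome requires.

```python
from pathlib import Path

# The three container profiles named in the run command.
PROFILES = {"docker", "singularity", "podman"}


def build_tnprep_command(input_dir, genome, output_dir, profile="docker"):
    """Assemble the `nextflow run` argument list for TnPrep (hypothetical helper)."""
    if profile not in PROFILES:
        raise ValueError(f"profile must be one of {sorted(PROFILES)}")
    return [
        "nextflow", "run", "MDHowe4/TnPrep",
        "-profile", profile,
        "--input", str(input_dir),
        # --genome is documented as an absolute path, so resolve it here.
        "--genome", str(Path(genome).resolve()),
        "--output", str(output_dir),
    ]


# Example paths are placeholders; substitute your own.
cmd = build_tnprep_command("reads", "ref/genome.fa", "results", profile="singularity")
print(" ".join(cmd))
# To launch the pipeline: import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.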

NOTE: All files in the input file directory should be in the same file format for compatibility with this pipeline.
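A quick pre-flight check can catch a mixed input directory before the pipeline starts. The sketch below is one possible way to do this in Python and is not part of TnPrep; the helper names file_format and count_formats and the demo file names are hypothetical. It groups files by suffix, treating a compressed suffix such as .fa.gz as a single format.

```python
import tempfile
from collections import Counter
from pathlib import Path


def file_format(path):
    """Return the format suffix, keeping a compressed suffix like '.fa.gz' together."""
    suffixes = [s.lower() for s in path.suffixes]
    if not suffixes:
        return ""
    if suffixes[-1] == ".gz" and len(suffixes) >= 2:
        return "".join(suffixes[-2:])
    return suffixes[-1]


def count_formats(input_dir):
    """Count the non-hidden files in input_dir by format suffix."""
    return Counter(
        file_format(p)
        for p in Path(input_dir).iterdir()
        if p.is_file() and not p.name.startswith(".")
    )


# Demo on a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("sampleA.fa.gz", "sampleB.fa.gz", "sampleC.fastq"):
        (Path(tmp) / name).touch()
    formats = count_formats(tmp)

if len(formats) > 1:
    print(f"Mixed input formats found: {dict(formats)}")
```

A directory that passes the check yields a single entry in the returned counter.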

NOTE: The first time you execute this pipeline, it may take some time to fetch TnPrep from the GitHub repository and download the container image with the dependencies needed to run TnPrep.

Pipeline Schema

tba

Software

Program      Version
fastqc       0.11.9
cutadapt     4.1
bowtie2      2.5.1
multiqc      1.14
biopython    1.81