/tabsat

Targeted Amplicon Bisulfite Sequencing Analysis Tool

NEW VERSION OF TABSAT

A new version of TABSAT is available at:
https://tabsat.ait.ac.at



TABSAT

TABSAT (Targeted Amplicon Bisulfite Sequencing Analysis Tool) analyzes targeted bisulfite sequencing data generated on an Ion Torrent PGM or Illumina MiSeq. It performs:

  • Quality Assessment
  • Alignment using Bismark
  • Result aggregation into a table
  • Visualization as lollipop plots

Available as

  • Fully configured Docker image (built from the Dockerfile); see the Docker section below.
  • Source code

Collaboration

Please contact us if you need help running your analyses. We have also developed an extended version for our collaborators with the following additional features:

  • Interactive web-based visualization
  • Download FASTA of target regions
  • Strand-specific CpGs
  • Automatic mapping of primers
  • Restriction enzyme positions
  • Starting analyses from a web frontend
  • Pattern visualization and analysis

Publication

TABSAT is described in the following publication:
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0160227

Example usage

${TABSAT} -l NONDIR -g hg19 -q 20 -m 10 -p 0.8 -r 0 -t target.csv -a tmap -o output_dir input.fastq

-t [mandatory] Target list in CSV format. Strand can be "+", "-", or "+/-".
-e Sequencing library type: SE or PE (PE read files must be named *_1.fastq and *_2.fastq)
-g Genome (hg19, mm10)
-l Library mode of the bisulfite experiment (NONDIR in the example above)
-a [optional] Aligner to use (tmap in the example above)
-m [optional] Minimum read length; reads shorter than this threshold are filtered out.
-q [optional] Quality threshold for read trimming; bases below this threshold are removed from the 3’ end of the reads.
-p [optional] Percent of the target that a read must cover to be included in pattern analysis (0.8 in the example above).
-r [optional] Minimum number of mapped reads required at each CpG site.
-s [optional] Sorted list of samples specifying their order in the lollipop plots.
-o Output directory
-d [optional] Directory of input files (absolute path). If not specified, the input files are listed at the end of the command line.
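
Because paired-end input depends on the *_1.fastq / *_2.fastq naming convention, it can help to check an input directory before starting a run. The following is a minimal sketch, not part of TABSAT: the function name check_pe_pairs and its messages are made up for illustration, and it only tests that every *_1.fastq file has a matching *_2.fastq file.

```shell
#!/usr/bin/env bash
# Sketch (not part of TABSAT): list *_1.fastq files that lack a *_2.fastq mate.
# Returns 0 if every forward file has a mate, 1 otherwise.
check_pe_pairs() {
    local dir="$1" r1 r2 status=0
    for r1 in "$dir"/*_1.fastq; do
        [ -e "$r1" ] || continue            # the glob matched nothing
        r2="${r1%_1.fastq}_2.fastq"         # expected name of the mate file
        if [ ! -e "$r2" ]; then
            echo "missing mate for: $(basename "$r1")"
            status=1
        fi
    done
    return "$status"
}
```

A possible use is to run check_pe_pairs on the directory that will be passed with -d and to start the analysis only if it returns 0.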

Examples

Test with input file directory

tabsat -l NONDIR -g hg19 -t target.csv -d test_input_dir -a tmap -o test_output_dir

Test with separate input files

tabsat -l NONDIR -g hg19 -t target.csv -o test_output_files xy.fastq abs.fastq

Test data

Test data is available here

Installation

  • Prepare the reference
    tabsat/reference/prepareReference.sh
  • Prepare the CpG file
    apt-get install p7zip-full
    7za e tabsat/tools/ait/all_cpgs_only_pos_hg19.7z
    7za e tabsat/tools/ait/all_cpgs_only_pos_mm10.7z
  • Install Perl modules
    • Cairo.pm
    • Switch.pm
  • Run the 'install' script in the tabsat folder (installs SAMtools, Bedtools)
    ./install
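
The prerequisites listed above can be checked before running the test script. This is a minimal sketch under the assumption that perl and the 7za binary are looked up on the PATH; the function names has_perl_module and report_prereqs are illustrative and not part of TABSAT.

```shell
#!/usr/bin/env bash
# Sketch (not part of TABSAT): check the prerequisites named in the list above.

# Succeeds if the given Perl module can be loaded by the perl on the PATH.
has_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Prints one status line per prerequisite: the two Perl modules and 7za.
report_prereqs() {
    local mod
    for mod in Cairo Switch; do
        if has_perl_module "$mod"; then
            echo "$mod: ok"
        else
            echo "$mod: missing"
        fi
    done
    if command -v 7za >/dev/null 2>&1; then
        echo "7za: ok"
    else
        echo "7za: missing"
    fi
}
```

Calling report_prereqs prints three lines, each ending in "ok" or "missing".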

Run example

Command line

  • After installation, go to tabsat/tools/zz_test
  • Execute
    ./test_tabsat_tmap.sh
  • Inspect the output at tabsat/tabsat_test_output

Docker

  • Build the Docker image
    docker build -t tabsat:v1 .

  • Run it
    docker run -t --name tabsat -d tabsat:v1

  • Connect to the running container
    docker exec -ti tabsat /bin/bash

  • Stop container
    docker stop tabsat

  • Remove container
    docker rm tabsat

  • Remove image
    docker rmi tabsat:v1
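
The commands above start the container without access to files on the host. One common way to make input data visible inside a container is a bind mount with docker's -v option. The sketch below only prints such a command; the function name tabsat_run_cmd, the host path, and the container path /data are assumptions for illustration, not locations that the TABSAT image is known to expect.

```shell
#!/usr/bin/env bash
# Sketch (assumption, not from the TABSAT documentation): print a `docker run`
# command that bind-mounts a host directory into the container.
# $1 = host directory, $2 = mount point in the container (hypothetical default /data).
tabsat_run_cmd() {
    local host_dir="$1" container_dir="${2:-/data}"
    printf 'docker run -t --name tabsat -d -v %s:%s tabsat:v1\n' \
        "$host_dir" "$container_dir"
}
```

For example, tabsat_run_cmd /home/user/fastq prints a command that mounts /home/user/fastq at /data. Paths containing spaces would need quoting before the printed command is executed.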