Ribo-Seq Data Processing

Introduction

[Describe here what this pipeline does]

Requirements

This pipeline can be run using any of the following container or environment methods:

Method	Instructions
Singularity	docs.sylabs.io
Docker	docs.docker.com
Conda	docs.conda.io

Setup

Singularity

sudo singularity build singularity/pipeline Singularity

Then, because the singularity profile specifies container = 'singularity/pipeline', use the following to execute:

nextflow run main.nf -profile singularity

Docker

docker build . -t pipeline-image

Then, because the docker profile specifies container = 'pipeline-image:latest', use the following to execute:

nextflow run main.nf -profile docker

Conda

Create a conda definition YAML file (e.g. here), then use the following to execute:

nextflow run main.nf -profile conda

Usage

Call the pipeline directly

nextflow run main.nf

Run with all the frills

bash scripts/run-w-frills <params-file> <profile name from nextflow.config>

Example

bash scripts/run-w-frills example_parameters.yml standard
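For scripted or batch runs, the same kind of invocation can be assembled programmatically. The following is a minimal sketch, not the contents of scripts/run-w-frills: the function name build_nextflow_command is hypothetical, and it assumes the parameters file is passed to Nextflow with its -params-file option and the profile with -profile.

```python
import shlex


def build_nextflow_command(params_file, profile, main_script="main.nf"):
    """Return the argument list for a Nextflow run (hypothetical helper).

    params_file: path to a YAML/JSON parameters file.
    profile:     a profile name defined in nextflow.config.
    """
    return [
        "nextflow", "run", main_script,
        "-params-file", params_file,
        "-profile", profile,
    ]


if __name__ == "__main__":
    command = build_nextflow_command("example_parameters.yml", "standard")
    # Print the command rather than executing it; pass the list to
    # subprocess.run(command, check=True) to actually launch the run.
    print(shlex.join(command))
```

Returning a list rather than a single string keeps the command safe to hand to subprocess without shell quoting problems.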

Data Processing For RiboSeq.org

Automated processing of Ribo-Seq (and associated RNA-Seq) data for GWIPS-Viz and TRIPS-Viz

About Riboseq.org

Riboseq.org is a set of resources for the analysis and visualisation of publicly available ribosome profiling data, produced and maintained by members of the LAPTI lab in the School of Biochemistry and Cell Biology at University College Cork. These resources are documented in their respective publications:

GWIPS-Viz
- GWIPS-viz: 2018 update (2018). Nucleic Acids Res
- The GWIPS-viz Browser (2018). Current Protocols in Bioinformatics
- GWIPS-viz as a tool for exploring ribosome profiling evidence supporting the synthesis of alternative proteoforms (2015). Proteomics
- GWIPS-viz: development of a ribo-seq genome browser (2014). Nucleic Acids Res
Trips-Viz
- Trips-Viz: a transcriptome browser for exploring Ribo-Seq data (2019). Nucleic Acids Res
RiboGalaxy
- RiboGalaxy: a browser based platform for the alignment, analysis and visualization of ribosome profiling data. RNA Biology
- Note: RiboGalaxy is currently being updated and functionality will be restored shortly (14-2-2022)

Requirements

Outline

1. Produce Database Of All Available Ribosome Profiling Studies
2. Gather Metadata
3. Fetch Files and Infer Gaps in Metadata
4. Run Pipeline
5. Upload to GWIPS & TRIPS

1. Produce Database Of All Available Ribosome Profiling Studies

In recent years the rate at which ribosome profiling studies are published has steadily increased. When the riboseq.org resources were initially developed, the number of available Ribo-Seq datasets was small enough to manage by manual inclusion. Here we put in place a method that records the details of relevant ribosome profiling data deposited in GEO.

Initially, manual searching of GEO and SRA was used along with ARGEOS. The outputs of each of these methods were collated to find the set of unique datasets.
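The collation step can be sketched as a merge of accession lists that also records which search method found each dataset. This is an illustrative sketch only: the function name, the method labels, and the placeholder accessions are assumptions, not the repository's actual code or data.

```python
from collections import defaultdict


def collate_accessions(sources):
    """Merge accession lists from several search methods (hypothetical helper).

    sources: dict mapping a method label to an iterable of accession strings.
    Returns a dict mapping each unique accession to the sorted list of
    methods that reported it.
    """
    found_by = defaultdict(set)
    for method, accessions in sources.items():
        for accession in accessions:
            key = accession.strip().upper()  # normalise whitespace and case
            if key:
                found_by[key].add(method)
    return {acc: sorted(methods) for acc, methods in sorted(found_by.items())}


if __name__ == "__main__":
    # Placeholder accessions, not real studies.
    example = {
        "geo_manual": ["GSE000001", "gse000002 "],
        "sra_manual": ["GSE000002"],
        "argeos": ["GSE000001", "GSE000003"],
    }
    for accession, methods in collate_accessions(example).items():
        print(accession, ",".join(methods))
```

Keeping the provenance of each accession makes it easy to see which datasets only one method recovered.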

2. Gather Metadata

GEO and SRA run tables contain valuable metadata that may be important for the processing and cataloguing of the datasets. In this step we use Python scripts to glean what we can from the information available.
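As a rough illustration of this step, a run table exported as CSV can be reduced to the fields of interest, with absent or empty values recorded as None so that later stages know what still needs to be inferred. The column names below (Run, Organism, LibraryLayout, cell_type, treatment) and the function name are assumptions for the sketch; real run tables vary between studies.

```python
import csv
import io

# Assumed column names; adjust to the run table actually downloaded.
FIELDS_OF_INTEREST = ("Run", "Organism", "LibraryLayout", "cell_type", "treatment")


def summarise_run_table(handle, fields=FIELDS_OF_INTEREST):
    """Read a CSV run table and keep only the requested fields.

    Missing columns and empty cells are both stored as None.
    """
    rows = []
    for record in csv.DictReader(handle):
        row = {}
        for field in fields:
            value = (record.get(field) or "").strip()
            row[field] = value or None
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Placeholder table contents, not a real study.
    table = io.StringIO(
        "Run,Organism,LibraryLayout,cell_type\n"
        "SRR0000001,Homo sapiens,SINGLE,\n"
    )
    for row in summarise_run_table(table):
        print(row)
```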

3. Fetch Files and Infer Gaps in Metadata

A common problem with reprocessing data for these resources is that the data are deposited in GEO and SRA with inconsistent metadata. In this stage of the process we carry out a number of steps to check for the relevant information in the provided metadata and, where it is absent, infer it from the data itself. This applies to information such as cell type and treatment, but also to UMI position and adapter position/sequence.
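One way to infer a missing adapter sequence from the reads themselves is to count how often the start of each candidate adapter appears in a sample of reads and accept the most frequent candidate if it passes a threshold. The sketch below assumes this approach; it is not necessarily how the repository does it, and the function name, candidate list, seed length, and threshold are all illustrative choices.

```python
def infer_adapter(reads, candidates, min_fraction=0.5, seed_length=10):
    """Guess which candidate adapter is present in a sample of reads.

    reads:        iterable of read sequences (strings).
    candidates:   dict mapping a label to an adapter sequence.
    min_fraction: minimum fraction of reads that must contain the seed.
    seed_length:  number of leading adapter bases searched for.
    Returns the label of the best candidate, or None if none qualifies.
    """
    counts = {name: 0 for name in candidates}
    total = 0
    for read in reads:
        total += 1
        for name, adapter in candidates.items():
            if adapter[:seed_length] in read:
                counts[name] += 1
    if total == 0 or not counts:
        return None
    best = max(counts, key=counts.get)
    return best if counts[best] / total >= min_fraction else None


if __name__ == "__main__":
    # Illustrative candidates; extend with whatever adapters are expected.
    candidates = {
        "illumina_universal": "AGATCGGAAGAGC",
        "cloning_linker": "CTGTAGGCACCATCAAT",
    }
    sample = [
        "ACGTACGTACGTACGTACGTACGTACGAGATCGGAAGAGCACACG",
        "TTGCATGCATGCATGCATGCATGCATGCAGATCGGAAGAGCACAC",
    ]
    print(infer_adapter(sample, candidates))
```

A substring search on a short seed is crude (it ignores sequencing errors), so a production check would likely rely on a dedicated trimming tool instead.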

4. Run pipeline

In this stage we use Nextflow to process the fetched reads following the schema below.

5. Upload to GWIPS and TRIPS

This stage uses the metadata to upload the processed files to the web resources in an automated fashion.
