
Interactive R package to quantify, analyse and visualise alternative splicing data


psichomics

Original article:

Nuno Saraiva-Agostinho and Nuno L. Barbosa-Morais (2018). psichomics: graphical application for alternative splicing quantification and analysis. Nucleic Acids Research.

Interactive R package with an intuitive Shiny-based graphical interface for alternative splicing quantification and integrative analyses of alternative splicing and gene expression based on The Cancer Genome Atlas (TCGA), the Genotype-Tissue Expression (GTEx) project, Sequence Read Archive (SRA) and user-provided data.

psichomics interactively performs survival, dimensionality reduction and median- and variance-based differential splicing and gene expression analyses that benefit from the incorporation of clinical and molecular sample-associated features (such as tumour stage or survival). Interactive visual access to genomic mapping and functional annotation of selected alternative splicing events is also included.

[Figure: differential splicing analysis in psichomics]


Install and start running

Bioconductor release

To install the package from Bioconductor, type the following in RStudio or in an R console:

install.packages("BiocManager")
BiocManager::install("psichomics")

GitHub version

To install and start using the GitHub version (which may be updated more frequently than its Bioconductor counterpart), follow these steps:

  1. Install R
  2. Depending on your operating system, install:
  3. Open RStudio or an R console
  4. Install Bioconductor with:
    • install.packages("BiocManager")
  5. Install, load and start the visual interface with:
install.packages("devtools")
devtools::install_github("nuno-agostinho/psichomics")
library(psichomics)
psichomics()

Running the latest versions of psichomics in R 3.2 or newer

If you prefer to run psichomics in an older R version (3.2 or newer), run the following commands. Note that the newest versions of psichomics have not been tested in older R versions, so some features may not be supported.

install.packages("devtools")
devtools::install_github("nuno-agostinho/psichomics", ref="R3.2")
library(psichomics)
psichomics()

Tutorials

The following case studies and tutorials are available and are based on our original article:

Data input

Download TCGA Data

Pre-processed data of given tumours of interest can be automatically downloaded from TCGA. Subject- and sample-associated information, junction quantification and gene expression data from TCGA are supported.

Load GTEx Data

GTEx data needs to be manually downloaded from the GTEx Portal. Subject- and sample-associated data, junction quantification and gene expression data from GTEx are supported.

Load SRA Data

Although only select SRA projects are available to be automatically downloaded (based on pre-processed data from the recount2 project), other SRA projects can be manually downloaded, aligned using a splice-aware aligner and loaded by the user, as per the instructions in Loading SRA and user-provided RNA-seq data. Sample-associated files from SRA are also supported.

Load user-provided files

User-provided files (including subject-associated data, sample-associated data, junction quantification, alternative splicing quantification and gene expression) can be loaded as per the instructions in Loading SRA and user-provided RNA-seq data.

Splicing quantification

The quantification of each alternative splicing event is based on the proportion of junction reads that support the inclusion isoform, known as percent spliced-in or PSI (Wang et al., 2008).

An estimate of this value is obtained from the proportion of reads supporting the inclusion of an exon over the reads supporting either the inclusion or the exclusion of that exon. Computing this estimate requires both an alternative splicing annotation and the quantification of RNA-seq reads aligning to splice junctions (junction quantification). While human alternative splicing annotation (hg19 and hg38 assemblies) is provided within the package, junction quantification may be provided by the user or retrieved from TCGA, GTEx and SRA.
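The estimate described above can be illustrated with a minimal sketch in base R for a single exon-skipping event. The function name `estimate_psi`, the junction labels, the read counts and the `min_reads` threshold are all hypothetical. Averaging the two inclusion junctions is an assumption made for illustration and is not necessarily how psichomics computes the value internally.

```r
# Made-up junction read counts for one exon-skipping event
# (rows = splice junctions, columns = samples)
junctions <- matrix(
    c(30, 10, 0,   # C1_A:  upstream constitutive exon -> alternative exon
      26, 14, 0,   # A_C2:  alternative exon -> downstream constitutive exon
      12, 36, 3),  # C1_C2: junction that skips the alternative exon
    nrow = 3, byrow = TRUE,
    dimnames = list(c("C1_A", "A_C2", "C1_C2"), c("S1", "S2", "S3")))

# PSI = inclusion reads / (inclusion reads + exclusion reads);
# samples with too few supporting reads are reported as NA
estimate_psi <- function(inclusion, exclusion, min_reads = 10) {
    total <- inclusion + exclusion
    psi   <- inclusion / total
    psi[total < min_reads] <- NA
    psi
}

# Assumption: inclusion support is the mean of the two inclusion junctions
inclusion <- colMeans(junctions[c("C1_A", "A_C2"), ])
exclusion <- junctions["C1_C2", ]

psi <- estimate_psi(inclusion, exclusion)
print(psi)
```

In this toy data, sample S3 has only three supporting reads in total, so its PSI is set to NA rather than reported as an unreliable 0.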

Gene expression processing

Gene expression can be normalised, filtered and log2-transformed in-app. Alternatively, users can provide their own pre-processed gene expression file.
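For readers who prefer to pre-process gene expression themselves, the three steps can be sketched in base R as follows. This is a simplified illustration with made-up counts: the `min_total` threshold, the counts-per-million scaling and the pseudocount of 1 are assumptions, not the exact procedure applied in-app.

```r
# Made-up raw read counts (rows = genes, columns = samples)
counts <- matrix(
    c(500, 300, 800,
        0,   1,   0,
      120, 200,  90,
      380, 499, 110),
    nrow = 4, byrow = TRUE,
    dimnames = list(paste0("gene", 1:4), c("S1", "S2", "S3")))

# 1. Filter: keep genes with at least `min_total` reads across all samples
min_total <- 10
filtered  <- counts[rowSums(counts) >= min_total, , drop = FALSE]

# 2. Normalise: scale each sample to counts per million (CPM)
lib_size <- colSums(filtered)
cpm      <- sweep(filtered, 2, lib_size, "/") * 1e6

# 3. Transform: log2 with a pseudocount of 1 to avoid log2(0)
log_cpm <- log2(cpm + 1)
print(round(log_cpm, 2))
```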

Data grouping

Molecular and clinical sample-associated attributes can be used to establish groups for exploration in data analyses. For instance, TCGA data can be analysed based on smoking history, gender and race, among other attributes. Groups can also be manipulated (e.g. merged or intersected), allowing for complex attribute combinations, and can be saved and loaded between sessions.
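Conceptually, a group is a named set of sample identifiers, so merging and intersecting groups correspond to set operations. The following base R sketch uses a made-up sample table. The column names, group names and the use of an RDS file for saving are assumptions for illustration, not the format used by psichomics.

```r
# Made-up sample-associated attributes
samples <- data.frame(
    id     = paste0("S", 1:6),
    smoker = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE),
    stage  = c("I", "II", "II", "III", "III", "I"),
    stringsAsFactors = FALSE)

# Groups are named vectors of sample identifiers
groups <- list(
    smokers   = samples$id[samples$smoker],
    stage_II  = samples$id[samples$stage == "II"],
    stage_III = samples$id[samples$stage == "III"])

# Merge two groups (union) and combine attributes (intersection)
groups$late_stage         <- union(groups$stage_II, groups$stage_III)
groups$late_stage_smokers <- intersect(groups$smokers, groups$late_stage)

# Save the groups and load them again, e.g. in a later session
path <- tempfile(fileext = ".rds")
saveRDS(groups, path)
restored <- readRDS(path)
print(restored$late_stage_smokers)
```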

Data Analyses

Dimensionality reduction

Perform principal and independent component analysis (PCA and ICA, respectively) on alternative splicing quantification and gene expression based on the previously created groups.
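As a rough illustration of the PCA step outside the graphical interface, the sketch below runs `prcomp` from base R on a simulated PSI matrix. The data, group labels and the choice to centre without scaling are assumptions for this example. ICA is not shown because it requires an additional package.

```r
set.seed(1)

# Simulated PSI matrix (rows = splicing events, columns = samples)
n_events <- 50
group <- rep(c("normal", "tumour"), each = 5)
psi <- matrix(runif(n_events * 10), nrow = n_events,
              dimnames = list(paste0("event", 1:n_events),
                              paste0("S", 1:10)))

# Shift the first 10 events in tumour samples so the groups differ
is_tumour <- group == "tumour"
psi[1:10, is_tumour] <- pmin(psi[1:10, is_tumour] + 0.5, 1)

# PCA expects observations (samples) in rows, hence the transpose
pca <- prcomp(t(psi), center = TRUE, scale. = FALSE)

# Proportion of variance explained by each principal component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Sample scores on the first two components, labelled by group
scores <- data.frame(group = group, pca$x[, 1:2])
print(scores)
```

Plotting `scores$PC1` against `scores$PC2` and colouring the points by `group` gives the usual PCA score plot.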

Differential splicing and gene expression analysis

Analyse alternative splicing quantification (based on variance and median statistical tests) and gene expression data based on the previously created groups.
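To make the idea of median- and variance-based tests concrete, here is a minimal sketch using two tests available in base R: the Wilcoxon rank-sum test (difference in location) and the Fligner-Killeen test (difference in variance). The PSI values are simulated, the helper `test_event` is hypothetical, and the selection of tests is an assumption for illustration rather than a complete list of what psichomics offers.

```r
set.seed(42)
group <- factor(rep(c("normal", "tumour"), each = 20))

# Simulated PSI values for two events (rows) across 40 samples (columns)
psi <- rbind(
    event1 = c(rbeta(20, 8, 2), rbeta(20, 2, 8)),  # clear median shift
    event2 = c(rbeta(20, 5, 5), rbeta(20, 5, 5)))  # same distribution

# Hypothetical helper: summary statistics and tests for one event
test_event <- function(values) {
    c(delta_median = unname(diff(tapply(values, group, median))),
      wilcoxon_p   = wilcox.test(values ~ group)$p.value,
      fligner_p    = fligner.test(values ~ group)$p.value)
}

results <- as.data.frame(t(apply(psi, 1, test_event)))

# Adjust for multiple testing (Benjamini-Hochberg)
results$wilcoxon_q <- p.adjust(results$wilcoxon_p, method = "BH")
print(results)
```

Here `delta_median` is the tumour median minus the normal median, following the order of the factor levels.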

Correlation between gene expression and splicing quantification

Test the correlation between the expression of a specific gene and the alternative splicing quantification of selected alternative splicing events.
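Such a test can be sketched with `cor.test` from base R. The data below are simulated so that PSI increases with gene expression. Using Spearman's rank correlation is an assumption for this example, and other methods (e.g. Pearson) can be substituted through the `method` argument.

```r
set.seed(7)

# Simulated log2 expression of one gene across 30 samples
gene_expression <- rnorm(30, mean = 8, sd = 1)

# Simulated PSI of one event, constructed to rise with expression;
# plogis() maps the values into the (0, 1) range
psi <- plogis(1.5 * (gene_expression - 8) + rnorm(30, sd = 0.5))

res <- cor.test(gene_expression, psi, method = "spearman")
rho <- unname(res$estimate)
print(c(rho = rho, p_value = res$p.value))
```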

Survival analysis

Plot Kaplan-Meier curves and fit Cox models based on sample-associated features. Additionally, study the impact of a splicing event (based on its quantification) or a gene (based on its expression) on patient survival.
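The same kind of analysis can be sketched in plain R with the survival package, which is a recommended package included in most R installations. The data are simulated, and splitting samples at the median PSI is an assumption for this example, since any cutoff could be used.

```r
library(survival)

set.seed(3)
n <- 60

# Simulated PSI of one event; higher PSI means higher hazard here
psi    <- runif(n)
time   <- rexp(n, rate = 0.05 * exp(1.5 * psi))  # follow-up time
status <- rbinom(n, 1, 0.8)                      # 1 = event, 0 = censored

# Split samples into two groups at the median PSI (assumed cutoff)
psi_group <- factor(ifelse(psi >= median(psi), "high PSI", "low PSI"),
                    levels = c("low PSI", "high PSI"))

# Kaplan-Meier estimate per group and log-rank test between groups
km      <- survfit(Surv(time, status) ~ psi_group)
logrank <- survdiff(Surv(time, status) ~ psi_group)

# Cox proportional hazards model with PSI as a continuous covariate
cox <- coxph(Surv(time, status) ~ psi)
print(summary(cox)$coefficients)
```

Calling `plot(km)` draws the Kaplan-Meier curves for the two groups.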

Gene, transcript and protein information

Examine the annotation and corresponding transcripts and proteins for a gene of interest. Relevant research articles are also presented here.

Feedback and support

All feedback on the program, documentation and associated material is welcome. Please send any suggestions and comments to:

Nuno Saraiva-Agostinho (nunoagostinho@medicina.ulisboa.pt)

Disease Transcriptomics Lab, Instituto de Medicina Molecular (Portugal)

Contributions

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

References

Wang, E. T., R. Sandberg, S. Luo, I. Khrebtukova, L. Zhang, C. Mayr, S. F. Kingsmore, G. P. Schroth, and C. B. Burge. 2008. Alternative isoform regulation in human tissue transcriptomes. Nature 456 (7221): 470–76.