/SCRABBLE

Single Cell RNA-Seq imputAtion constrained By BuLk RNAsEq data (SCRABBLE)

Primary LanguageMATLABGNU General Public License v3.0GPL-3.0

SCRABBLE

DOI

Single Cell RNA-Seq imputAtion constrained By BuLk RNAsEq data (SCRABBLE)

SCRABBLE has been implemented in R and MATLAB.

SCRABBLE imputes drop-out data by optimizing an objective function that consists of three terms. The first term ensures that imputed values for genes with nonzero expression remain as close to their original values as possible, thus minimizing unwanted bias towards expressed genes. The second term ensures the rank of the imputed data matrix to be as small as possible. The rationale is that we only expect a limited number of distinct cell types in the samples. The third term operates on the bulk RNA-Seq data. It ensures consistency between the average gene expression of the aggregated imputed data and the average gene expression of the bulk RNA-Seq data. We developed a convex optimization algorithm to minimize the objective function.

R Version

Install from CRAN

install.packages("SCRABBLE")

Install from Github

library(devtools)
install_github("software-github/SCRABBLE/R")

Install from source codes

Download source codes here and In R type:

install.packages(path_to_file, type = 'source', rep = NULL)

Where path_to_file would represent the full path and file name:

  • On Windows it will look something like this: "C:\Downloads\SCRABBLE.tar.gz".
  • On UNIX it will look like this: "~/Downloads/SCRABBLE.tar.gz".

Quick start

data_sc <- demo_data[[1]]
data_bulk <- demo_data[[2]]
data_true <- demo_data[[3]]

parameter <- c(1,1e-6,1e-4)

result <- scrabble(demo_data, parameter = parameter)

MATLAB Version

Quick start

Load the data

There are three datasets in the .mat file. There are the true data set, Drop-out data set, and the imputed data set by SCRABBLE.

load('demo_data.mat')

Prepare the data

We construct the data structure which is taken as one of the input of SCRABBLE.

data.data_sc = data_sc;
data.data_bulk = data_bulk;

Prepare the parameter for SCRABBLE

Set up the parameters used in example

parameter = [1,1e-6,1e-4];

Run SCRABBLE

dataRecovered = scrabble(data,parameter);

Visualize the results

gcf = figure(1);
set(gcf, 'Position', [100, 500, 1200, 300])
subplot(1,3,1)
imagesc(log10(data_true+1))
title('True Data')
axis off
subplot(1,3,2)
imagesc(log10(data_sc+1))
title('Drop-out Data')
axis off
subplot(1,3,3)
imagesc(log10(dataRecovered+1))
title('Imputed Data by SCRABBLE')
axis off

Help

Please feel free to contact Tao Peng (software.github@gmail.com) if you have any questions about the software.

Reference

Peng, Tao, et al. "SCRABBLE: single-cell RNA-seq imputation constrained by bulk RNA-seq data." Genome biology 20.1 (2019): 88.