pyRaceID

RaceID in python.

Possible pipeline

1. Quality Controls

Script: QC.py

Input: coutt, coutc and coutb tables. The scripts asks you for the root of the name of the tables, supposed to be the same for the three tables.

Checks total sum per cell per table (sum-b, sum-c, sum-t), total number of genes per cell (gene-t; includes MT and Spike), fraction of mitochondria genes (mtfrac-t), number of spike-in per cell (spikein-t), overexpression per cell (overexpr), fraction of SvdB stressed genes (SvdBfracsg).

Output: root-QC.txt (table: rows: parameters desribed above; columns: cells)

Data visualization: I have preference for gnuplot.
To produce violing plots of the data for a given column go into gnuplot terminal and type (more info here: http://gnuplot.sourceforge.net/demo_cvs/violinplot.html):

col = 2
set table $violin
pl 'root-QC.txt' us col:(1) smooth kdensity
unset table
pl $violin us 2:1 w l lw 2 lc 1 noti
repl $violin us (-$2):1 w l lw 2 lc 1 noti
repl 'root-QC.txt' us (0):col:(.5) with boxplot ti col

2. Filtering

Based on QC one can decide which cells to filter. There is no script, since it can be vary from set to set.

3. Normalization

Methods:

Library size:
TMM: (edgeR package)
DESeq: (DESeq package)
Downsampling:
Spikein:
Deconvolution:

anna-alemany/pyRaceID

pyRaceID

Possible pipeline

1. Quality Controls

2. Filtering

3. Normalization