RNAlysis: A Python repository from lbgbox

What is RNAlysis?

RNAlysis is a Python-based software for analyzing RNA sequencing data. RNAlysis allows you to build customized analysis pipelines suiting your specific research questions, going all the way from exploratory data analysis and data visualization through clustering analysis and gene-set enrichment analysis.

What can I do with RNAlysis?

Filter your gene expression matrices, differential expression tables, fold change data, and tabular data in general.
Normalize your gene expression matrices
Visualise, explore and describe your sequencing data
Find global relationships between sample expression profiles with clustering analyses and dimensionality reduction
Create and share analysis pipelines
Perform enrichment analysis with pre-determined Gene Ontology terms/KEGG pathways, or with user-defined attributes
Perform enrichment analysis on a single ranked list of genes, instead of a test set and a background set

To get an overview of what RNAlysis can do, read the tutorial and the user guide.

RNAlysis supports gene expression matrices and differential expression tables in general, and integrates in particular with Python's HTSeq-count and R's DESeq2.

How do I install it?

You can install RNAlysis via PyPI.

To install the full version of RNAlysis (includes additional features that might not work out-of-the-box on all machines), you should first install Microsoft Visual C++ 14.0 or greater (on Windows computers only), GraphViz, R, and kallisto. Then use the following command in your terminal window:

pip install RNAlysis[all]

To install the basic version of RNAlysis, use the following command in your terminal window:

pip install RNAlysis

You can also install RNAlysis with only some of the additional features listed below:

fastq - adapter trimming and RNA-seq transcript quantification of FASTQ files
hdbscan - clustering analysis using the HDBSCAN method
single-set - single-set enrichment analysis using the XL-mHG test
randomization - improved performance for randomization tests

To do so, call the install command with one or more feature names inside the square brackets, separated by commas. For example:

pip install RNAlysis[fastq,single-set]

This command installs the basic version of RNAlysis, along with the fastq and single-set additional features.
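The bracket syntax above can also be assembled programmatically, for example when writing a setup script. The following is a minimal sketch only: `build_install_command` and `KNOWN_EXTRAS` are hypothetical names that are not part of RNAlysis, and the feature names are simply the ones listed in this section. The package specifier is wrapped in quotes because some shells (zsh, for instance) treat unquoted square brackets as glob patterns.

```python
# Hypothetical helper, not part of RNAlysis: builds the pip command
# for a chosen set of optional features ("extras").
KNOWN_EXTRAS = ("all", "fastq", "hdbscan", "single-set", "randomization")


def build_install_command(extras=()):
    """Return the pip command string for RNAlysis with the given extras."""
    extras = list(extras)
    unknown = [name for name in extras if name not in KNOWN_EXTRAS]
    if unknown:
        raise ValueError(f"Unknown additional features: {unknown}")
    if not extras:
        # No extras requested: install the basic version.
        return "pip install RNAlysis"
    # Quote the specifier so the shell does not interpret the brackets.
    return 'pip install "RNAlysis[' + ",".join(extras) + ']"'


print(build_install_command(["fastq", "single-set"]))
```

Calling `build_install_command()` with no arguments returns the basic install command, and an unrecognized feature name raises a `ValueError` instead of producing a command that pip would reject.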

How do I use it?

You can launch the RNAlysis graphical interface by typing the following command in your terminal window:

rnalysis-gui

Or through a Python console:

>>> from rnalysis import gui
>>> gui.run_gui()
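If you launch the interface from a script rather than an interactive console, it can help to check first that the package is importable. The sketch below is illustrative only: `module_is_installed` and `launch_gui_if_available` are hypothetical helper names, and the only RNAlysis call used is the `gui.run_gui()` function shown above.

```python
import importlib.util


def module_is_installed(name):
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(name) is not None


def launch_gui_if_available():
    """Launch the RNAlysis interface, or print an install hint if it is missing."""
    if not module_is_installed("rnalysis"):
        print("RNAlysis is not installed. Try: pip install RNAlysis")
        return False
    # Import lazily so this script still loads when RNAlysis is absent.
    from rnalysis import gui
    gui.run_gui()
    return True
```

Call `launch_gui_if_available()` at the end of your script to open the interface when RNAlysis is present.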

Alternatively, you can write Python code that uses RNAlysis functions as described in the user guide.

Dependencies

All of RNAlysis's dependencies can be installed automatically via PyPI.

Credits

How do I cite RNAlysis?

If you use RNAlysis in your research, please cite:

Teichman, G., Cohen, D., Ganon, O., Dunsky, N., Shani, S., Gingold, H., and Rechavi, O. (2022).
RNAlysis: analyze your RNA sequencing data without writing a single line of code. BioRxiv 2022.11.25.517851.
https://doi.org/10.1101/2022.11.25.517851

If you use the Cutadapt adapter trimming tool in your research, please cite:

Martin, M. (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads.
EMBnet.journal, 17(1), pp. 10-12.
https://doi.org/10.14806/ej.17.1.200

If you use the kallisto RNA sequencing quantification tool in your research, please cite:

Bray, N., Pimentel, H., Melsted, P. et al.
Near-optimal probabilistic RNA-seq quantification.
Nat Biotechnol 34, 525–527 (2016).
https://doi.org/10.1038/nbt.3519

If you use the DESeq2 differential expression tool in your research, please cite:

Love MI, Huber W, Anders S (2014).
“Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.”
Genome Biology, 15, 550.
https://doi.org/10.1186/s13059-014-0550-8

If you use the HDBSCAN clustering feature in your research, please cite:

McInnes, L., Healy, J., and Astels, S. (2017).
hdbscan: Hierarchical density based clustering.
Journal of Open Source Software, 2(11), 205.
https://doi.org/10.21105/joss.00205

If you use the XL-mHG single-set enrichment test in your research, please cite:

Eden, E., Lipson, D., Yogev, S., and Yakhini, Z. (2007).
Discovering Motifs in Ranked Lists of DNA Sequences. PLOS Comput. Biol. 3, e39.
https://doi.org/10.1371/journal.pcbi.0030039

Wagner, F. (2017). The XL-mHG test for gene set enrichment. ArXiv.
https://doi.org/10.48550/arXiv.1507.07905

Development Lead

Guy Teichman: guyteichman@gmail.com

Contributors

Dror Cohen
Or Ganon
Netta Dunsky
Shachar Shani

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
