/data-rnaseq-DmelIrbit

Data package containing transcriptomic profiles of IRBIT mutant fruit flies

Sources

  • Experimental data were generated by Arnaoutov et al. (2020)
    • Arnaoutov A, Lee H, Plevock Haase K, Aksenova V et al. IRBIT Directs Differentiation of Intestinal Stem Cell Progeny to Maintain Tissue Homeostasis. iScience 2020 Mar 27;23(3):100954. PMID: 32179478
    • GEO data set: GSE109862
  • Processing:
    • Sequencing reads were downloaded from SRA (BioProject PRJNA432208)
    • FASTQ files were checked for adapter content using FastQC (no adapter sequences were found)
    • Reads were aligned to the Drosophila melanogaster genome from Ensembl (Dmel_BDGP6.28) plus 92 ERCC sequences using STAR 2.7.1a, then quantified with RSEM to obtain abundance estimates at the gene and transcript levels.

Usage

Install the package, attach the library, and load the data set:

devtools::install_github('ttdtrang/data-rnaseq-DmelIrbit')
library(data.rnaseq.DmelIrbit)
data(dmelirbit.rnaseq.gene)
dim(dmelirbit.rnaseq.gene@assayData$exprs)

The package includes 2 data sets: one with transcript-level counts and TPM, the other with gene-level counts and TPM. Counts are the non-integer expected_count estimates reported by RSEM.
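
Because the expected_count values are fractional, downstream tools that expect integer counts may need them rounded first. The sketch below shows one way to do that on a toy matrix; the gene and sample names are made up for illustration and are not taken from the package.

```r
# Toy matrix standing in for RSEM expected_count values
# (gene and sample names are hypothetical, not from the package).
expected_count <- matrix(
  c(10.37, 0.00, 250.81,
     3.46, 1.20,  99.99),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("sample1", "sample2"))
)

# Round to the nearest whole number, then store as integer type.
# Changing storage.mode keeps the dimensions and the row/column names.
int_count <- round(expected_count)
storage.mode(int_count) <- "integer"

print(int_count)
```

Whether rounding is appropriate depends on the downstream analysis; the TPM values should be left as they are.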

Steps to reproduce data curation

  1. cd data-raw
  2. Download all necessary raw data files, which include:
     1.2M	data-raw/feature_attrs.rsem.transcripts.tsv
     6.2M	data-raw/matrix.gene.expected_count.RDS
     6.6M	data-raw/matrix.gene.tpm.RDS
     13M	data-raw/matrix.transcripts.expected_count.RDS
     11M	data-raw/matrix.transcripts.tpm.RDS
     64K	data-raw/PRJNA432208_metadata_cleaned.tsv
     48K	data-raw/starLog.final.tsv
  3. Set the environment variable DBDIR to point to the path containing said files
  4. Run the R notebook make-data-package.Rmd to assemble parts into ExpressionSet objects.
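
Before running the notebook, it can help to confirm that DBDIR is set and that every listed file is present. The following is a minimal sketch using base R only; `check_raw_files()` is a hypothetical helper that is not part of the package, and pointing DBDIR at the current directory assumes you are already inside data-raw.

```r
# File names taken from the listing above.
required_files <- c(
  "feature_attrs.rsem.transcripts.tsv",
  "matrix.gene.expected_count.RDS",
  "matrix.gene.tpm.RDS",
  "matrix.transcripts.expected_count.RDS",
  "matrix.transcripts.tpm.RDS",
  "PRJNA432208_metadata_cleaned.tsv",
  "starLog.final.tsv"
)

# check_raw_files() is a hypothetical helper, not part of the package.
# It returns the names of the required files that are missing from `dbdir`.
check_raw_files <- function(dbdir = Sys.getenv("DBDIR"), files = required_files) {
  if (!nzchar(dbdir)) stop("DBDIR is not set")
  files[!file.exists(file.path(dbdir, files))]
}

# Example: point DBDIR at the current directory (assumed to be data-raw)
# and report anything that still needs to be downloaded.
Sys.setenv(DBDIR = ".")
missing_files <- check_raw_files()
if (length(missing_files) > 0) {
  message("Missing from DBDIR: ", paste(missing_files, collapse = ", "))
}
```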

You may need to change some code chunk settings from eval=FALSE to eval=TRUE so that all chunks are run. These chunks are disabled by default to avoid overwriting existing data files in the folder.
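
To see which chunks are currently disabled, the notebook can be scanned for chunk headers that set eval=FALSE. This is a rough sketch in base R; `find_disabled_chunks()` is a hypothetical helper that is not part of the package, and it only inspects chunk headers, not options set elsewhere (for example through knitr::opts_chunk).

```r
# find_disabled_chunks() is a hypothetical helper, not part of the package.
# It returns the line number and header text of every chunk whose header
# sets eval=FALSE (or eval=F), allowing spaces around the equals sign.
find_disabled_chunks <- function(rmd_path) {
  lines <- readLines(rmd_path, warn = FALSE)
  pattern <- "^\\s*`{3}\\s*\\{.*\\beval\\s*=\\s*(FALSE|F)\\b"
  hits <- grep(pattern, lines, perl = TRUE)
  data.frame(line = hits, header = lines[hits], stringsAsFactors = FALSE)
}

# Example: report disabled chunks if the notebook is in the working directory.
if (file.exists("make-data-package.Rmd")) {
  print(find_disabled_chunks("make-data-package.Rmd"))
}
```

Each reported header can then be edited by hand to eval=TRUE for the chunks you actually want to run.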