laurahspencer/DuMOAR

Perform DESeq2 analysis on RNASeq data

I ran a DESeq2 analysis on the gene count matrix that Giles produced with STAR, using all RNASeq data aligned to the genome (reads were not scrubbed of non-Arthropoda sequences). Here's my RMarkdown notebook: RNASeq.html. A quick summary:

I removed quite a few low-frequency genes:

  • No. genes before filtering: 43,800
  • No. genes remaining after pre-filtering: 7,075
  • No. of genes dropped: 36,725
  • % of fragments remaining after pre-filtering: 95.551%
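For anyone wanting to reproduce the bookkeeping behind these numbers, here is a minimal Python sketch of count-based pre-filtering and the summary stats. It is not the code from the notebook (which is in R): the toy counts, the `MIN_TOTAL` cutoff, and the function names `prefilter` and `summarize` are all made up for illustration, and the actual filtering rule used in the notebook may differ.

```python
# Toy count matrix: gene -> raw counts per sample (made-up numbers).
counts = {
    "gene_a": [120, 98, 143, 110],
    "gene_b": [0, 1, 0, 2],
    "gene_c": [15, 22, 9, 30],
    "gene_d": [0, 0, 3, 0],
}

# Hypothetical cutoff; the threshold used in the notebook is not stated in this issue.
MIN_TOTAL = 10


def prefilter(counts, min_total):
    """Keep genes whose summed count across all samples is at least min_total."""
    return {gene: c for gene, c in counts.items() if sum(c) >= min_total}


def summarize(counts, kept):
    """Report how many genes and what share of fragments survive filtering."""
    total_before = sum(sum(c) for c in counts.values())
    total_after = sum(sum(c) for c in kept.values())
    return {
        "genes_before": len(counts),
        "genes_kept": len(kept),
        "genes_dropped": len(counts) - len(kept),
        "pct_fragments_kept": 100.0 * total_after / total_before,
    }


kept = prefilter(counts, MIN_TOTAL)
summary = summarize(counts, kept)
print(summary)
```

The same pattern explains the numbers above: dropping many genes (36,725 of 43,800) can still retain most fragments (95.551%) when the dropped genes carry very few counts.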

Below is a PCA showing the two significant principal components. A perMANOVA test indicates that the treatments differ in multivariate space. There's also a heat map of the top 10% most variable genes (~700 genes). Differential expression analysis found a total of 40 DEGs, with no significantly enriched biological processes. I suspect that improving our alignment will increase the number of DEGs.

image

image
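As a rough illustration of how the "top 10% most variable genes" set for the heat map can be picked, here is a small Python sketch that ranks genes by variance across samples. The expression values and the helper name `top_variable_genes` are hypothetical, and I'm assuming variance is computed on already-transformed expression values; the notebook may rank genes differently.

```python
from statistics import variance


def top_variable_genes(expr, fraction=0.10):
    """Return the names of the most variable genes, highest variance first.

    expr maps gene name -> list of (transformed) expression values per sample.
    """
    n_top = max(1, int(len(expr) * fraction))
    ranked = sorted(expr, key=lambda gene: variance(expr[gene]), reverse=True)
    return ranked[:n_top]


# Made-up data: ten genes whose spread around 5.0 grows with the gene index,
# so g9 is the most variable and g0 is constant.
expr = {f"g{i}": [5.0, 5.0 + 0.1 * i, 5.0 - 0.1 * i, 5.0] for i in range(10)}

print(top_variable_genes(expr))        # top 10% of 10 genes = 1 gene
print(top_variable_genes(expr, 0.2))   # top 20% = 2 genes
```

With the 7,075 genes remaining after pre-filtering, a 10% cut gives roughly 700 genes, which matches the heat map.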

Here is a CSV file with DEG stats and annotation info. Of the 40 DEGs, 11 were annotated: DEGs.csv

Here are the gene names for the annotated DEGs (from DAVID):
image

NEW analysis approach:

  • @ggoetznoaa align all reads (NOT MEGAN-scrubbed) to the "original" Trinity de novo transcriptome, using ALL data (NOT MEGAN-scrubbed reads)
  • get @laurahspencer the gene count matrix
  • @laurahspencer BLAST the "original" transcriptome against the UniProt database
  • @laurahspencer run the gene count matrix through DEG analysis