Perform DESeq2 analysis on RNASeq data
Ran a DESeq2 analysis on the gene count matrix that Giles produced using STAR, with all RNASeq data aligned to the genome (not scrubbed of non-Arthropoda reads). Here's my RMarkdown notebook: RNASeq.html. A quick summary:
I removed quite a few low-frequency genes:
- No. genes before filtering: 43,800
- No. genes remaining after pre-filtering: 7,075
- No. of genes dropped: 36,725
- % of fragments remaining after pre-filtering: 95.551%
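For anyone who wants to reproduce these four summary numbers on a different count matrix, here is a minimal Python sketch of the bookkeeping. The cutoff (`MIN_TOTAL`), the function names, and the toy counts are all assumptions for illustration; the actual pre-filtering criterion is whatever the notebook uses and may differ.

```python
# Toy count matrix: gene ID -> counts per sample (made-up values).
counts = {
    "gene_a": [120, 98, 143, 110],
    "gene_b": [0, 1, 0, 2],
    "gene_c": [15, 22, 9, 30],
    "gene_d": [0, 0, 0, 0],
}

MIN_TOTAL = 10  # hypothetical cutoff; the notebook's actual criterion may differ


def prefilter(counts, min_total):
    """Keep genes whose summed count across samples is at least min_total."""
    return {gene: c for gene, c in counts.items() if sum(c) >= min_total}


def summarize(before, after):
    """Compute gene and fragment totals before and after filtering."""
    frags_before = sum(sum(c) for c in before.values())
    frags_after = sum(sum(c) for c in after.values())
    return {
        "genes_before": len(before),
        "genes_after": len(after),
        "genes_dropped": len(before) - len(after),
        "pct_fragments_remaining": 100.0 * frags_after / frags_before,
    }


kept = prefilter(counts, MIN_TOTAL)
summary = summarize(counts, kept)
print(summary)
```

The point of reporting both gene and fragment percentages is that dropping most genes can still retain nearly all fragments, as in the numbers above.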
Below is a PCA of the two significant principal components. A perMANOVA test indicates that the treatments differ in multivariate space. There's also a heat map of the top 10% most variable genes (~700 genes). Differential expression analysis found a total of 40 differentially expressed genes (DEGs), but no significantly enriched biological processes. I suspect that improving our alignment will increase the number of DEGs.
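As a rough illustration of how a "top 10% most variable genes" set can be picked for a heat map, here is a small Python sketch. It ranks genes by sample variance and keeps the top fraction. The function name, the toy data, and the use of plain variance are assumptions; in practice the ranking is usually done on transformed (e.g. variance-stabilized) counts, and the notebook may do it differently. With 7,075 genes, 10% works out to roughly 700, which matches the number quoted above.

```python
import statistics


def top_variable_genes(expr, fraction=0.10):
    """Return the IDs of the most variable genes, most variable first."""
    n_keep = max(1, round(len(expr) * fraction))
    ranked = sorted(expr, key=lambda g: statistics.variance(expr[g]), reverse=True)
    return ranked[:n_keep]


# Toy data: 20 genes x 4 samples; the spread of values grows with the gene index.
expr = {
    f"gene_{i:02d}": [10.0, 10.0 + i, 10.0 - i, 10.0 + 2 * i]
    for i in range(20)
}

selected = top_variable_genes(expr)
print(selected)
```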
Here is a CSV file with DEG stats and annotation info: DEGs.csv. Of the 40 DEGs, 11 were annotated.
Here are the gene names for the annotated DEGs (from DAVID):
NEW analysis approach:
- @ggoetznoaa align all reads (NOT MEGAN-scrubbed) to the "original" Trinity de novo transcriptome, i.e. the one built using ALL data (NOT MEGAN-scrubbed reads)
- get @laurahspencer the gene count matrix
- @laurahspencer BLAST the "original" transcriptome against the UniProt database
- @laurahspencer run the gene count matrix through DEG analysis