bcbio/bcbioR

Feedback in development branch

Notes about bcbioR process https://github.com/bcbio/bcbioR

  • The “Set base project” and “Set RNAseq report folder” code do the same thing (both set up the file structure and Rmd), which is confusing
  • The link to the vignette is missing a quotation mark (“), so it does not work

Steps followed (based on ReadMe)

  • Skipped everything before “Downstream analysis” because Emma already did this
  • Modify “information.R”
  • Modify “QC/QC_nf-core.Rmd” and “params_qc_nf-core.R” because we are running nf-core output template
  • Make sure all the libraries listed in “QC_nf-core.Rmd” are installed
  • Knit “QC_nf-core.Rmd”
  • Create additional plots / colors / descriptions in QC report as needed

In QC_nf-core.Rmd

  • “load_metadata” chunk removes fastq and strandedness columns twice
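
For the “make sure all the libraries listed in QC_nf-core.Rmd are installed” step, a quick check along these lines might help. This is only a sketch: find_missing_packages() is a hypothetical helper (not part of bcbioR), and it only catches plain library()/require() calls.

```r
# Hypothetical helper: collect packages loaded via library()/require()
# in a set of lines and return the ones that are not installed.
find_missing_packages <- function(lines) {
  # Matches library(pkg), library("pkg"), require(pkg), ...
  pattern <- "(?:library|require)\\(\\s*[\"']?([A-Za-z][A-Za-z0-9.]*)[\"']?"
  calls <- unlist(regmatches(lines, gregexpr(pattern, lines, perl = TRUE)))
  pkgs <- unique(sub(pattern, "\\1", calls, perl = TRUE))
  # requireNamespace() returns FALSE (no error) when a package is missing
  pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
}

# Toy input; in practice something like
#   find_missing_packages(readLines("QC/QC_nf-core.Rmd"))
example_lines <- c("library(stats)", "library(\"utils\")", "library(notARealPkg123)")
missing_pkgs <- find_missing_packages(example_lines)
print(missing_pkgs)
```

The same check could be run on “DEG.Rmd” before knitting, which would also flag a package like apeglm if it were listed there.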

DE

  • Modify “params_de.R”
  • Make sure all libraries listed in “DEG.Rmd” are installed
  • Make sure O2 is mounted
  • Knit “DEG.Rmd”
  • Create additional comparisons / plots in DEG report as needed

Dropbox

Copy to reports/QC:

  • “bcbio-se.rds”
  • “tximport-counts.csv”
  • “Rmd/R/html/figures”

Copy to reports/DE:

  • Normalized counts for all genes x all samples to reports/DE (csv)
  • DESeq2 results for all genes with annotation columns (csv)
  • Significant genes results file (subset of full results by p-value and LFC)
  • Significant genes results file with columns containing normalized count values for each sample
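
As an illustration of the last two files in this list, here is a minimal base R sketch. The data are made up, the cutoffs (padj < 0.05, |LFC| >= 1) are assumptions rather than the template's settings, and the output file name is hypothetical. The column names padj and log2FoldChange follow DESeq2's results table.

```r
# Toy DESeq2-style results (values invented for illustration)
res <- data.frame(
  gene_id = c("g1", "g2", "g3", "g4"),
  log2FoldChange = c(2.1, -0.2, -1.8, 0.9),
  padj = c(0.001, 0.20, 0.03, NA),
  stringsAsFactors = FALSE
)

# Toy normalized counts, one column per sample
norm_counts <- data.frame(
  gene_id = c("g1", "g2", "g3", "g4"),
  sampleA = c(120.5, 30.2, 5.1, 0),
  sampleB = c(20.3, 28.9, 44.0, 1.2),
  stringsAsFactors = FALSE
)

padj_cutoff <- 0.05  # assumed threshold
lfc_cutoff <- 1      # assumed threshold

# Subset of the full results by adjusted p-value and LFC (NA padj is dropped)
is_sig <- !is.na(res$padj) &
  res$padj < padj_cutoff &
  abs(res$log2FoldChange) >= lfc_cutoff
sig <- res[is_sig, ]

# Same subset with the normalized count columns attached
sig_with_counts <- merge(sig, norm_counts, by = "gene_id")
print(sig_with_counts)

# write.csv(sig_with_counts, "sig_genes_with_counts.csv", row.names = FALSE)
```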

GitHub:

Copy to QC:

  • All R and Rmd files

Copy to DE:

  • All R and Rmd files

End of Readme

Some typos

  • In “source_params” chunk, “manually” is misspelled
  • In “prepare metrics” chunk, “General Table of MultiQC” is misspelled
  • In “Read Metrics” section, all plots have “size=4” in geom_point call except for “tRNA/rRNA mapping rate”
  • In “PCA” section, the code indicates that PCs 1-5 should be assessed, but only looks at 1-4
  • Also, would be nice to give examples/options here for adding additional coloring to PCA plots (other metadata besides factor of interest)
  • Colors not consistent between heat map and other figures (everything else is yellow/blue, heat map is grey/black)
  • In “plot_genes_detected” chunk, plot titles and axes need to be updated to be “# genes” not “# reads”
  • When I use scale_color_cb_friendly() for 6 subjects, 2 of the output colors are almost identical. It's a very-slightly-blue grey vs grey
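
One way to spot near-identical colors like the two greys above is to compare the palette's hex codes pairwise. The sketch below uses placeholder hex codes (not the actual values returned by scale_color_cb_friendly()) and an arbitrary distance threshold. Euclidean distance in RGB is only a rough proxy for how different two colors look.

```r
# Placeholder palette; NOT the real bcbioR colors
palette_hex <- c("#E69F00", "#56B4E9", "#999999", "#9A9AA5", "#009E73", "#D55E00")

# One row per color, columns red/green/blue (0-255)
rgb_matrix <- t(col2rgb(palette_hex))
rownames(rgb_matrix) <- palette_hex

# Pairwise Euclidean distances between colors
distances <- as.matrix(dist(rgb_matrix))

# Flag pairs closer than an arbitrary threshold (upper triangle only,
# so each pair is reported once)
threshold <- 30
close_pairs <- which(distances < threshold & upper.tri(distances), arr.ind = TRUE)

near_duplicates <- data.frame(
  color_1 = palette_hex[close_pairs[, "row"]],
  color_2 = palette_hex[close_pairs[, "col"]],
  distance = distances[close_pairs]
)
print(near_duplicates)
```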

In DEG.Rmd:

  • [ ] Error in lfcShrink(de, coef = coef, type = "apeglm") : type='apeglm' requires installing the Bioconductor package 'apeglm'. Need to add this package to the libraries at the top

  • “volcano_plot” chunk:
    • Title should be drawn from comparison params, not hardcoded to be “tumor vs normal”
    • Colors on the plot do not work (do not match up with what is stated in paragraph above)
    • Why is pCutoff set to 1.345719e-03? Where does this number come from? It is neither 0.05 nor -log10(0.05) (about 1.30)
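
To make the two volcano plot points concrete, here is a small base R sketch. The parameter names numerator and denominator are hypothetical (the names in “params_de.R” may differ). The pCutoff part shows one possible origin for a value like 1.345719e-03: the largest raw p-value among genes that pass an adjusted p-value threshold. That is a guess, not something the template confirms.

```r
# Hypothetical comparison parameters
numerator <- "tumor"
denominator <- "normal"

# Title built from the parameters instead of being hardcoded
volcano_title <- paste(numerator, "vs", denominator)

# Toy raw p-values and their BH-adjusted counterparts
pvalue <- c(0.0002, 0.0009, 0.004, 0.03, 0.2, 0.6)
padj <- p.adjust(pvalue, method = "BH")

# Raw p-value cutoff that corresponds to padj < 0.05 for this data:
# the largest raw p-value whose adjusted value is still below the threshold
padj_threshold <- 0.05
p_cutoff <- max(pvalue[padj < padj_threshold])

print(volcano_title)
print(p_cutoff)
```

If the cutoff is computed this way, it changes with every dataset, so a hardcoded number would only be right for the data it was derived from.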

Other

  • Also, would be nice to give examples/options here for adding additional coloring/shapes to PCA plots (other metadata besides factor of interest)
  • Colors not consistent between heat map and other figures (everything else is yellow/blue, heat map is grey/black)
  • In “plot_genes_detected” chunk, title of second plot is “Total reads” rather than “Number of genes”
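
For the PCA suggestion, something like the following could serve as an example of mapping a second metadata column to point shape. It is a sketch on random toy data; the metadata columns condition and batch are hypothetical, and it uses plain ggplot2 rather than the template's own plotting code.

```r
library(ggplot2)

set.seed(1)
# Toy expression matrix: 8 samples x 50 genes (random, illustration only)
expr <- matrix(
  rnorm(8 * 50), nrow = 8,
  dimnames = list(paste0("s", 1:8), paste0("g", 1:50))
)

# Hypothetical metadata: factor of interest plus one extra column
meta <- data.frame(
  sample = rownames(expr),
  condition = rep(c("control", "treated"), each = 4),
  batch = rep(c("b1", "b2"), times = 4)
)

# PCA on samples; keep PCs 1-5 next to the metadata
pca <- prcomp(expr, center = TRUE, scale. = TRUE)
pca_df <- cbind(meta, pca$x[, 1:5])

# Color by the factor of interest, shape by the additional metadata column
pca_plot <- ggplot(pca_df, aes(x = PC1, y = PC2, color = condition, shape = batch)) +
  geom_point(size = 4) +
  labs(title = "PCA colored by condition, shaped by batch")

print(pca_plot)
```

Swapping x and y for other columns of pca_df (PC3, PC4, PC5) gives the remaining PC pairs with the same size=4 points.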