GenomicsAotearoa/metagenomics_summer_school

Update coverage normalisation script(s)

mlhoggard opened this issue · 6 comments

MH to replace coverage normalisation steps with updated script(s) (summarise_counts.py; summarise_counts.R)

Note: also check the R dependencies for this. dynamic_require() doesn't seem to install some libraries correctly. This might be because the CRAN mirror has to be selected manually.

@mlhoggard Which R packages do we need for this?

@DininduSenanayake Ah, good point.

"dplyr", "tibble", "readr", "tidyr", "fuzzyjoin", "stringr", "matrixStats", "edgeR", "EDASeq"

edgeR and EDASeq are installed via BiocManager, so BiocManager is also required for that step. The script is supposed to install all of these packages if they're not already available, but I think it's buggy for those two.

Worst case, we could include a step in the docs that installs the dependencies. But I'll also do a test run with the workshop data because, from memory, depending on how the script is run, it won't need the R part for this particular process anyway.
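If we do end up adding an install step to the docs, a minimal sketch could look like the one below. This is not what dynamic_require() does internally; the helper names missing_pkgs() and install_missing() are made up for illustration, and the CRAN mirror URL is just one reasonable choice. Setting repos explicitly avoids the interactive mirror prompt, and the two Bioconductor packages go through BiocManager (the package name is spelled EDASeq on Bioconductor).

```r
# Packages listed in this thread, split by where they are hosted.
cran_pkgs <- c("dplyr", "tibble", "readr", "tidyr", "fuzzyjoin",
               "stringr", "matrixStats")
bioc_pkgs <- c("edgeR", "EDASeq")

# Hypothetical helper: return the packages in `pkgs` that are not installed.
missing_pkgs <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Hypothetical helper: install whatever is missing.
# `repos` is set explicitly so install.packages() never asks for a mirror.
install_missing <- function(cran = cran_pkgs, bioc = bioc_pkgs,
                            repos = "https://cloud.r-project.org") {
  need_cran <- missing_pkgs(cran)
  if (length(need_cran) > 0) {
    install.packages(need_cran, repos = repos)
  }

  need_bioc <- missing_pkgs(bioc)
  if (length(need_bioc) > 0) {
    # BiocManager itself comes from CRAN.
    if (length(missing_pkgs("BiocManager")) > 0) {
      install.packages("BiocManager", repos = repos)
    }
    BiocManager::install(need_bioc, update = FALSE, ask = FALSE)
  }

  # Return (invisibly) the packages that needed installing.
  invisible(c(need_cran, need_bioc))
}
```

Calling install_missing() once before running summarise_counts.R would then install only what is absent, and missing_pkgs(c(cran_pkgs, bioc_pkgs)) can be used on its own to check an environment without changing it.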

@mlhoggard I think you were using the normal R module, correct? If so, use R-bundle-Bioconductor/3.13-gimkl-2020a-R-4.1.0 or R-bundle-Bioconductor/3.15-gimkl-2022a-R-4.2.1 instead. It has all of these installed except fuzzyjoin. I can add fuzzyjoin to that module later.

@DininduSenanayake Ah, brilliant! That's great to know, thanks.