perishky/meffil

meffil.normalize.samples function needs raw values

Dear perishky,
I am trying to run the function meffil.normalize.samples on a server because my PC does not have enough memory to run it. To this end, I uploaded the qc.objects generated on my computer to the server and continued with the standard pipeline.
I got the following error when I ran the function:

norm.beta <- meffil.normalize.samples(norm.objects, cpglist.remove=badcpgs)
Error in read.idat(paste(basename, "_Grn.idat", sep = ""), verbose = verbose) :
Filename does not exist:data//DELCODE/Epigenetics/idat/205707890015_R05C01_Grn.idat.gzdata//DELCODE/Epigenetics/idat/205707890015_R05C01_Grn.idat

Why does this function need access to the raw idat values? Is there a way to avoid this?
Thank you very much,
Rafael

Hi Rafael,

Yes, unfortunately meffil.normalize.samples() requires access to the raw data. 'norm.objects' only retains information about how to normalize the raw data, not the raw data itself.

To reduce memory requirements, you could avoid loading the normalised methylation matrix into R by having meffil.normalize.samples() save the matrix to a file, e.g.

meffil.normalize.samples(norm.objects, cpglist.remove=badcpgs, gds.filename="norm-beta.gds")

After normalization, you could then:

  1. copy "norm-beta.gds" to your server, load the matrix into R using meffil.gds.methylation() and perform whatever analyses you'd like on the matrix (loading this file does not require access to the raw data),

  2. generate the normalisation summary and perform an EWAS by applying meffil.normalization.summary() and meffil.ewas() on your PC to the gds file by setting beta="norm-beta.gds" (i.e. these functions will analyse the methylation matrix in the gds file without loading the entire matrix into R).
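
The options above could be sketched roughly as follows. This is only an illustration, not a tested recipe: it assumes meffil (and the gdsfmt package it uses for GDS files) is installed, and that norm.objects and badcpgs already exist from the earlier pipeline steps. The wrapper function names and the `variable` argument (your phenotype of interest) are hypothetical, and meffil.ewas() will usually need further arguments such as covariates.

```r
# Sketch only: the wrapper names below are hypothetical helpers, not part of meffil.
gds.file <- "norm-beta.gds"

# Run where the raw idat files are available (e.g. your PC): writes the
# normalised matrix to a GDS file instead of returning it in memory.
normalize_to_gds <- function(norm.objects, badcpgs, gds.filename = gds.file) {
  meffil::meffil.normalize.samples(norm.objects,
                                   cpglist.remove = badcpgs,
                                   gds.filename = gds.filename)
}

# Option 1: after copying the GDS file elsewhere, load the full matrix into R.
# This step does not need the raw idat files.
load_beta <- function(gds.filename = gds.file) {
  meffil::meffil.gds.methylation(gds.filename)
}

# Option 2: run the EWAS directly on the GDS file without loading the whole
# matrix. `variable` is an assumed phenotype vector, one value per sample.
run_ewas <- function(variable, gds.filename = gds.file) {
  meffil::meffil.ewas(beta = gds.filename, variable = variable)
}
```

For example, normalize_to_gds(norm.objects, badcpgs) would be run once on the machine holding the idat files, and either load_beta() or run_ewas(variable) afterwards, wherever the gds file has been copied.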

Hope that helps! Any questions let me know.

Matt

That is great! Then I don't even need to work on the server.
Thanks a lot!