perishky/meffil

Problem generating normalized beta value for EPIC array data

Closed this issue · 3 comments

ox852 commented

I am analysing EPIC array data. When I pass norm.objects to meffil.normalize.samples as follows:
norm.beta <- meffil.normalize.samples(norm.objects, cpglist.remove=qc.summary$bad.cpgs$name)
I get the following error message:

Error in dimnames(ret) <- list(sites, names(norm.objects)) :
  length of 'dimnames' [1] not equal to array extent

I have tried the following, with no luck:

  1. Checked the length of norm.objects, which matches my sample size.
  2. Tried using the make.names function to turn the names into 'syntactically valid names', as suggested in a comment on https://stackoverflow.com/questions/12985653/what-does-length-of-dimnames-1-not-equal-to-array-extent-mean, via names(norm.objects)<-make.names(names(norm.objects)). meffil.normalize.samples still returns the same error message.
  3. The column names cannot be inspected with colnames(norm.objects), so it is difficult to tell whether the problem lies in the naming within norm.objects.

I wonder whether this is a problem specific to EPIC array data, since the tutorial uses 450K data and seems to work fine.

The sessionInfo() is as follows:

R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: .../software/OpenBLAS/0.3.5-GCC-8.2.0-2.31.1/lib/libopenblas_skylakexp-r0.3.5.so

locale:
[1] C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
 [1] meffil_1.1.1          preprocessCore_1.48.0 SmartSVA_0.1.3
 [4] RSpectra_0.16-0       isva_1.9              JADE_2.0-3
 [7] qvalue_2.16.0         gdsfmt_1.20.0         statmod_1.4.34
[10] quadprog_1.5-8        DNAcopy_1.60.0        fastICA_1.2-2
[13] lme4_1.1-23           Matrix_1.2-17         multcomp_1.4-13
[16] TH.data_1.0-10        survival_2.44-1.1     mvtnorm_1.1-1
[19] matrixStats_0.56.0    markdown_1.1          gridExtra_2.3
[22] Cairo_1.5-12          knitr_1.28            reshape2_1.4.4
[25] plyr_1.8.6            ggplot2_3.3.1         sva_3.32.1
[28] BiocParallel_1.18.1   genefilter_1.66.0     mgcv_1.8-28
[31] nlme_3.1-140          limma_3.42.2          MASS_7.3-51.4
[34] illuminaio_0.28.0

loaded via a namespace (and not attached):
 [1] Biobase_2.44.0       bit64_0.9-7          splines_3.6.0
 [4] assertthat_0.2.1     askpass_1.1          highr_0.8
 [7] stats4_3.6.0         blob_1.2.1           pillar_1.4.4
[10] RSQLite_2.2.0        lattice_0.20-38      glue_1.4.1
[13] digest_0.6.25        minqa_1.2.4          colorspace_1.4-1
[16] sandwich_2.5-1       XML_3.99-0.3         pkgconfig_2.0.3
[19] purrr_0.3.2          xtable_1.8-4         scales_1.1.1
[22] tibble_3.0.1         openssl_1.4.1        annotate_1.62.0
[25] farver_2.0.3         IRanges_2.18.1       ellipsis_0.3.1
[28] withr_2.2.0          BiocGenerics_0.30.0  mime_0.9
[31] magrittr_1.5         crayon_1.3.4         evaluate_0.14
[34] memoise_1.1.0        tools_3.6.0          lifecycle_0.2.0
[37] stringr_1.4.0        S4Vectors_0.22.0     munsell_0.5.0
[40] cluster_2.0.9        AnnotationDbi_1.46.1 base64_2.0
[43] compiler_3.6.0       rlang_0.4.6          grid_3.6.0
[46] RCurl_1.98-1.2       nloptr_1.2.2.1       labeling_0.3
[49] bitops_1.0-6         boot_1.3-22          gtable_0.3.0
[52] codetools_0.2-16     DBI_1.1.0            R6_2.4.1
[55] zoo_1.8-8            dplyr_0.8.1          bit_1.1-15.2
[58] clue_0.3-57          stringi_1.4.6        Rcpp_1.0.4.6
[61] vctrs_0.3.1          tidyselect_0.2.5     xfun_0.14

Thanks for helping!

My best guess right now is that normalization may be failing for one or more of the samples.

To determine if this is the case, perhaps try running the following code on the first four samples. It will either complete with no problems (which indicates a problem with a sample outside the first four) or else it should generate a more informative error (because mclapply won't hide the error).

options(mc.cores=1)
norm.beta <- meffil.normalize.samples(norm.objects[1:4],cpglist.remove=qc.summary$bad.cpgs$name)
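
If the first four samples run cleanly, the same idea can be extended to check every sample one at a time. The following is only a minimal sketch: find_failing is a hypothetical helper, not part of meffil, and the toy list stands in for norm.objects so that the example runs on its own.

```r
# Hypothetical helper (not part of meffil): call `f` on each one-element
# sublist in turn, serially, and collect the error message of every failure.
find_failing <- function(objects, f) {
  msgs <- vapply(seq_along(objects), function(i) {
    tryCatch({
      f(objects[i])   # one-element sublist, so names are preserved
      NA_character_   # NA marks "no error"
    }, error = function(e) conditionMessage(e))
  }, character(1))
  names(msgs) <- names(objects)
  msgs[!is.na(msgs)]
}

# Toy stand-in for norm.objects: the second element makes log() fail.
toy <- list(s1 = 1, s2 = "x", s3 = 3)
failures <- find_failing(toy, function(obj) log(obj[[1]]))
print(failures)

# Assumed (untested) use with meffil, with mc.cores set to 1 as above:
# failures <- find_failing(norm.objects, function(obj)
#     meffil.normalize.samples(obj, cpglist.remove=qc.summary$bad.cpgs$name))
```

The result is a named character vector, so the names of the failing samples and their underlying error messages can be read off directly.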

ox852 commented

Thanks! Once the underlying error message was revealed, the problem turned out to be in reading the .idat files with read.idat. After resetting the working directory and sorting out the file names of the .idats, the problem was solved!

Excellent! I haven't quite figured out a great way yet to generate useful error messages for errors produced within mclapply(). The only solution for now is to turn mclapply 'off' by setting mc.cores to 1.
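
One general R pattern that might help here, offered as a sketch and not as something meffil currently does, is to wrap the worker function in tryCatch so that each error comes back as a return value and can be inspected afterwards. The names capture_errors and read_sample are hypothetical, and read_sample is only a toy stand-in for a per-sample step that fails.

```r
library(parallel)

# Hypothetical wrapper: errors raised inside a worker are returned as
# condition objects instead of being lost.
capture_errors <- function(f) {
  function(...) tryCatch(f(...), error = function(e) e)
}

# Toy stand-in for a per-sample step that fails on one input.
read_sample <- function(i) {
  if (i == 2) stop("cannot read file for sample ", i)
  i * 10
}

# mc.cores = 1 keeps the demo portable; the wrapper itself does not
# depend on the number of cores.
results <- mclapply(1:3, capture_errors(read_sample), mc.cores = 1)

# Flag the failed elements and pull out their messages.
failed <- vapply(results, inherits, logical(1), what = "error")
messages <- vapply(results[failed], conditionMessage, character(1))
print(which(failed))
print(messages)
```

With a wrapper like this the caller can report which inputs failed and why before attempting to assemble the results into a matrix.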

I'll close this issue then!