metamaden/methyPre

there might be an error during filtering the undetected CpGs

wt2015-github opened this issue · 2 comments

Hi Sean, I found that preObj2 and detP have their CpGs (row names) in different orders, possibly because the upstream normalization steps change the CpG order (it would be good if you could double-check this). If so, line 50 of methyPre.R, which filters with the TRUE/FALSE vector "failed", may not be correct, because a logical index is applied by row position rather than by row name. I suggest preObj3 <- preObj2[setdiff(rownames(preObj2), rownames(detP)[failed]),]. Please let me know if you update the package. Thanks!
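
To illustrate the concern, here is a minimal base R sketch using toy matrices in place of the real objects. The probe names, values, and the 0.05 cutoff used to build "failed" are assumptions for illustration only; the actual definition of "failed" in methyPre.R is not shown in this thread.

```r
# Toy stand-ins for the objects named in the issue (hypothetical values).
detP <- matrix(c(0.001, 0.20, 0.002, 0.003), ncol = 1,
               dimnames = list(c("cg01", "cg02", "cg03", "cg04"), "s1"))

# Same probes, but in a different row order, as might happen after a
# normalization step that reorders rows.
preObj2 <- matrix(c(0.9, 0.1, 0.5, 0.7), ncol = 1,
                  dimnames = list(c("cg03", "cg01", "cg04", "cg02"), "s1"))

# Assumed definition: a probe fails if its detection p-value exceeds 0.05.
failed <- detP[, "s1"] > 0.05   # TRUE only for cg02

# Positional filter: the logical vector is applied by row position, so it
# drops row 2 of preObj2, which is cg01 rather than the failed probe cg02.
by_position <- preObj2[!failed, , drop = FALSE]

# Name-based filter, as suggested above: removes cg02 wherever it sits.
by_name <- preObj2[setdiff(rownames(preObj2), rownames(detP)[failed]), ,
                   drop = FALSE]

print(rownames(by_position))
print(rownames(by_name))
```

In this toy case the positional filter keeps the failed probe and discards a good one, while the name-based filter removes only the failed probe.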

Thanks for pointing out this potential issue.

I tested this using two full sample chips from TCGA-READ and using a 'rgset' RGChannelSet with minfi functions outside of the methyPre pipeline. I find:

  1. Row names are preserved for detectionP(rgset) and mraw (from preprocessRaw);
  2. Row names are preserved for detectionP(rgset) and mswan (from either preprocessSWAN alone or preprocessIllumina+preprocessSWAN with mSet=millumina);
  3. Row names are NOT preserved for detectionP(rgset) and mset from preprocessFunnorm (it looks like the order is changed but no probes are filtered).
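
A check of this kind can be sketched in base R as follows. The helper name compare_rownames and the toy matrices are hypothetical; with real data the two arguments would be the detectionP() matrix and the preprocessed object, assuming rownames() works on both.

```r
# Hypothetical helper: reports whether two objects have identical row names
# in the same order, and whether they at least contain the same set of names.
compare_rownames <- function(a, b) {
  list(
    same_order = identical(rownames(a), rownames(b)),
    same_set   = setequal(rownames(a), rownames(b))
  )
}

# Toy stand-ins (hypothetical probe names and values).
detp_toy <- matrix(0, nrow = 3, ncol = 1,
                   dimnames = list(c("cg01", "cg02", "cg03"), "s1"))
same_toy <- matrix(1, nrow = 3, ncol = 1,
                   dimnames = list(c("cg01", "cg02", "cg03"), "s1"))
reordered_toy <- matrix(1, nrow = 3, ncol = 1,
                        dimnames = list(c("cg03", "cg01", "cg02"), "s1"))

res_same <- compare_rownames(detp_toy, same_toy)
res_reordered <- compare_rownames(detp_toy, reordered_toy)
str(res_same)
str(res_reordered)
```

A result with same_order FALSE and same_set TRUE corresponds to case 3 above: the order changed but no probes were dropped.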

So you're clearly correct: the detP failed-probe filter needs to match on cg ID in order for the methyPre wrapper to work with all available preprocessing functions.

Unfortunately, I couldn't get setdiff(), either the base or BiocGenerics function, to fix this issue.

I will change the code to filter on probe identity instead of the bare conditional.
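
One way to filter on probe identity is to reorder the failed flags to the row order of the object being filtered, using match(). This is only a sketch under assumptions, not the code that went into the package: the helper name align_failed and the toy data are hypothetical.

```r
# Hypothetical helper: returns the 'failed' flags reordered to the row order
# of 'obj', looking each probe up by name in 'detP'.
align_failed <- function(obj, detP, failed) {
  idx <- match(rownames(obj), rownames(detP))
  if (anyNA(idx)) {
    stop("some probes in 'obj' are missing from 'detP'")
  }
  unname(failed[idx])
}

# Toy stand-ins (hypothetical probe names and values).
detP <- matrix(c(0.001, 0.20, 0.002, 0.003), ncol = 1,
               dimnames = list(c("cg01", "cg02", "cg03", "cg04"), "s1"))
obj <- matrix(c(0.9, 0.1, 0.5, 0.7), ncol = 1,
              dimnames = list(c("cg03", "cg01", "cg04", "cg02"), "s1"))
failed <- c(FALSE, TRUE, FALSE, FALSE)   # in detP row order: cg02 fails

# Flags now follow the row order of 'obj', so positional filtering is safe.
failed_aligned <- align_failed(obj, detP, failed)
filtered <- obj[!failed_aligned, , drop = FALSE]
print(rownames(filtered))
```

The explicit stop() makes a mismatch in probe content visible instead of silently filtering the wrong rows.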

The methyPre wrapper's detP failed-cg filter has been updated.