error in summarizeResults
csimona opened this issue · 2 comments
Thanks for developing and maintaining HoneyBADGER! I get the following error when running this command (after preprocessing the data, analyzing it, and including the SNP info):
results <- hb$summarizeResults(geneBased=FALSE, alleleBased=TRUE)
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 15, 6
The algorithm identified 15 alterations, as length(hb$cnvs
Thanks!
Hi Simona,
Thanks for your patience.
It looks like the error is stemming from this code in hb$summarizeResults:
rgs <- cnvs[["allele-based"]][["all"]]
retest <- results[["allele-based"]]
del.loh.allele.prob <- do.call(rbind, lapply(retest, function(x) x))
vi1 <- rowSums(del.loh.allele.prob > 0.75) > min.num.cells
del.loh.allele.prob <- del.loh.allele.prob[vi1, ]
names <- apply(as.data.frame(rgs), 1, paste0, collapse = ":")
rownames(del.loh.allele.prob) <- paste0("del.loh.", names[vi1])
cnvs[["allele-based"]][["del.loh"]] <<- rgs[vi1]
summary[["allele-based"]] <<- del.loh.allele.prob
colnames(del.loh.allele.prob) <- paste0("del.loh.allele.", colnames(del.loh.allele.prob))
df <- cbind(as.data.frame(rgs), avg.del.loh.allele = rowMeans(del.loh.allele.prob), del.loh.allele.prob)
The error arises because, as you aptly noted, rgs <- cnvs[["allele-based"]][["all"]] has 15 alterations identified, but del.loh.allele.prob has been filtered down to only the alterations affecting more than min.num.cells cells with greater than 75% posterior probability. The two objects therefore have different numbers of rows (15 versus 6) when cbind is called. In hindsight, this should probably also be modified to take a parameter so that users can apply a more stringent posterior probability filter.
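As a rough sketch of what such a parameter could look like, the filter could be pulled into a small helper. The function name filter_by_posterior and the argument min.prob are hypothetical and not part of HoneyBADGER; the toy matrix simply stands in for del.loh.allele.prob (rows are alterations, columns are cells).

```r
# Hypothetical helper, not part of HoneyBADGER: flag the alterations (rows) in
# which more than `min.num.cells` cells exceed the posterior cutoff `min.prob`.
filter_by_posterior <- function(prob, min.prob = 0.75, min.num.cells = 2) {
  stopifnot(is.matrix(prob), min.prob >= 0, min.prob <= 1)
  rowSums(prob > min.prob) > min.num.cells
}

# Toy posterior probabilities standing in for del.loh.allele.prob.
prob <- rbind(
  alt1 = c(0.95, 0.92, 0.91, 0.10),
  alt2 = c(0.80, 0.78, 0.79, 0.77),
  alt3 = c(0.20, 0.10, 0.30, 0.05)
)

default_keep <- filter_by_posterior(prob)                  # cutoff 0.75, as in the current code
strict_keep  <- filter_by_posterior(prob, min.prob = 0.9)  # more stringent cutoff
print(default_keep)
print(strict_keep)
```

With the stricter cutoff, alt2 no longer passes because none of its cells exceed 0.9, while alt1 still does.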
The fastest hack-y "fix", I believe, is to just set min.num.cells = 0 instead of the default of 2.
The correction would be to have:
df <- cbind(as.data.frame(rgs[vi1]), avg.del.loh.allele = rowMeans(del.loh.allele.prob), del.loh.allele.prob)
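To illustrate why subsetting the regions with the same index resolves the mismatch, here is a minimal base R sketch. It uses a plain data.frame as a stand-in for rgs (which is a GRanges object in the real code) and a made-up probability matrix, so all names and values below are assumptions for illustration only.

```r
# Toy stand-in for the candidate regions; the real `rgs` is a GRanges object.
regions <- data.frame(
  seqnames = c("chr1", "chr2", "chr3", "chr4", "chr5"),
  start    = c(100, 200, 300, 400, 500),
  end      = c(150, 250, 350, 450, 550)
)

# Made-up posterior probabilities: one row per region, one column per cell.
prob <- rbind(
  c(0.90, 0.80, 0.95, 0.10),  # 3 cells above 0.75: kept
  c(0.20, 0.10, 0.30, 0.00),  # 0 cells above 0.75: dropped
  c(0.90, 0.90, 0.90, 0.90),  # 4 cells above 0.75: kept
  c(0.80, 0.80, 0.10, 0.10),  # 2 cells above 0.75: dropped (needs more than 2)
  c(0.10, 0.80, 0.10, 0.10)   # 1 cell above 0.75: dropped
)
colnames(prob) <- paste0("cell", 1:4)

min.num.cells <- 2
keep <- rowSums(prob > 0.75) > min.num.cells
filtered <- prob[keep, , drop = FALSE]  # drop = FALSE keeps a matrix even for one row

# Unfiltered regions (5 rows) against filtered probabilities (2 rows): cbind fails.
mismatch <- tryCatch(
  cbind(regions, avg = rowMeans(filtered), filtered),
  error = function(e) conditionMessage(e)
)
print(mismatch)

# Subsetting the regions with the same logical index lines the rows up again.
fixed <- cbind(regions[keep, ], avg = rowMeans(filtered), filtered)
print(fixed)
```

In this toy data, a threshold of min.num.cells = 0 would still drop the second region, since the comparison is strict and no cell there exceeds 0.75, so subsetting with the same index is likely the more robust route.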
Can you please double check that the following code works for you?
rgs <- hb$cnvs[["allele-based"]][["all"]]
retest <- hb$results[["allele-based"]]
del.loh.allele.prob <- do.call(rbind, lapply(retest, function(x) x))
min.num.cells <- 2
vi1 <- rowSums(del.loh.allele.prob > 0.75) > min.num.cells
del.loh.allele.prob <- del.loh.allele.prob[vi1, ]
names <- apply(as.data.frame(rgs), 1, paste0, collapse = ":")
rownames(del.loh.allele.prob) <- paste0("del.loh.", names[vi1])
colnames(del.loh.allele.prob) <- paste0("del.loh.allele.", colnames(del.loh.allele.prob))
df <- cbind(as.data.frame(rgs[vi1]), avg.del.loh.allele = rowMeans(del.loh.allele.prob), del.loh.allele.prob)
print(df)
If it works, I can make the appropriate corrections to the repo and acknowledge you in the commit message.
Thanks,
Jean
Hi Jean,
Thanks for the reply and for the fix; the code works for me.