Simple harmony call fails on v1.1.0
const-ae opened this issue · 5 comments
Hi,
I noticed that the new release of harmony
fails for small matrix inputs. The following used to work on version 1.0.3, but now throws an error in harmonyObj$cluster_cpp()
.
Y <- matrix(rnorm(10 * 2), nrow = 10, ncol = 2)
groups <- rep(c("a", "b"), length.out = nrow(Y))
int <- harmony::RunHarmony(Y, groups, nclust = 2, verbose = FALSE)
#> Transposing data matrix
#> Error in eval(expr, envir, enclos): Mat::submat(): indices out of bounds or incorrectly used
Created on 2023-10-25 with reprex v2.0.2
Note that the following still works:
Y <- matrix(rnorm(100 * 2), nrow = 100, ncol = 2)
groups <- rep(c("a", "b"), length.out = nrow(Y))
int <- harmony::RunHarmony(Y, groups, nclust = 2, verbose = FALSE)
#> Transposing data matrix
Created on 2023-10-25 with reprex v2.0.2
Also note that the latest version still ignores the verbose = FALSE
here and prints "Transposing data matrix"
.
Best,
Constantin
Hi @const-ae ,
Looks like an integer underflow due to the block_size parameter. We will address this and test with your snippet, though many use cases would involve more than 20 cells.
As far as the Transposing data matrix
, we will address that also, it is just we did not have a unit test that caught this scenario.
Thank you for your rigorous testing!
Hi
A similar error came up for me too, not sure if the error message can assist with debugging this issue.
Error: Mat::submat(): indices out of bounds or incorrectly used
9. stop(structure(list(message = "Mat::submat(): indices out of bounds or incorrectly used",
call = NULL, cppstack = NULL), class = c("std::out_of_range",
"C++Error", "error", "condition")))
8. harmonyObj$cluster_cpp()
7. harmonize(harmonyObj, max.iter.harmony, verbose)
6. tryCatchList(expr, classes, parentenv, handlers)
5. tryCatch({
check_legacy_args(...)
if (set.cores) {
prev.ncores.blas <- RhpcBLASctl::blas_get_num_procs() ...
4. RunHarmony.default(data_mat = embedding[, dims.use], meta_data = metavars_df,
vars_use = group.by.vars, return_object = FALSE, ...)
3. RunHarmony(data_mat = embedding[, dims.use], meta_data = metavars_df,
vars_use = group.by.vars, return_object = FALSE, ...)
2. RunHarmony.Seurat(oshu, verbose = F, group.by.vars = c("cohort"),
reduction = "pca", assay.use = "RNA", reduction.save = "harmony")
1. RunHarmony(oshu, verbose = F, group.by.vars = c("cohort"), reduction = "pca",
assay.use = "RNA", reduction.save = "harmony")
Can you give us a bit of information for the dataset? How big it is?
The dataset is 341 samples with 21784 genes, initialised with the following
NFEATURES = 3000
rna.seurat <- CreateSeuratObject(counts = raw_counts, project = "Sample_ID", min.cells = 3, min.features = 200, meta.data = sampleinfo2, assay="RNA")
rna.seurat <- NormalizeData(rna.seurat)
rna.seurat <- FindVariableFeatures(rna.seurat, selection.method = "vst", mean.cutoff = c(0.2, 10), dispersion.cutoff = c(1, 40), nfeatures = NFEATURES)
all.genes <- rownames(rna.seurat)
rna.seurat <- ScaleData(rna.seurat, features = all.genes)
n=30
rna.seurat <- RunPCA(rna.seurat, features = VariableFeatures(object = rna.seurat),
assay = "RNA",
npcs = n, verbose = F)
rna.seurat <- RunHarmony(rna.seurat, verbose = F,
group.by.vars = c("cohort"),
reduction = "pca", assay.use = "RNA", reduction.save = "harmony")
Thank you both for pointing to this bugs.
I would suggest to try out the latest version of the package from github.
Let us know here if there are more issues!