CostaLab/scMEGA

PairCells error

chrismahony opened this issue · 11 comments

I am trying your vingette with a Seurat object which is scRNA + scATAC combined. But I get the following error message when I run PairCells(). My scMEGA part of my code is below:

coembed.sub <- RunDiffusionMap(coembed_harmon2, reduction = "harmony")

cols <- ArchR::paletteDiscrete(coembed.sub@meta.data[, "clusters_merge"])

p1 <- DimPlot(coembed.sub, group.by = "clusters_merge", label = TRUE,
reduction = "dm", shuffle = TRUE, cols = cols) +
xlab("DC 1") + ylab("DC 2")

p1

p2 <- DimPlot(coembed.sub, group.by = "cm_clusters", label = TRUE,
reduction = "dm", shuffle = TRUE, cols = cols) +
xlab("DC 1") + ylab("DC 2")

p2

DimPlot(coembed.sub, reduction = "dm",
group.by = "clusters_merge", split.by = "assay", cols = cols)

DimPlot(coembed.sub, reduction = "dm",
group.by = "cm_clusters", split.by = "assay", cols = cols)

df.pair <- PairCells(object = coembed.sub, reduction = "harmony",
pair.by = "assay", ident1 = "ATAC", ident2 = "RNA")

Getting dimensional reduction data for pairing cells...
Pairing cells using geodesic mode...
Constructing KNN graph for computing geodesic distance ..
Error in diag<-(*tmp*, value = 0) :
only matrix diagonals can be replaced

If you could help I would really appreciate it.
Thanks
Chris

Hi @chrismahony ,

Can you provide more information about your object coembed.sub?
Is the data from our vignettes?

Best,
Zhijian

Thanks for the reply.

No its not the data from your vingette. Its my own data. I integrated the RNA/ATAC as described in a Seurat Vingette: https://satijalab.org/seurat/articles/atacseq_integration_vignette.html

Here is the code I used to create my object, thanks for the help:
rna<-FindVariableFeatures(rna)
DefaultAssay(atac_cicero)<-'ATAC'
DefaultAssay(rna)<-'RNA'
gene.activities <- GeneActivity(atac_cicero, features = VariableFeatures(rna))
atac_cicero[["ACTIVITY"]] <- CreateAssayObject(counts = gene.activities)
DefaultAssay(atac_cicero) <- "ACTIVITY"
atac_cicero <- NormalizeData(atac_cicero)
atac_cicero <- ScaleData(atac_cicero, features = rownames(atac_cicero))

transfer.anchors <- FindTransferAnchors(reference = rna, query = atac_cicero, features = VariableFeatures(object = rna),
reference.assay = "RNA", query.assay = "ACTIVITY", reduction = "cca")
genes.use <- VariableFeatures(rna)

refdata <- GetAssayData(rna, assay = "RNA", slot = "data")[genes.use, ]
imputation <- TransferData(anchorset = transfer.anchors, refdata = refdata, weight.reduction = atac_cicero[["lsi"]],
dims = 2:30)
atac_cicero[["RNA"]] <- imputation
DefaultAssay(atac_cicero) <- "RNA"
DefaultAssay(rna) <- "RNA"

coembed <- merge(x = rna, y = atac_cicero)
coembed <- ScaleData(coembed, features = genes.use, do.scale = FALSE)
coembed <- RunPCA(coembed, features = genes.use, verbose = FALSE)
coembed <- RunUMAP(coembed, dims = 1:30)
DimPlot(coembed, group.by = c("orig.ident"))

coembed_harmon <- RunHarmony(coembed, group.by.vars = "assay",
assay.use = 'ATAC',
project.dim = FALSE)

coembed_harmon <- FindNeighbors(coembed_harmon, dims = 2:50, reduction = "harmony", verbose = FALSE)

coembed_harmon <- RunUMAP(coembed_harmon, dims = 2:50, reduction = "harmony")
DimPlot(coembed_harmon, group.by = "clusters_merge", reduction="umap")

DimPlot(coembed_harmon, group.by = "assay", reduction="umap")

Hi,

You can do the ATAC/RNA integration using the function CoembedData in scMEGA. It basically performs the same analysis as Seurat does.

See here for an example:
https://costalab.github.io/scMEGA/articles/myofibroblast-preprocessing.html

Maybe you can try this and see if the problem is solved.

Thanks.

I built the object using your function and then proceeded through your vignette: https://costalab.github.io/scMEGA/articles/myofibroblast-GRN.html

But when I try to RunChromVar() it stalls. I started running yesterday and it was still running this morning. I have have used RunChromVar() for several ATAC objects before, it normally takes some time but no longer than a couple of hours max. If you could help that would be great.

My full code is below:

obj.coembed <- CoembedData(
rna,
atac_cicero,
gene.activities,
weight.reduction = "harmony",
verbose = FALSE
)

p1 <- DimPlot(obj.coembed, group.by = "assay", shuffle = TRUE, label = TRUE)
p2 <- DimPlot(obj.coembed, group.by = "orig.ident", shuffle = TRUE, label = TRUE)
p1+p2

obj.coembed_harmon <- RunHarmony(
obj.coembed,
group.by.vars = c("orig.ident", "assay"),
reduction = "pca",
max.iter.harmony = 30,
dims.use = 1:30,
project.dim = FALSE,
plot_convergence = FALSE
)

obj.coembed_harmon <- FindNeighbors(obj.coembed_harmon, dims = 2:50, reduction = "harmony", verbose = FALSE)

obj.coembed_harmon <- RunUMAP(obj.coembed_harmon, dims = 2:50, reduction = "harmony")

p1 <- DimPlot(obj.coembed_harmon, group.by = "assay", shuffle = TRUE, label = TRUE)
p2 <- DimPlot(obj.coembed_harmon, group.by = "orig.ident", shuffle = TRUE, label = TRUE)
p1
p2

coembed.sub <- RunDiffusionMap(obj.coembed_harmon, reduction = "harmony")

cols <- ArchR::paletteDiscrete(coembed.sub@meta.data[, "clusters_merge"])

p <- DimPlot(coembed.sub, group.by = "clusters_merge", label = TRUE,
reduction = "dm", shuffle = TRUE, cols = cols) +
xlab("DC 1") + ylab("DC 2")

p

df.pair <- PairCells(object = coembed.sub, reduction = "harmony",
pair.by = "assay", ident1 = "ATAC", ident2 = "RNA")

sel_cells <- c(df.pair$ATAC, df.pair$RNA)
coembed.sub2 <- coembed.sub[, sel_cells]

options(repr.plot.height = 5, repr.plot.width = 10)
p3<-DimPlot(coembed.sub2, reduction = "dm",
group.by = "clusters_merge", split.by = "assay", cols = cols)
p3

obj.pair <- CreatePairedObject(df.pair = df.pair,
object = coembed.sub2,
use.assay1 = "RNA",
use.assay2 = "ATAC")

obj.pair <- AddTrajectory(object = obj.pair,
trajectory = c(2, 1),
group.by = "clusters_merge",
reduction = "dm",
dims = 1:3,
use.all = FALSE)

obj <- obj.pair[, !is.na(obj.pair$Trajectory)]

p <- TrajectoryPlot(object = obj,
reduction = "dm",
continuousSet = "blueYellow",
size = 1,
addArrow = FALSE)

p

pfm <- getMatrixSet(
x = JASPAR2020,
opts = list(collection = "CORE", tax_group = 'vertebrates', all_versions = FALSE)
)

obj <- AddMotifs(
object = obj,
genome = BSgenome.Mmusculus.UCSC.mm10,
pfm = pfm,
assay = "ATAC"
)

obj <- RunChromVAR(
object = obj,
genome = BSgenome.Mmusculus.UCSC.mm10,
assay = "ATAC"
)

My output from RunChromVar():
Computing GC bias per region
Selecting background regions
Computing deviations from background

And the Traceback:

Error in get(as.character(FUN), mode = "function", envir = envir) :
object 'as.SimpleList' of mode 'function' was not found
16.
selectChildren(pids[!fin], -1)
15.
parallel::mccollect(wait = TRUE)
14.
bpstart(BPPARAM, length(X))
13.
bpstart(BPPARAM, length(X))
12.
bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = BPPARAM)
11.
bplapply(X, FUN, ..., BPREDO = BPREDO, BPPARAM = BPPARAM)
10.
bplapply(peak_indices, compute_deviations_single, counts_mat = counts_mat,
background_peaks = background_peaks, expectation = expectation)
9.
bplapply(peak_indices, compute_deviations_single, counts_mat = counts_mat,
background_peaks = background_peaks, expectation = expectation)
8.
compute_deviations_core(counts(object), peak_indices, background_peaks,
expectation, colData = colData(object))
7.
.local(object, annotations, ...)
6.
chromVAR::computeDeviations(object = chromvar.obj, annotations = motif.matrix,
background_peaks = bg)
5.
chromVAR::computeDeviations(object = chromvar.obj, annotations = motif.matrix,
background_peaks = bg)
4.
RunChromVAR.ChromatinAssay(object = object[[assay]], genome = genome,
motif.matrix = motif.matrix, ...)
3.
RunChromVAR(object = object[[assay]], genome = genome, motif.matrix = motif.matrix,
...)
2.
RunChromVAR.Seurat(object = obj, genome = BSgenome.Mmusculus.UCSC.mm10,
assay = "ATAC")
1.
RunChromVAR(object = obj, genome = BSgenome.Mmusculus.UCSC.mm10,
assay = "ATAC")

I just found this:
stuart-lab/signac#1103

When I run :

library(BiocParallel)
register(SerialParam())

And then repeat RunChromVar() it completes no problem.

Ill continue with the Vingette and let you know if there are any problems.

Thanks
Chris

Hi

I have a few more issue, if you could help that would be really great.

res_persis_split <- SelectGenes(object = obj_persis_split,
labelTop1 = 0,
labelTop2 = 0)

Error in order(apply(mat, 1, which.max)) :
unimplemented type 'list' in 'orderVector1'
5.
order(apply(mat, 1, which.max))
4.
TrajectoryHeatmap(trajATAC, varCutOff = 0, maxFeatures = nrow(trajATAC),
pal = paletteContinuous(set = "solarExtra"), limits = c(-2,
2), name = "Chromatin accessibility", returnMatrix = TRUE)
3.
withCallingHandlers(expr, message = function(c) if (inherits(c,
classes)) tryInvokeRestart("muffleMessage"))
2.
suppressMessages(TrajectoryHeatmap(trajATAC, varCutOff = 0, maxFeatures = nrow(trajATAC),
pal = paletteContinuous(set = "solarExtra"), limits = c(-2,
2), name = "Chromatin accessibility", returnMatrix = TRUE))
1.
SelectGenes(object = obj_persis_split, labelTop1 = 0, labelTop2 = 0)

Although this doesn't work, I can continue and when I get to:

motif.matching_persis <- obj_persis_split@assays$ATAC@motifs@data
colnames(motif.matching_persis) <- obj_persis_split@assays$ATAC@motifs@motif.names
motif.matching_persis <-
motif.matching_persis[unique(df.p2g$peak), unique(tf.gene.cor_persis$tf)]

I get:
Error in motif.matching_persis[unique(df.p2g$peak), unique(tf.gene.cor_persis$tf)] :
invalid or not-yet-implemented 'Matrix' subsetting
3.
stop("invalid or not-yet-implemented 'Matrix' subsetting")
2.
motif.matching_persis[unique(df.p2g$peak), unique(tf.gene.cor_persis$tf)]
1.
motif.matching_persis[unique(df.p2g$peak), unique(tf.gene.cor_persis$tf)]

I dont seem to have the df.p2g object, should this have been calculated at an earlier step?

Thanks,
Chris

Hi @chrismahony ,

It seems that there is something wrong with gene selection.

df.p2g is an object containing the predicted peak-to-gene links that we will use to associate TFs to genes, since you didn't successfully generate the peak-to-gene links, it's not possible to run the following analysis.

I am wondering if it's possible for you to share the two objects with me so that I can test the package with your data.

Best,
Zhijian

Thanks for this. Unfortunately, my supervisor is not comfortable with sharing the data as its not yet published.

What I could do is subset one sample form the RNA and ATAC data set and send this for you to look at?

Or perhaps we could set a zoom call to look at the data?

Thanks
Chris

@chrismahony

Unfortunately, my supervisor is not comfortable with sharing the data as its not yet published.

This is totally understandable.

What I could do is subset one sample form the RNA and ATAC data set and send this for you to look at?
Or perhaps we could set a zoom call to look at the data?

This sounds nice. You can send me the link for data downloading via email: zhijian.li@rwth-aachen.de
Then I will check it out and try to run scmega on it.
Once I have some results, we can have a zoom call to discuss it.

Best,
Zhijian

Thanks for your understanding and help.

Ill prepare these and send them to you soon.

Thanks
Chris

Closed with no further comments.