Error in pairCells function
Opened this issue · 4 comments
Hi, sorry to bother you. When I run pairCells()
, I got this error as follow:
pairing <- pairCells(ATAC = ATAC_PCs,
RNA = RNA_PCs,
keepUnique = TRUE)
Running geodesic pairing in after chunking data ..
Error in Loadings(object = x)[i, j, drop = drop, ...] :
subscript out of bounds.
The cca components was calculated as follow:
PR_CCA <- RunCCA(object1 = PR_ATAC,
object2 = PR_RNA, assay1 = 'RNA',assay2 = 'RNA')
cca = PR_CCA@reductions[["cca"]]
dim(cca) # ATAC + RNA (rows), n components (columns)
#[1] 185838 50
head(rownames(cca)) # ATAC cells
#[1] "D19D013_Fovea#_TCCAGAACATCGGCCA-1" "D19D013_Fovea#_TTGGTCCAGGCCAGTA-1"
#[3] "D19D013_Fovea#_GGATGAGTCTGCGTCT-1" "D19D013_Fovea#_TCACCACTCTCATCCG-1"
#[5] "D19D013_Fovea#_GCAGATTCAGCAACCC-1" "D19D013_Fovea#_TAATTCCTCAACGTGT-1"
tail(rownames(cca)) # RNA cells
#[1] "TTTGGTTTCATGCGGC-1-16" "TTTGGTTTCTCTATGT-1-16" "TTTGTTGAGAGAAGGT-1-16"
#[4] "TTTGTTGCACTTCAAG-1-16" "TTTGTTGGTGCGACAA-1-16" "TTTGTTGGTGTGGACA-1-16"
isATAC <- grepl("#_",rownames(cca))
table(isATAC) # RNA vs ATAC
# FALSE TRUE
#109568 76270
ATACcells <- rownames(cca)[isATAC]
ATAC_PCs <- subset(cca, ATACcells)
RNAcells <- rownames(cca)[!isATAC]
RNA_PCs <- subset(cca, RNAcells)
head(ATAC_PCs)[1:6,1:6]
CC_1 CC_2 CC_3 CC_4 CC_5 CC_6
NLGN4X -175.87557 -69.1071538 28.36847 -13.187110 19.029639 117.37616
MID1 -70.56709 4.4671215 60.09425 -14.378456 -6.271517 65.65311
GPM6B 102.48711 -128.8414130 -49.13831 -3.038385 84.575660 147.98845
PIR 13.09029 -0.4235617 6.99728 9.343471 25.972613 41.82880
NHS -105.87166 -27.0766033 57.83406 -39.044979 -19.644737 36.95551
RAI2 -140.40811 2.9464300 28.50148 14.144115 -60.866979 64.11776
head(RNA_PCs)[1:6,1:6]
CC_1 CC_2 CC_3 CC_4 CC_5 CC_6
NLGN4X -175.87557 -69.1071538 28.36847 -13.187110 19.029639 117.37616
MID1 -70.56709 4.4671215 60.09425 -14.378456 -6.271517 65.65311
GPM6B 102.48711 -128.8414130 -49.13831 -3.038385 84.575660 147.98845
PIR 13.09029 -0.4235617 6.99728 9.343471 25.972613 41.82880
NHS -105.87166 -27.0766033 57.83406 -39.044979 -19.644737 36.95551
RAI2 -140.40811 2.9464300 28.50148 14.144115 -60.866979 64.11776
I don't know exactly what went wrong, do you have any suggestions? @vkartha Any help would be greatly appreciated.
Hi there - it looks like something funky happened when using the subset
function, such that instead of cells x components you ended up with genes x components. This shouldn't really happen since the subset function is also applied correctly from what I can tell. Can you print the initial cca output?:
cca = PR_CCA@reductions[["cca"]]
class(cca)
dim(cca)
head(cca)
From your rownames/dim run, it appears it is correct (i.e. it's a cells x components matrix with both ATAC/RNA cells as you were able to fetch the barcode names correctly).
Yes, I don't know why. Here is the initial cca output.
> class(cca)
[1] "DimReduc"
attr(,"package")
[1] "SeuratObject"
> dim(cca)
[1] 185838 50
> head(cca)
CC_1 CC_2 CC_3 CC_4 CC_5 CC_6
NLGN4X -175.87557 -69.1071538 28.36847 -13.187110 19.029639 117.37616
MID1 -70.56709 4.4671215 60.09425 -14.378456 -6.271517 65.65311
GPM6B 102.48711 -128.8414130 -49.13831 -3.038385 84.575660 147.98845
PIR 13.09029 -0.4235617 6.99728 9.343471 25.972613 41.82880
NHS -105.87166 -27.0766033 57.83406 -39.044979 -19.644737 36.95551
RAI2 -140.40811 2.9464300 28.50148 14.144115 -60.866979 64.11776
CC_7 CC_8 CC_9 CC_10 CC_11 CC_12
NLGN4X 11.9573408 -28.86680 -4.493107 -31.855822 37.7500152 17.570856
MID1 -12.4424792 -20.81513 -7.558554 -5.200529 23.2179418 15.953092
GPM6B -11.2129937 -45.57179 11.131815 -38.897719 43.0154007 23.310797
PIR 18.0212130 -27.52639 2.099186 -17.633395 4.6052149 8.781837
NHS 8.8137932 -38.19210 2.593103 -4.989704 28.5980414 18.571440
RAI2 -0.8604781 -12.23152 -39.227841 21.077158 -0.6035299 -1.561188
CC_13 CC_14 CC_15 CC_16 CC_17 CC_18 CC_19
NLGN4X -18.856155 -26.507412 23.689733 -4.443494 28.05954 25.60540 20.2890424
MID1 -4.814245 -9.342132 20.570563 -11.054530 33.73003 32.01375 19.4312204
GPM6B -16.302809 -9.071197 26.063990 -24.296506 33.62399 33.66729 27.3420750
PIR -12.878450 -5.102027 7.449413 -26.511379 36.14112 29.98860 22.3506079
NHS -13.323982 -14.640687 24.727332 -32.848442 37.74529 39.71971 31.2397395
RAI2 -0.659221 -8.569657 17.811945 -6.276230 16.86507 10.65270 0.1340127
CC_20 CC_21 CC_22 CC_23 CC_24 CC_25 CC_26
NLGN4X -1.983134 27.883625 33.2232983 55.82479 -49.29908 -5.2951149 -10.365470
MID1 2.384703 29.601429 10.1729030 57.13947 -41.19563 -3.4128501 -8.639101
GPM6B 10.441748 46.110170 35.4883586 62.14384 -47.03972 0.7777975 2.551255
PIR 4.054382 38.327409 -1.8813108 56.79130 -42.78739 8.6379586 1.922499
NHS 12.243104 50.699331 0.3253091 72.52380 -52.12074 -22.4695772 -11.761002
RAI2 1.478498 8.276096 -6.5202355 28.90956 -15.77897 0.7887253 -5.934125
CC_27 CC_28 CC_29 CC_30 CC_31 CC_32
NLGN4X 50.80880 0.8454094 -2.0199584 -24.747096 8.484102 -5.0625776
MID1 57.47852 -5.0214876 -18.8020772 -23.497848 -3.948140 -0.7944409
GPM6B 18.04253 -18.1110210 0.6712343 -7.700965 -15.250903 -13.1586061
PIR 12.86016 -29.2195386 -6.3778616 -20.628159 -3.442443 3.1870338
NHS 60.58913 2.6344999 -23.9572639 -19.400838 -3.720430 -1.0672745
RAI2 22.89196 -5.9624713 -8.3789681 -7.188969 10.579768 1.9609960
CC_33 CC_34 CC_35 CC_36 CC_37 CC_38
NLGN4X 1.7257754 44.4451443 -9.626847 -14.831342 17.882448 -7.379775
MID1 -0.7117021 7.3527930 6.908268 -17.934797 -5.632774 14.740645
GPM6B 15.6238366 27.8860378 -1.752459 -10.296515 13.731050 20.222309
PIR -13.5858133 -0.3524885 -5.862412 9.727464 -2.230001 13.669385
NHS 16.2295116 19.5320756 -7.337682 -17.486711 -3.432934 13.936299
RAI2 -5.9997071 7.0293173 4.645492 -13.044811 -4.813648 3.446370
CC_39 CC_40 CC_41 CC_42 CC_43 CC_44
NLGN4X 8.211748 25.830155 20.152715 -15.079576 2.124619 28.453037
MID1 4.843137 1.390995 -24.879273 -12.840269 -14.413698 -13.548004
GPM6B -14.490552 7.126468 -11.405305 8.985858 4.924844 7.206481
PIR -2.397024 -6.452931 -1.682569 -6.224321 -4.457405 -23.933984
NHS 5.827346 4.785578 -38.768441 -4.727424 -30.508757 -23.290311
RAI2 12.099638 10.137314 -16.286952 -2.858716 -3.472363 -5.400907
CC_45 CC_46 CC_47 CC_48 CC_49 CC_50
NLGN4X -6.9342924 10.4382923 -4.256122 -13.224380 22.845974 -7.615940
MID1 0.7470869 -3.2981609 -7.790328 -5.986730 7.282175 -2.038706
GPM6B 16.6984587 1.8428580 10.249337 7.522833 16.213984 -21.863452
PIR 12.5656339 0.1864377 -1.824583 -10.163720 14.170049 -17.627980
NHS 13.9001566 -26.7620663 8.517330 34.915100 20.970032 -51.066742
RAI2 1.2253744 -4.4216735 -1.060189 11.693987 -1.489973 -18.912411
Sorry for this question, I've solved this problem.
I used cca <- as.data.frame(Embeddings(PR_CCA, reduction = "cca"))
to extract cca components.
Now, this error no longer appears.
Error in Loadings(object = x)[i, j, drop = drop, ...] :
subscript out of bounds.
But I met a new error as follow:
pairing <- pairCells(ATAC = ATAC_PCs,
RNA = RNA_PCs,
keepUnique = TRUE)
Running geodesic pairing in after chunking data ..
Number of cells in bigger dataset: 17737
Number of cells in smaller dataset: 17327
Difference in cell count between 2 datasets: 410
Chunking larger dataset to match smaller datset ..
Chunk size n = 17327 cells
Total number of chunks: 2
Chunk # 1
No. cells in chunk: 17327
Attaching package: ‘igraph’
The following object is masked from ‘package:GenomicRanges’:
union
The following object is masked from ‘package:IRanges’:
union
The following object is masked from ‘package:S4Vectors’:
union
The following objects are masked from ‘package:BiocGenerics’:
normalize, path, union
The following objects are masked from ‘package:dplyr’:
as_data_frame, groups, union
The following objects are masked from ‘package:stats’:
decompose, spectrum
The following object is masked from ‘package:base’:
union
Attaching package: ‘FNN’
The following object is masked from ‘package:igraph’:
knn
Attaching package: ‘pracma’
The following objects are masked from ‘package:Matrix’:
expm, lu, tril, triu
Constructing KNN graph for computing geodesic distance ..
Computing graph-based geodesic distance ..
# KNN subgraphs detected:
3
Skipping subgraphs with either ATAC/RNA cells fewer than: 50 ..
Pairing cells for subgraph No. 1
Total ATAC cells in subgraph: 17211
Total RNA cells in subgraph: 16816
Subgraph size: 16816
Search threshold being used: 6727
Error in distmat(subgraph_ATAC_pcs, subgraph_RNA_pcs) :
X and Y must be numeric vectors or matrices.
@vkartha Sorry to bother you, but do you have any suggestions for this error as above?
Sorry for this question, I've solved this problem. I used
cca <- as.data.frame(Embeddings(PR_CCA, reduction = "cca"))
to extract cca components. Now, this error no longer appears. Error in Loadings(object = x)[i, j, drop = drop, ...] : subscript out of bounds.But I met a new error as follow:
pairing <- pairCells(ATAC = ATAC_PCs, RNA = RNA_PCs, keepUnique = TRUE) Running geodesic pairing in after chunking data .. Number of cells in bigger dataset: 17737 Number of cells in smaller dataset: 17327 Difference in cell count between 2 datasets: 410 Chunking larger dataset to match smaller datset .. Chunk size n = 17327 cells Total number of chunks: 2 Chunk # 1 No. cells in chunk: 17327 Attaching package: ‘igraph’ The following object is masked from ‘package:GenomicRanges’: union The following object is masked from ‘package:IRanges’: union The following object is masked from ‘package:S4Vectors’: union The following objects are masked from ‘package:BiocGenerics’: normalize, path, union The following objects are masked from ‘package:dplyr’: as_data_frame, groups, union The following objects are masked from ‘package:stats’: decompose, spectrum The following object is masked from ‘package:base’: union Attaching package: ‘FNN’ The following object is masked from ‘package:igraph’: knn Attaching package: ‘pracma’ The following objects are masked from ‘package:Matrix’: expm, lu, tril, triu Constructing KNN graph for computing geodesic distance .. Computing graph-based geodesic distance .. # KNN subgraphs detected: 3 Skipping subgraphs with either ATAC/RNA cells fewer than: 50 .. Pairing cells for subgraph No. 1 Total ATAC cells in subgraph: 17211 Total RNA cells in subgraph: 16816 Subgraph size: 16816 Search threshold being used: 6727
Error in distmat(subgraph_ATAC_pcs, subgraph_RNA_pcs) : X and Y must be numeric vectors or matrices.