Error in iteration 2: arguments imply differing number of rows

Question

royfrancis opened this issue 4 months ago · 2 comments

I am running SCEVAN like so:

library(SCEVAN)
dat <- readRDS("TN-B1-4031.rds")
SCEVAN::pipelineCNA(dat, sample = "TN-B1-4031", par_cores = 20, SUBCLONES = TRUE, plotTree = TRUE)

and it crashes with the following error:

Error in iteration 2: arguments imply differing number of rows: 0, 48

It only happens with some of my samples.
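
For context, this message has the same wording that base R's `data.frame()` uses when its columns have different lengths. The log does not show where inside SCEVAN it is raised, so the following is only a minimal base-R reproduction of the message itself, with made-up column names (`cell`, `score`), not a trace of SCEVAN internals:

```r
# Illustration only: a zero-length column combined with a length-48 column
# makes data.frame() fail with the same message seen in the SCEVAN run.
msg <- tryCatch(
  data.frame(cell = character(0), score = seq_len(48)),
  error = function(e) conditionMessage(e)
)
print(msg)
```

The `0, 48` pair suggests that one of the inputs being combined was empty while the other had one entry per tumor cell (48 in this run), but that reading is an assumption.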

The data looks like this:

> dat[1:5,1:5]
         pal_2021_GSM4909285_1-0 pal_2021_GSM4909285_2-0
UQCR10                         8                       4
SPP1                           0                       0
HLA-DPB1                       0                       0
B2M                           14                      12
RAB14                          2                       1
         pal_2021_GSM4909285_3-0 pal_2021_GSM4909285_4-0
UQCR10                         0                      10
SPP1                           0                       0
HLA-DPB1                       0                       0
B2M                            1                      10
RAB14                          0                       2
         pal_2021_GSM4909285_5-0
UQCR10                         2
SPP1                           0
HLA-DPB1                       0
B2M                            2
RAB14                          2

The data file can be downloaded here. Here is another smaller dataset that fails.

Here is the full log:

[1] " raw data - genes: 13253 cells: 1482"
[1] "1) Filter: cells > 200 genes"
[1] "low data quality"
[1] "2) Filter: genes > 5% of cells"
[1] "2451 genes past filtering"
[1] "3) Annotations gene coordinates"
The number of elements of the gene set is less than the minimum allowed.
The number of elements of the gene set is less than the minimum allowed.
The number of elements of the gene set is less than the minimum allowed.
The number of elements of the gene set is less than the minimum allowed.
...
<This line is repeated hundreds of times>
...
The number of elements of the gene set is less than the minimum allowed.
The number of elements of the gene set is less than the minimum allowed.
The number of elements of the gene set is less than the minimum allowed.
The number of elements of the gene set is less than the minimum allowed.
[1] "found 30 confident non malignant cells"
[1] "2378 genes annotated"
[1] "4) Filter: genes involved in the cell cycle"
[1] "2205 genes past filtering "
[1] "5)  Filter: cells > 5genes per chromosome "
[1] "6) Log Freeman Turkey transformation"
[1] "A total of 174 cells, 2205 genes after preprocessing"
[1] "7) Measuring baselines (confident normal cells)"
[1] "8) Smoothing data"
[1] "9) Segmentation (VegaMC)"
[1] "10) Adjust baseline"
[1] "11) plot heatmap"
[1] "found 48 tumor cells"
[1] "time classify tumor cells:  23.0540297031403"
[1] "found 2 subclones"
percentage_cells_subsclone_1 percentage_cells_subsclone_2 
                   0.4166667                    0.5833333 
[1] "Segmentation of subclone :  1"
[1] "Segmentation of subclone :  2"
$`wu-2021_4465_subclone1`
   Chr     Start       End Alteration segm.mean
2    1  16974502  31944856         -2 -0.215354
43   8  11795573  54147901          2  0.095991
46   8  97644179 144428563          2  0.122258
58  11  20363685  73761137          2  0.156543
63  12    752593  96269835         -2 -0.358043
68  12 112013316 123633766          2  0.159426
70  13  36998816  77019143         -2 -0.229887
71  13  95677139 114305817         -2 -0.238055
82  16  47154387  89968060         -2 -0.225194
85  17   7240014  57006768         -2 -0.301897
89  18   3247481   9285985          2  0.183345
91  18  56597208  80033949          2  0.137885
92  19    571277   4670370          2  0.048158
95  19  41956681  54194536         -2 -0.180862
98  20  44496221  63891545         -2 -0.201054
99  21  17593653  46665124         -2 -0.305551
13   2 177212563 197516737         -1 -0.064736
15   3   3148992  11771350          1  0.189506
16   3  13549131  48504826         -1 -0.062771
19   3 139517434 158829719          1  0.107402
47   9    121038  35732395          1  0.127720
80  16   4461680  24572863          1  0.186913

$`wu-2021_4465_subclone2`
   Chr     Start       End Alteration segm.mean
4    1 150265399 167937040          2  0.244055
53  12    752593  14803540         -2 -0.254675
55  12  46358189 107713167         -2 -0.116435
58  13  19633681 114305817         -2 -0.121179
68  16   3018445  30085377          2  0.244846
2    1  19640554  39487177         -1 -0.140271
13   3  33798352  52239260         -1 -0.105971
28   6  30345131  33711727         -1 -0.044950
50  11  62433542  65862026          1  0.188088

$`wu-2021_4465_clone`
    Chr     Start       End Alteration segm.mean
5     1 203795654 246768137          2  0.316165
10    2    264140  88750935          2  0.319686
14    2 201071433 241686991          2  0.236119
21    3 184174689 195584205         -2 -0.211273
22    4   1166932 155953917         -2 -0.238754
26    5    218241  55535050          2  0.193638
29    5 135570679 163519336         -2 -0.207735
34    6  42744481 169254044          2  0.213288
39    7   1152071  66958551          2  0.423589
49    9  99222064 137615360         -2 -0.269008
54   10  80131682 133373689         -2 -0.161538
56   11    307631   6619461         -2 -0.448406
61   11 102520508 134253370         -2 -0.088353
72   14  34981957 105480170         -2 -0.192639
75   15  24823637  58933653         -2 -0.194629
87   17  68035519  82730328          2  0.054674
93   19   5891276  17323301         -2 -0.202012
101  22  28794555  50628173         -2 -0.263656

Error in iteration 2: arguments imply differing number of rows: 0, 48

I have also tried reinstalling SCEVAN to the current version as of 8-Aug-2024.

R version 4.3.2 (2023-10-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
SCEVAN_1.0.1

Answer 1 · 2024-10-07T10:30:09.000Z

Hi @royfrancis,
From the log, the error is due to the very low quality of the sample: only 2,451 genes pass the filters and the tumor fraction appears to be close to zero. Only 48 tumor cells were identified, so I don't recommend running the subclonal analysis on this sample.
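
To spot such samples before running the full pipeline, a rough pre-check along these lines may help. This is a minimal base-R sketch, not SCEVAN code: `qc_summary` is a hypothetical helper, and its thresholds are taken only from the filter labels printed in the log ("cells > 200 genes", "genes > 5% of cells"). How SCEVAN applies these filters internally is an assumption here, so the counts may not match the log exactly.

```r
# Hypothetical helper (not part of SCEVAN): approximate the first two
# filters named in the log on a genes x cells count matrix.
qc_summary <- function(counts, min_genes_per_cell = 200, min_cell_frac = 0.05) {
  counts <- as.matrix(counts)
  detected <- counts > 0
  # Cells in which more than `min_genes_per_cell` genes are detected
  keep_cells <- colSums(detected) > min_genes_per_cell
  # Genes detected in more than `min_cell_frac` of the retained cells
  if (any(keep_cells)) {
    keep_genes <- rowMeans(detected[, keep_cells, drop = FALSE]) > min_cell_frac
  } else {
    keep_genes <- rep(FALSE, nrow(counts))
  }
  c(genes_raw  = nrow(counts),
    cells_raw  = ncol(counts),
    cells_kept = sum(keep_cells),
    genes_kept = sum(keep_genes))
}

# Tiny made-up matrix (4 genes x 3 cells) to show the output shape
toy <- matrix(0, nrow = 4, ncol = 3)
toy[1:3, 1] <- 1
toy[1, 2] <- 1
print(qc_summary(toy, min_genes_per_cell = 1, min_cell_frac = 0.5))
```

On real data this would be called as `qc_summary(dat)`. For samples where few genes and cells survive, one option in line with the advice above is to rerun `pipelineCNA` with `SUBCLONES = FALSE`. Whether that avoids the error has not been confirmed in this thread.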

Thanks.

Answer 2 · 2024-10-07T16:03:25.000Z

Thanks for looking into this! 🙏🏼 This was helpful feedback.