Proseg fails to read any transcripts from some .csv.gz files
Hi,
I was coming across a lot of missing FOVs and saw that you recently updated proseg to 1.0.6, which has better compatibility with CosMx data and may help with the issue (#26). However, I run into an error with it. I'll put the error and a sample of the transcript file below. Note: I managed to run this on 1.0.5 by editing the column names, so the data should be OK. Removing the --use-cell-initialization flag results in the same error.
(base) gordonbeattie@192 L1_SU500 % proseg -V
proseg 1.0.6
(base) gordonbeattie@192 L1_SU500 % proseg --cosmx L1_SU500_tx_file.csv.gz --use-cell-initialization
Using 8 threads
thread 'main' panicked at /Users/gordonbeattie/.cargo/registry/src/index.crates.io-6f17d22bba15001f/proseg-1.0.6/src/main.rs:511:18:
index out of bounds: the len is 0 but the index is 0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
(base) gordonbeattie@192 L1_SU500 % gzip -cd L1_SU500_tx_file.csv.gz| head
fov,cell_ID,cell,x_local_px,y_local_px,x_global_px,y_global_px,z,target,CellComp
1,0,c_1_1_0,4243,67,66309.5474243164,55391.5182749431,7,Scd2,None
1,0,c_1_1_0,4243,1035,66309.4480832418,54423.7534205119,1,Tmsb4x,None
1,0,c_1_1_0,4242,1627,66308.8361422221,53831.2435150147,3,Atp1a2,None
1,0,c_1_1_0,4242,1898,66308.2679112752,53560.6026649475,1,Pcp4,None
1,0,c_1_1_0,4243,2704,66309.7858428955,52753.8736661275,1,Pfkm,None
1,0,c_1_1_0,4243,3836,66309.1977437337,51622.1841176351,1,Cst7,None
1,0,c_1_1_0,4243,3880,66309.7858428955,51577.8342882792,5,Mdh1,None
1,0,c_1_1_0,4242,3863,66308.2281748454,51595.2348709107,3,Camk2a,None
1,0,c_1_1_0,4242,3846,66308.2679112752,51611.9321187337,5,Rps9,None
Thanks in advance for any assistance!
All the best,
Gordon
Hi Gordon,
It seems that no transcripts were read for some reason. I'm not sure what's going on here, but can you confirm that the cell_ID column has some non-zero values in this data?
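One way to run that check without loading the whole file into memory is sketched below in Python. The helper names `summarize_column` and `count_assigned` are hypothetical and not part of proseg, and treating a cell_ID of 0 as "not assigned to a cell" is an assumption based on the sample rows above.

```python
import csv
import gzip
from collections import Counter


def summarize_column(path, column="cell_ID"):
    """Count how often each value of `column` occurs in a gzipped CSV."""
    counts = Counter()
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.DictReader(handle)
        # Fail early with the actual header if the column name differs.
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(
                f"column {column!r} not found; header is {reader.fieldnames}"
            )
        for row in reader:
            counts[row[column]] += 1
    return counts


def count_assigned(counts, unassigned="0"):
    """Return (assigned, total), treating `unassigned` as background (assumption)."""
    total = sum(counts.values())
    return total - counts.get(unassigned, 0), total
```

For example, `count_assigned(summarize_column("L1_SU500_tx_file.csv.gz"))` would return the number of transcripts with a non-zero cell_ID alongside the total number of rows. If the header check raises, the column names in the file differ from what the sketch expects.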
Thanks for the response, I can confirm the cell_ID has some non-zero values, although most of them are 0. I'll put a few metrics below to give a little more insight.
> head(table(tx.list$Nanostring$cell_ID))
       0        1        2        3        4        5
18825028    60665    53141    58098    63807    59214
> length(unique(tx.list$Nanostring$cell_ID))
[1] 1484
> length(unique(tx.list$Nanostring$fov))
[1] 169
> length(unique(tx.list$Nanostring$cell))
[1] 160243
Having the same issue trying to run on CosMX
proseg --cosmx Diana_HEM_CR_FF_EM_NR4A145koko315pA7_STJ_N_R1_tx_file.csv.gz
Using 192 threads
thread 'main' panicked at /home/fsegato/.cargo/registry/src/index.crates.io-6f17d22bba15001f/proseg-1.1.0/src/main.rs:521:18:
index out of bounds: the len is 0 but the index is 0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Having the same issue with CosMx Data
proseg --cosmx S22113961S22113960_tx_file.csv.gz --use-cell-initialization
Using 24 threads
thread 'main' panicked at /home/asmilags/.cargo/registry/src/index.crates.io-6f17d22bba15001f/proseg-1.1.3/src/main.rs:522:18:
index out of bounds: the len is 0 but the index is 0
stack backtrace:
0: rust_begin_unwind
1: core::panicking::panic_fmt
2: core::panicking::panic_bounds_check
3: proseg::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
And the same with Xenium data processed with xenium_ranger version 3:
proseg --xenium transcripts.csv.gz
Using 2 threads
thread 'main' panicked at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/proseg-1.0.0/src/main.rs:474:18:
index out of bounds: the len is 0 but the index is 0
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Each of these reported errors is due to no transcripts being read by proseg, but I've not been able to suss out where the loss is occurring, and I haven't been able to reproduce it.
If someone would be so kind as to email (or otherwise send) me data that reproduces it, I'll fix this right away. I suspect that if this error happens with the full transcripts file, the first 10k lines or so should generate the same error and be small enough to email.
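A small subset like that could be cut with something along these lines. This is only a sketch: `write_head` is a hypothetical helper, not part of proseg, and it simply copies the header plus the first `n_lines` data rows into a new gzipped file.

```python
import gzip


def write_head(src, dst, n_lines=10_000):
    """Copy the header plus the first n_lines data rows of a gzipped text file.

    Returns the number of lines written, header included.
    """
    written = 0
    with gzip.open(src, "rt", newline="") as fin, \
            gzip.open(dst, "wt", newline="") as fout:
        for i, line in enumerate(fin):
            # Line 0 is the header; stop once n_lines data rows are copied.
            if i > n_lines:
                break
            fout.write(line)
            written += 1
    return written
```

Calling `write_head("transcripts.csv.gz", "transcripts_head.csv.gz")` and then running proseg on the smaller file with the same flags would show whether the subset still triggers the panic. If it does not, a larger `n_lines` may be needed.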