GregorySchwartz/too-many-cells

Error when using matrix-path with cellranger .gz files

mcfefa opened this issue · 6 comments

I'm running this on our cluster, where the admins created a Singularity image from the Docker version of too-many-cells. I was able to start a run with a .csv input file, but ran into memory issues, so I wanted to try a small subset of just 2 samples. I tried loading the samples as shown in the workshop tutorial:

singularity run /share/data2/applications/singularity_images/too-many-cells.sif make-tree \
  --matrix-path /share/lab/me/scRNAseq/sample1/outs/filtered_feature_bc_matrix/ \
  --matrix-path /share/lab/me/scRNAseq/sample2/outs/filtered_feature_bc_matrix/ \
  --output outTest20200420 \
  > clustersTest.csv

and got the following error message:

Error in load(name, envir = .GlobalEnv) : 
  bad restore file magic number (file may be corrupted) -- no data loaded
Calls: sys.load.image -> load
In addition: Warning message:
file ‘.RData’ has magic number 'RDX3'
  Use of save versions prior to 2 is deprecated 
Execution halted
too-many-cells: readCreateProcess: R "-e" "cat(R.home())" "--quiet" "--slave" (exit 1): failed

The CellRanger data files are .gz files, but the documentation indicates that these can be read by too-many-cells. Please advise on what I should do to remedy this issue. I'm not trying to use the tool in R, but this seems like an R error.

Yes, too-many-cells can read matrix.mtx.gz, features.tsv.gz, and barcodes.tsv.gz files. too-many-cells uses R (in the docker image) for plotting some statistics and differential analysis after the tree is made. I suspect it has to do with how you made your singularity image. You would probably run into the same error with the csv file with that singularity image. What happens if you run the docker too-many-cells on the same data?

I first tried with a csv and was able to load the data and presumably start computation but ran into memory errors, which I'm troubleshooting with the HPC people at my institution.

Docker is not permitted on HPC systems due to security concerns, so I cannot test that specifically. I could, however, use pre-built Haskell binaries. See #14.

Can you test on a different computer?

No.

Can we use a newer version of R (3.5.0 or greater)? If so, we can rebuild the Singularity image with it and see if that works.
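For context on the R version question: the 'RDX3' magic number in the warning is the header of R's version 3 save format, which became the default in R 3.5.0 and cannot be read by older releases. The `sys.load.image` call in the traceback indicates that R is restoring a `.RData` file from the working directory at startup. Below is a minimal sketch, not part of too-many-cells, for checking which format a given `.RData` file uses. The function name `rdata_magic` is hypothetical, and the sketch assumes the file is either uncompressed or compressed with gzip, bzip2, or xz.

```python
import bz2
import gzip
import lzma


def rdata_magic(path):
    """Return the first 4 bytes of an R save file's header, e.g. 'RDX2' or 'RDX3'.

    Hypothetical helper for illustration only. R save files are usually
    compressed, so the compression is detected from the raw leading bytes
    before the header is read.
    """
    with open(path, "rb") as raw:
        lead = raw.read(6)

    if lead.startswith(b"\x1f\x8b"):          # gzip
        opener = gzip.open
    elif lead.startswith(b"BZh"):             # bzip2
        opener = bz2.open
    elif lead.startswith(b"\xfd7zXZ\x00"):    # xz
        opener = lzma.open
    else:                                     # assume uncompressed
        opener = open

    with opener(path, "rb") as handle:
        return handle.read(4).decode("ascii", errors="replace")
```

Calling `rdata_magic(".RData")` from the directory where the command was launched would show whether that file was written in the newer format. If it reports 'RDX3' and the R inside the image predates 3.5.0, moving the `.RData` file out of the working directory is one thing to try alongside rebuilding the image with a newer R.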

The Dockerfile is available in the repo if you wish to test different R versions.

@mcfefa I have packaged too-many-cells for Nix, and I recommend trying that out (see the documentation). It's a reproducible derivation that should take care of all dependencies, and it only requires root once, when installing Nix.