GregorySchwartz/too-many-cells

Expected runtime


Hello,

Great program so far! We are looking forward to our analysis using AnnoSpat and the spatial module in too-many-cells. Is it typical for only one core to be used during the spatial processing? I am running a Docker container of v3.0.1.0 in WSL2, which is given 50GB of 64GB RAM, with an 8-core i7-7700 CPU. The input is ~137,000 cells from CODEX imaging of 23 markers, assigned using AnnoSpat:

docker run --memory=55g -v "$HOME:$HOME" \
  gregoryschwartz/too-many-cells:3.0.1.0 spatial \
  --matrix-transpose \
  -z QuantileNorm \
  -z TfIdfNorm \
  -m /home/smith6jt/AnnoSpat/measurements.csv \
  -j /home/smith6jt/AnnoSpat/spatial.csv \
  -o /home/smith6jt/outputdir/full_marker/tmc \
  -l /home/smith6jt/outputdir/full_marker/trte_labels_ELM_spleen.csv \
  --mark "ALL"

There are 31 expected cell types, and for now each relationship file is taking almost 30 minutes, so at this pace all combinations will take quite a long time for a single sample. Is there anything I can do to speed this up?
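For a rough sense of scale, here is a back-of-the-envelope estimate. It assumes one relationship file per unordered pair of distinct cell types, which is only a guess at what `--mark "ALL"` produces; the function name `estimate_hours` and the `include_self_pairs` switch are hypothetical helpers for this sketch, not part of too-many-cells.

```python
from math import comb


def estimate_hours(n_types: int, minutes_per_file: float,
                   include_self_pairs: bool = False) -> float:
    """Estimate total wall-clock hours for a sequential, single-core run.

    Assumption (not confirmed by the tool's documentation): one relationship
    file is written per unordered pair of cell types.
    """
    # Number of unordered pairs of distinct cell types: n choose 2.
    n_files = comb(n_types, 2)
    # Optionally count each cell type paired with itself as well.
    if include_self_pairs:
        n_files += n_types
    return n_files * minutes_per_file / 60


# 31 cell types at roughly 30 minutes per relationship file.
hours = estimate_hours(31, 30)
print(f"{hours:.1f} hours, about {hours / 24:.1f} days")
```

Under those assumptions that is 465 files, so on the order of nine to ten days for one sample if the files are produced one after another.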

Thanks!

Thank you for your interest in our tools! Yes, the speed is not ideal. From my benchmarking, I remember the main bottleneck being the sparse matrix library, sparse-linear-algebra. I had a branch that switched to eigen, but at the time that library was missing some necessary functions.