GregorySchwartz/too-many-cells

TooManyCellsR

pedriniedoardo opened this issue · 14 comments

Hello,
I am trying to run the example code for the tooManyCells function from the TooManyCellsR package in R.
I am using Windows 10.
I have followed the docker workflow. I can confirm the container works:

PS C:\Windows\system32> docker run -it --rm -v "/home/username:/home/username" gregoryschwartz/too-many-cells:0.1.5.0 -h
too-many-cells, Gregory W. Schwartz. Clusters and analyzes single cell data.

Usage: too-many-cells (make-tree | interactive | differential | diversity |
                      paths)

Available options:
  -h,--help                Show this help text

Available commands:
  make-tree
  interactive
  differential
  diversity
  paths

In R I follow the example code for the tooManyCells function and specify the docker argument:

library(TooManyCellsR)
input <- system.file("extdata", "mat.csv", package="TooManyCellsR")
inputLabels <- system.file("extdata", "labels.csv", package="TooManyCellsR")
df = read.csv(input, row.names = 1, header = TRUE)
mat = Matrix::Matrix(as.matrix(df), sparse = TRUE)
labelsDf = read.csv(inputLabels, header = TRUE)
res = tooManyCells(docker = "gregoryschwartz/too-many-cells:0.1.5.0",mat, labels = labelsDf
                    , args = c( "make-tree"
                                , "--no-filter"
                                , "--normalization", "NoneNorm"
                                , "--draw-max-node-size", "40"
                                , "--draw-max-leaf-node-size", "70"
                    )
)

I can create every object except res, which fails with the following error:

Error in wrap.url(file, load.image.internal) : File not found
In addition: Warning messages:
1: In dir.create(output, recursive = TRUE) : 'out' already exists
2: In system2("docker", args = c(dockerArgs, args, autoArgs), stdout = TRUE) :
  running command '"docker" run -i --rm -v C:\Users\pedri\AppData\Local\Temp\Rtmp8sTiO7:C:\Users\pedri\AppData\Local\Temp\Rtmp8sTiO7 -v C:\Users\pedri\Documents\out:C:\Users\pedri\Documents\out gregoryschwartz/too-many-cells:0.1.5.0 make-tree --no-filter --normalization NoneNorm --draw-max-node-size 40 --draw-max-leaf-node-size 70 --matrix-path C:\Users\pedri\AppData\Local\Temp\Rtmp8sTiO7 --output C:\Users\pedri\Documents\out --labels-file C:\Users\pedri\AppData\Local\Temp\Rtmp8sTiO7/labels.csv' had status 125

Any idea how to fix the issue?

Here is my session:

> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] TooManyCellsR_0.1.1.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.3       lattice_0.20-38  png_0.1-7        tiff_0.1-5       grid_3.6.2       imager_0.42.1    magrittr_1.5     stringi_1.4.3   
 [9] rlang_0.4.2      Matrix_1.2-18    bmp_0.3          tools_3.6.2      stringr_1.4.0    purrr_0.3.3      igraph_1.2.4.2   jpeg_0.1-8.1    
[17] compiler_3.6.2   pkgconfig_2.0.3  readbitmap_0.1.5

I unfortunately don't have Windows to test the fix on, but try the newest version of TooManyCellsR on github. I believe it was a filepath issue with Windows which should be fixed.

Hello @GregorySchwartz , thank you for looking into that.

I tried what you suggested, but I still get the same error:

> res <- tooManyCells(docker = "gregoryschwartz/too-many-cells:0.2.2.0",
+                    mat,
+                    labels = labelsDf,
+                    args = c( "make-tree"
+                                 , "--no-filter"
+                                 , "--normalization", "NoneNorm"
+                                 , "--draw-max-node-size", "40"
+                                 , "--draw-max-leaf-node-size", "70"
+                     )
+ )
Error in wrap.url(file, load.image.internal) : File not found
In addition: Warning messages:
1: In dir.create(output, recursive = TRUE) : 'out' already exists
2: In system2("docker", args = c(dockerArgs, args, autoArgs), stdout = TRUE) :
  running command '"docker" run -i --rm -v C:\Users\pedri\AppData\Local\Temp\RtmpQVhJ7w:C:\Users\pedri\AppData\Local\Temp\RtmpQVhJ7w -v C:\Users\pedri\Documents\out:C:\Users\pedri\Documents\out gregoryschwartz/too-many-cells:0.2.2.0 make-tree --no-filter --normalization NoneNorm --draw-max-node-size 40 --draw-max-leaf-node-size 70 --matrix-path C:\Users\pedri\AppData\Local\Temp\RtmpQVhJ7w --output C:\Users\pedri\Documents\out --labels-file C:\Users\pedri\AppData\Local\Temp\RtmpQVhJ7w/labels.csv' had status 125

Here is the session info, to confirm that the new version of the R package is the one in use:

> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] TooManyCellsR_0.1.1.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.3        compiler_3.6.2    imager_0.42.1     prettyunits_1.0.2 remotes_2.1.0     tools_3.6.2       testthat_2.3.1   
 [8] digest_0.6.25     pkgbuild_1.0.6    pkgload_1.0.2     memoise_1.1.0     lattice_0.20-38   pkgconfig_2.0.3   png_0.1-7        
[15] rlang_0.4.5       Matrix_1.2-18     igraph_1.2.4.2    cli_2.0.2         rstudioapi_0.10   curl_4.3          bmp_0.3          
[22] readbitmap_0.1.5  stringr_1.4.0     withr_2.1.2       desc_1.2.0        fs_1.3.1          devtools_2.2.1    rprojroot_1.3-2  
[29] grid_3.6.2        glue_1.3.1        R6_2.4.1          jpeg_0.1-8.1      processx_3.4.1    fansi_0.4.1       sessioninfo_1.1.1
[36] purrr_0.3.3       callr_3.4.0       magrittr_1.5      backports_1.1.5   ps_1.3.0          ellipsis_0.3.0    usethis_1.5.1    
[43] assertthat_0.2.1  tiff_0.1-5        stringi_1.4.6     crayon_1.3.4   

I also pulled the newer Docker image, but that doesn't seem to fix the error:

PS C:\Windows\system32> docker run -it --rm -v "/home/username:/home/username" gregoryschwartz/too-many-cells:0.2.2.0 -h
too-many-cells, Gregory W. Schwartz. Clusters and analyzes single cell data.

Usage: too-many-cells (make-tree | interactive | differential | diversity |
                      paths)

Available options:
  -h,--help                Show this help text

Available commands:
  make-tree
  interactive
  differential
  diversity
  paths

Did you increase the memory usable by docker (https://docs.docker.com/config/containers/resource_constraints/)? It's possible docker ran out of memory and crashed silently, not producing output.

I just tried increasing the resources available to Docker, and I still get the same error.

PS C:\Windows\system32> docker info
Client:
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 0
 Server Version: 19.03.5
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: b34a5c8af56e510852c35414db4c1f4fa6172339
 runc version: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.19.76-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 5.809GiB
 Name: docker-desktop
 ID: JZHY:T7HC:FIAQ:LDD2:K7Z5:2OJX:WMPU:PGF3:OKIK:T5S5:5H3K:4HQ7
 Docker Root Dir: /var/lib/docker
 Debug Mode: true
  File Descriptors: 33
  Goroutines: 50
  System Time: 2020-03-03T18:00:48.121623967Z
  EventsListeners: 3
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
 Product License: Community Engine

Considering the size of the sample data, I assume that should be more than enough?

Are you able to run too-many-cells without the wrapper?

docker run -i --rm -v C:\Users\pedri:C:\Users\pedri gregoryschwartz/too-many-cells:0.2.2.0 make-tree --no-filter --normalization NoneNorm --draw-max-node-size 40 --draw-max-leaf-node-size 70 --matrix-path C:\Users\pedri\path\to\mat --output out --labels-file C:\Users\pedri\path\to\labels.csv

I wonder if it has to do with the different notation on Windows. Try:

docker run -i --rm -v C:\Users\pedri\path\to\mat.csv:/mat.csv -v C:\Users\pedri\path\to\labels.csv:/labels.csv -v C:\Users\pedri\path\to\out:/out gregoryschwartz/too-many-cells:0.2.2.0 make-tree --no-filter --normalization NoneNorm --draw-max-node-size 40 --draw-max-leaf-node-size 70 --matrix-path /mat.csv --output /out --labels-file /labels.csv

If that works, then it's because the filesystem inside the Docker container differs from the one on the Windows host.

I guess there is some issue with the Windows path. Here is the output of the first command:

PS C:\Windows\system32> docker run -i --rm -v C:\Users\pedri:C:\Users\pedri gregoryschwartz/too-many-cells:0.2.2.0 make-tree --no-filter --normalization NoneNorm --draw-max-node-size 40 --draw-max-leaf-node-size 70 --matrix-path C:\Users\pedri\Documents\training\bioinformatic\TooManyCells\test\mat.csv --output C:\Users\pedri\Documents\training\bioinformatic\TooManyCells\test --labels-file C:\Users\pedri\Documents\training\bioinformatic\TooManyCells\test\label.csv
C:\Program Files\Docker\Docker\resources\bin\docker.exe: Error response from daemon: invalid mode: \Users\pedri.
See 'C:\Program Files\Docker\Docker\resources\bin\docker.exe run --help'.
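
One plausible reading of the "invalid mode" error, offered as an assumption rather than a confirmed description of Docker's parser: a `-v` specification is colon-separated (`source:destination:mode`), and the drive-letter colon in the destination adds an extra field. The R sketch below only splits the string to show where `\Users\pedri` could end up; the variable names are illustrative.

```r
# The -v value used in the command above (backslashes escaped for R).
spec <- "C:\\Users\\pedri:C:\\Users\\pedri"

# Split on every colon: a drive letter contributes a colon of its own.
fields <- strsplit(spec, ":", fixed = TRUE)[[1]]
print(fields)

# Assumption: the host side is read with its drive letter re-attached,
# leaving two more fields for the container side.
host <- paste(fields[1:2], collapse = ":")
rest <- fields[-(1:2)]

cat("host:       ", host, "\n")
cat("destination:", rest[1], "\n")
cat("mode:       ", rest[2], "\n")
```

Under that assumption the trailing `\Users\pedri` lands in the mode slot, which matches the text of the error message.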

The second suggestion starts the process but stops at 10%:

PS C:\Windows\system32> docker run -i --rm -v C:\Users\pedri\Documents\training\bioinformatic\TooManyCells\test\mat.csv:/mat.csv -v C:\Users\pedri\Documents\training\bioinformatic\TooManyCells\test\label.csv:/labels.csv -v C:\Users\pedri\Documents\training\bioinformatic\TooManyCells\test:/out gregoryschwartz/too-many-cells:0.2.2.0 make-tree --no-filter --normalization NoneNorm --draw-max-node-size 40 --draw-max-leaf-node-size 70 --matrix-path /mat.csv --output /out --labels-file /labels.csv
Planning leaf colors [=====>..............................................]  10%too-many-cells: /labels.csv: openBinaryFile: inappropriate type (is a directory)

Try:

docker run -i --rm -v C:\Users\pedri\Documents\training\bioinformatic\TooManyCells\test:/ gregoryschwartz/too-many-cells:0.2.2.0 make-tree --no-filter --normalization NoneNorm --draw-max-node-size 40 --draw-max-leaf-node-size 70 --matrix-path /mat.csv --output /out --labels-file /labels.csv

Still an error:

PS C:\Windows\system32> docker run -i --rm -v C:\Users\pedri\Documents\training\bioinformatic\TooManyCells\test:/ gregoryschwartz/too-many-cells:0.2.2.0 make-tree --no-filter --normalization NoneNorm --draw-max-node-size 40 --draw-max-leaf-node-size 70 --matrix-path /mat.csv --output /out --labels-file /labels.csv
C:\Program Files\Docker\Docker\resources\bin\docker.exe: Error response from daemon: invalid volume specification: '/host_mnt/c/Users/pedri/Documents/training/bioinformatic/TooManyCells/test:/': invalid mount config for type "bind": invalid specification: destination can't be '/'.
See 'C:\Program Files\Docker\Docker\resources\bin\docker.exe run --help'.

Try a different mount point:

docker run -i --rm -v C:\Users\pedri\Documents\training\bioinformatic\TooManyCells\test:/work gregoryschwartz/too-many-cells:0.2.2.0 make-tree --no-filter --normalization NoneNorm --draw-max-node-size 40 --draw-max-leaf-node-size 70 --matrix-path /work/mat.csv --output /work/out --labels-file /work/labels.csv

yay!

PS C:\Windows\system32> docker run -i --rm -v C:\Users\pedri\Documents\training\bioinformatic\TooManyCells\test:/work gregoryschwartz/too-many-cells:0.2.2.0 make-tree --no-filter --normalization NoneNorm --draw-max-node-size 40 --draw-max-leaf-node-size 70 --matrix-path /work/mat.csv --output /work/out --labels-file /work/labels.csv
Painting sketches [======================================>................]  70%
No labels for leaves based on chosen clumpiness method, skipping clumpiness plot.
Packing up [==============================================================] 100%cell,cluster,path
cell4,1,1/0
cell5,1,1/0
cell1,2,2/0
cell2,2,2/0
cell3,2,2/0
cell6,2,2/0
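
The cluster assignments at the end of that output are plain CSV, so they can be read into R without the wrapper. A minimal sketch, assuming the progress-bar text has been removed (or was written to a different stream) so that only the CSV lines remain; `clusterCsv` and `clusters` are illustrative names.

```r
# Cluster assignments copied from the run above, progress bar removed.
clusterCsv <- "cell,cluster,path
cell4,1,1/0
cell5,1,1/0
cell1,2,2/0
cell2,2,2/0
cell3,2,2/0
cell6,2,2/0"

# read.csv accepts the text directly; keep cell names and paths as strings.
clusters <- read.csv(text = clusterCsv, stringsAsFactors = FALSE)

# Number of cells per cluster.
print(table(clusters$cluster))
```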

Well, the good news is that you have a way to run it. The bad news is that TooManyCellsR needs more work before it can be used on Windows, because of the mismatch between Windows paths and the paths inside the Docker container.
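
A rough sketch of the kind of change that might be needed, based only on the command that worked above: mount the Windows directory at a POSIX-style path inside the container and pass container paths to too-many-cells. `toContainerMount` and the `/work` mount point are hypothetical and not part of TooManyCellsR; the host path is passed through unchanged, as in the working command.

```r
# Hypothetical helper (not part of TooManyCellsR): pair a host directory with
# a mount point inside the container and translate file names to that side.
toContainerMount <- function(hostDir, mountPoint = "/work") {
  list(
    # Arguments for `docker run`: -v <host directory>:<container mount point>
    volume = c("-v", paste0(hostDir, ":", mountPoint)),
    # Path of a file from hostDir as seen inside the container.
    path = function(name) paste(mountPoint, name, sep = "/")
  )
}

m <- toContainerMount(
  "C:\\Users\\pedri\\Documents\\training\\bioinformatic\\TooManyCells\\test"
)

dockerArgs <- c(
  "run", "-i", "--rm", m$volume,
  "gregoryschwartz/too-many-cells:0.2.2.0",
  "make-tree", "--no-filter",
  "--normalization", "NoneNorm",
  "--matrix-path", m$path("mat.csv"),
  "--labels-file", m$path("labels.csv"),
  "--output", m$path("out")
)

# Print the command line for inspection. Running it would be
# system2("docker", args = dockerArgs, stdout = TRUE).
cat("docker", dockerArgs, "\n")
```

Only the string handling is exercised here; whether this fits the wrapper's internals is untested.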

Thank you very much @GregorySchwartz!
I will tell people in the lab that if they want to run your algorithm from R they will have to use Linux (where it works perfectly), and that if they want to stay on Windows they will need to run it through Docker directly. Thank you very much for the support.