hvalev/shiny-server-arm-docker

Speed up compilation of R-packages

hvalev opened this issue · 6 comments

Thanks. And nice image!

I was also about to add RUN mkdir -p ~/.R && echo "MAKEFLAGS = -j4" > ~/.R/Makevars to speed up compilation of R packages, both when building the image and for R itself when run inside the container. But then, crash. Apparently 4GB wasn't enough for running the final line:

RUN R -e "install.packages(c('shiny', 'Cairo'), repos='http://cran.rstudio.com/', clean = TRUE)"

Originally posted by @KasperSkytte in #20 (comment)

Pros: GitHub actions execution will be faster.
Cons: After deployment to SBCs such as a Raspberry Pi, building packages may fail. -> remove the -j4 flag after building the image.
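A minimal sketch of that approach, as shell steps the Dockerfile RUN instructions would execute (the -j4 value is illustrative; pick it to match the build machine's core count):

```shell
# During the image build: enable parallel compilation of R packages
# by setting make's job count in the user's Makevars file.
mkdir -p ~/.R
echo "MAKEFLAGS = -j4" > ~/.R/Makevars

# ... the R -e "install.packages(...)" build steps would go here ...

# Final build step: drop the flag again so that run-time installs on
# low-RAM SBCs (e.g. a 1GB Raspberry Pi) fall back to a single job.
rm -f ~/.R/Makevars
```

This way GitHub Actions builds get the speed-up, while deployed containers keep the conservative single-job default.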

Update the 'build-it-yourself' part of the documentation.

Great images! What is a good default setting when runtime builds happen inside the container? What do you think about trying to pick it up like so: NCPU=$(nproc --all) && JOBS=$((NCPU<3?1:NCPU - 1)) && echo $JOBS, i.e. always use at least one job (if there is only a single core), but also try to leave one core free for other things if doing runtime installs of source packages before or after service startup?
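Spelled out as a standalone snippet, the suggestion above would be (writing the result into ~/.R/Makevars is my addition, following the earlier comment):

```shell
# Pick a make job count: at least 1, and on boards with 3+ cores
# leave one core free for the running Shiny apps while source
# packages compile.
NCPU=$(nproc --all)
JOBS=$(( NCPU < 3 ? 1 : NCPU - 1 ))

mkdir -p ~/.R
echo "MAKEFLAGS = -j${JOBS}" > ~/.R/Makevars
```

So a single-core or dual-core board gets JOBS=1, while a quad-core board gets JOBS=3.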

Thanks! Originally I created those images for Raspberry Pi (now the repository also includes an amd64 build), which sometimes only have 1GB of RAM. For that reason, building some libraries on multiple cores may throw an OOM. So I picked the most conservative approach to use a single core at the expense of longer runtime installs.

@KasperSkytte yes, interesting. I'm not sure what the default parallelism / number of jobs is when building R packages from source without specifying it explicitly (this SO post has a comment that suggests not to set it).

I guess a sensible default is primarily useful when doing installs not at image build-time but "dynamically" during runtime (say, dropping an app into a running server which tries to install a package if it doesn't already exist), and when no binary installs are available (quite likely, as ARM builds of binary packages are probably not yet available through RSPM repositories).

Some of the newer RPi versions have more RAM (4/8 GB) and 4 cores, so it would be nice to be able to make full use of those... the conservative approach may not be ideal for such boards, I guess? It is a trade-off... Perhaps one should test a little to find the best default, if there is one?

Whenever possible it would be best to install deps at container build-time, I guess (extending with a new Dockerfile and switching to USER root first when installing the dependencies?), but that may not always be an option, and regardless it could be nice to also be able to install at run-time while the service is running.

Dependency detection for Shiny apps at run-time appears to pose some challenges, which can sometimes be worked around even where dynamic installation during run-time is not recommended.

So, I don't think I will be changing anything in the current setup. The ideas you have about dynamically installing dependencies rely on some auto-detect function, which is a problem for shiny-server rather than for a docker container of shiny server. I also know that newer versions have more RAM; however, to be most inclusive, using a single core appears to work. That said, you can always tune your installation by changing the init.sh file, which is run on every container start. There you can make various changes to the installation or configuration of shiny server, such as increasing the number of cores used for compilation.
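For anyone wanting to tune this per board, a hypothetical addition to init.sh could look like the sketch below (the 2GB memory threshold and the ~/.R/Makevars path are my assumptions, not part of the image):

```shell
# Hypothetical init.sh tweak: scale make parallelism to the board.
# Boards with <2GB RAM or fewer than 3 cores keep a single job to
# avoid OOM during compilation; bigger boards use all cores but one.
NCPU=$(nproc --all)
MEM_KB=$(awk '/MemTotal/ {print $2}' /proc/meminfo)

if [ "$MEM_KB" -lt 2000000 ] || [ "$NCPU" -lt 3 ]; then
  JOBS=1
else
  JOBS=$((NCPU - 1))
fi

mkdir -p ~/.R
echo "MAKEFLAGS = -j${JOBS}" > ~/.R/Makevars
```

Since init.sh runs on every container start, this would pick up the right value even when the same image is moved between a 1GB and an 8GB board.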