

Polyglot programming for single-cell analysis

This book is a collection of notebooks and explanations for the workshop on Polyglot programming for single-cell analysis given at the scverse Conference 2024. For more information, please visit the workshop page.

Installation

For the best polyglot experience on most platforms, we recommend renv to manage the R and Python dependencies. See the renv section below for instructions.

Alternatively, on Linux you can use Pixi to manage your development environment. On Windows and macOS ARM, Pixi's support for creating environments with R packages is currently limited: Pixi does not support post-link scripts, and the bioconda channel for Bioconductor packages does not yet support osx-arm64, so installing the R dependencies is more difficult there.

In a clean Linux shell without any active Python (deactivate) or Conda environments (conda deactivate), you can install all dependencies with the following command:

pixi install -a
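The "clean shell" requirement above can be checked before installing. The following is a minimal sketch, not part of the repository: `check_clean_env` is a hypothetical helper name, and it assumes the usual conventions that a virtualenv's activate script sets `VIRTUAL_ENV` and that `conda activate` sets `CONDA_PREFIX`.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): report whether a Python
# virtualenv or a Conda environment is active in the current shell.
# Assumes VIRTUAL_ENV is set by a virtualenv's activate script and
# CONDA_PREFIX is set by `conda activate`.
check_clean_env() {
    status=0
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "Active Python virtualenv: $VIRTUAL_ENV (run 'deactivate')" >&2
        status=1
    fi
    if [ -n "${CONDA_PREFIX:-}" ]; then
        echo "Active Conda environment: $CONDA_PREFIX (run 'conda deactivate')" >&2
        status=1
    fi
    return "$status"
}

# Only report; the install itself is left to the user.
if check_clean_env; then
    echo "Shell looks clean; you can run: pixi install -a"
else
    echo "Deactivate the environments listed above before running: pixi install -a"
fi
```

The script only reports what it finds, so it is safe to run repeatedly; it does not deactivate anything or invoke Pixi itself.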

On macOS ARM and Windows, we recommend using Docker.

Linux

To run the pipeline on Linux, use the following command:

pixi run pipeline

Docker

To run the pipeline with Docker, use the following commands. The image is about 5 GB and the pipeline can need around 20 GB of working memory, so increase the RAM allocated to Docker in its settings. The usecase/ and book/ folders are mounted into the Docker container, so you can edit the scripts and access the data.

docker pull berombau/polygloty-docker:latest
docker run -it -v "$(pwd)/usecase:/app/usecase" -v "$(pwd)/book:/app/book" berombau/polygloty-docker:latest pixi run pipeline
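Before starting the pipeline, it can help to confirm how much memory Docker actually has. The sketch below is an illustration under stated assumptions: `bytes_to_gib` and `has_enough_memory` are hypothetical helper names, and the 20 GiB default simply mirrors the rough figure quoted above rather than a measured requirement. It reads the total from `docker info --format '{{.MemTotal}}'`, which reports bytes; that figure can be a little below the configured allocation, so treat a near miss as a hint rather than a failure.

```shell
#!/bin/sh
# Hypothetical helper: convert a byte count to whole GiB (integer division).
bytes_to_gib() {
    echo $(( $1 / 1073741824 ))
}

# Hypothetical helper: succeed when the byte count in $1 is at least
# $2 GiB (default 20, mirroring the approximate figure in this README).
has_enough_memory() {
    required_gib=${2:-20}
    [ "$(bytes_to_gib "$1")" -ge "$required_gib" ]
}

# Query the Docker daemon only if the CLI is installed and the daemon answers.
if command -v docker >/dev/null 2>&1 \
    && mem_bytes=$(docker info --format '{{.MemTotal}}' 2>/dev/null); then
    case "$mem_bytes" in
        ''|*[!0-9]*)
            echo "Could not read Docker's memory total." ;;
        *)
            if has_enough_memory "$mem_bytes"; then
                echo "Docker has $(bytes_to_gib "$mem_bytes") GiB available."
            else
                echo "Docker has only $(bytes_to_gib "$mem_bytes") GiB; consider raising the limit."
            fi ;;
    esac
else
    echo "Docker is not installed or its daemon is not reachable."
fi
```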

renv

First time setup

To install the R and Python dependencies, start a new R session (with R or within RStudio) and run the following commands:

install.packages("renv")
renv::restore()

On macOS ARM, building some of the packages requires extra configuration and patience. The Docker approach is recommended for macOS ARM.

Adding new packages

If you want to install a new R package, use the following command:

renv::install("anndata")

If you want to install a new Python package, use the following command:

reticulate::py_install(c("rich>=13.7,<13.8", "anndata>=0.10.8,<0.11", "numpy>=1.24,<2", "scanpy>=1.10,<2", "mudata>=0.3,<0.4", "rpy2>=3.4,<4", "jupyter"))

After installing a new package, use the following command to update the renv.lock file:

renv::snapshot()

Using the environment

To preview the book with the renv environment, activate its Python virtual environment and start Quarto:

# ensure that jupyter can also use the renv environment
source renv/python/virtualenvs/renv-python-3.12/bin/activate
quarto preview

Or to render the slides:

source renv/python/virtualenvs/renv-python-3.12/bin/activate
quarto render
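The activation path above hard-codes Python 3.12. If your setup created the virtual environment for a different Python version, the directory name will differ. The following sketch locates the activate script by pattern instead; `find_renv_activate` is a hypothetical helper, and the `renv-python-*` naming is assumed from the path shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the first renv-managed virtualenv activate
# script found under the given project root (default: current directory).
# The renv-python-* directory naming is assumed from the path in this README.
find_renv_activate() {
    root=${1:-.}
    for candidate in "$root"/renv/python/virtualenvs/renv-python-*/bin/activate; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Show the command to run; activation must happen in your own shell.
if activate_script=$(find_renv_activate .); then
    echo "Run: . $activate_script && quarto preview"
else
    echo "No renv Python virtualenv found here; complete the first-time setup first."
fi
```

The script prints the command rather than activating the environment itself, because activation only affects the shell that sources the script.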

Extra

Building the Docker image yourself

To edit and build the Docker image yourself, you can use the following commands:

docker build -t polygloty-docker .
docker run -it -v "$(pwd)/usecase:/app/usecase" -v "$(pwd)/book:/app/book" polygloty-docker pixi run pipeline

To publish it to Docker Hub, use the following commands. The image is multi-architecture and supports both ARM64 and AMD64, so assign enough memory (~32 GB) and disk space (~100 GB) to Docker to build it.

docker login
docker buildx build --push --platform linux/amd64,linux/arm64 --tag berombau/polygloty-docker:latest .

More info on Pixi and Docker can be found here.