Docker provides a platform for pre-configured image, with all requisite softwares and packages of desired versions. In this tutorial, we demonstrate how to run Rstudio on docker, and further use this docker environment for single cell RNA-seq analysis.
Depending on your operating system, download and install corresponding docker on your local computer.
To facilitate the data transfer between our local computer and the docker environment, let's first create a local folder, where all future analysis results are stored. We create the folder on desktop: /Users/jil655/Desktop/sc_rnaseq_workshop
. You could create this folder anywhere you would like.
Start the docker software on your computer. Next, on your terminal, type the following command to download the pre-configured docker image for scRNA-seq:
docker run --rm -d -p 8787:8787 --name sc_rnaseq -e PASSWORD=1234 -v /Users/jil655/Desktop/sc_rnaseq_workshop:/home/rstudio liujihe/single_cell_rnaseq
Note:
- The first time you run this command, it will download the
liujihe/single_cell_rnaseq
docker image from DockerHub: https://hub.docker.com/r/liujihe/single_cell_rnaseq. This is the docker image we have prepared for you, which include all requisite R packages for the workshop. - What each option mean:
-p
: set the local port to access the docker container--name
: set a name for the container-v
: mount the container to a local host. The syntax islocal_folder_path:container_path
. For example, the container path here is set as/home/rstudio
. Before the:
, specify the path of local folder to store the result. This option allows synchronization of result from docker to the local folder, so that you don't lose result after deleting docker container.--rm
: automatically remove the existing docker container once it is closed. This prevents occupation of space with existing containers.
At this point, you can check whether the container is running, using the docker container ls
command. You should see a container ID and its associated properties.
docker container ls
Now that the container runs, open your web browser, and go to localhost:8787.
You will be prompted to log into Rstudio. The Username is rstudio
(this is by default), and the password is 1234
(the one you specified earlier in the docker run
command). You should now see a Rstudio interface in docker environment.
Libraries needed for this scRNA-seq workshop are already installed in the container. Now you just need to load all libraries.
library(Seurat)
library(tidyverse)
library(Matrix)
library(RCurl)
library(scales)
library(cowplot)
library(SingleCellExperiment)
library(AnnotationHub)
library(ensembldb)
Run sessionInfo
to make sure all libraries are loaded. Then you can perform analysis just like in your local Rstudio environment.
sessionInfo()
Create a project in Rstudio, and add folders data
, results
, figures
. Those folders are mounted on your local computer as well. Download the sequencing data to the data
folder on your local computer. You are now all set to start the actual analysis.
As you perform the analysis, newly-generated files will be automatically mounted to the local folder that you specified. When you finish the analysis, close the web browser. Then on your terminal, hit ctrl
+c
to end the program. Lastly, stop the container with the command:
docker container stop sc_rnaseq
Here we summarize some frequenctly used docker commands, all of which can be run on the command line.
docker images # list all docker images (use `-a` to show intermediate images)
docker rmi Image_ID # remove docker image
docker run --rm -p 8787:8787 -e PASSWORD=1234 rocker/rstudio # quick way to run an image
docker container ls # check status of container
docker container ls -a # check all containers (started and stopped)
docker container stop container_id # stop a container that is currently running
docker container start container_name # start an existing container
docker rm container_id # remove docker container
docker pull rocker/r-base # pull an image from DockerHub
Below are the steps to upload the image to docker hub. Currently, it is hosted at https://hub.docker.com/r/liujihe/single_cell_rnaseq. The last step shows how to pull the image to the local computer.
- login docker on command line
docker login
Follow username and password
- tag image
docker tag image_id dockerhub_username/image_name
- push image to the hub
docker push dockerhub_username/image_name
- Pull the public image to the local computer
docker pull liujihe/single_cell_rnaseq
- If you are interested in learning more about docker and reproducible research, you can follow this short tutorial.