deepcell-tf

Deep Learning Library for Single Cell Analysis




deepcell-tf is a deep learning library for single-cell analysis of biological images. It is written in Python and built using TensorFlow 2.

This library allows users to apply pre-existing models to imaging data as well as to develop new deep learning models for single-cell analysis. This library specializes in models for cell segmentation (whole-cell and nuclear) in 2D and 3D images as well as cell tracking in 2D time-lapse datasets. These models are applicable to data ranging from multiplexed images of tissues to dynamic live-cell imaging movies.

deepcell-tf is one of several resources created by the Van Valen lab to facilitate the development and application of new deep learning methods to biology. Other projects within our DeepCell ecosystem include the DeepCell Toolbox for pre- and post-processing the outputs of deep learning models, DeepCell Tracking for creating cell lineages with deep-learning-based tracking models, and the DeepCell Kiosk for deploying workflows on large datasets in the cloud. Additionally, we have developed DeepCell Label for annotating high-dimensional biological images to use as training data.

Read the documentation at deepcell.readthedocs.io.

For more information on deploying models in the cloud, refer to the Kiosk documentation.

For TensorFlow 1.X or Python 2.7 support, please use deepcell 0.7.0 or earlier.

Examples

[Figure: a raw image shown alongside the corresponding tracked image.]

Getting Started

Install with pip

The fastest way to get started with deepcell-tf is to install the package with pip:

$ pip install deepcell
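
To confirm that the install succeeded, one option is to query the installed version from Python. The following is a minimal sketch that uses only the standard library (Python 3.8 or later); the helper name `installed_version` is illustrative and is not part of deepcell.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("deepcell")
    if found is None:
        print("deepcell is not installed in this environment")
    else:
        print(f"deepcell {found} is installed")
```

Checking the package metadata rather than importing the package keeps the check fast, since it does not load TensorFlow.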

Install with Docker

Docker containers with GPU support are also available from Docker Hub. To run the library locally on a GPU, make sure that the latest versions of nvidia-docker and CUDA are installed. Alternatively, Google Cloud Platform (GCP) offers prebuilt virtual machines preinstalled with CUDA, Docker, and the NVIDIA Container Toolkit.

Once nvidia-docker is installed, run the following command:

# Start a GPU-enabled container on one GPU
docker run --gpus '"device=0"' -it --rm \
    -p 8888:8888 \
    -v $PWD/notebooks:/notebooks \
    -v $PWD/data:/data \
    vanvalenlab/deepcell-tf:0.9.0-gpu

This will spin up a docker container with deepcell-tf installed and start a Jupyter session on the default port 8888. The command also mounts a data folder ($PWD/data) and a notebooks folder ($PWD/notebooks) into the container so that it can access data and Jupyter notebooks stored on the host workstation. For saved data or models to persist after the container shuts down, or to be accessible outside the container at all, they must be saved in these mounted directories. The default port can be changed to any non-reserved port by updating -p 8888:8888 to, e.g., -p 8080:8888. If you run into errors getting started, either refer to the DeepCell-tf for Developers section or raise an issue on GitHub.
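
The port and volume options described above can also be assembled programmatically, which helps when the host port or working directory varies between machines. This is a sketch only: the function `docker_run_command` and its parameters are hypothetical names introduced here, and the image tag is the one used in the command above.

```python
import os
import shlex


def docker_run_command(image, host_port=8888, gpu_devices="0", workdir=None):
    """Build the `docker run` argument list described above.

    host_port is the port on the workstation; the container side stays at
    8888, where the Jupyter session listens.
    """
    workdir = workdir or os.getcwd()
    return [
        "docker", "run",
        # Docker expects the double quotes around device=... literally.
        "--gpus", f'"device={gpu_devices}"',
        "-it", "--rm",
        "-p", f"{host_port}:8888",
        "-v", f"{workdir}/notebooks:/notebooks",
        "-v", f"{workdir}/data:/data",
        image,
    ]


if __name__ == "__main__":
    cmd = docker_run_command("vanvalenlab/deepcell-tf:0.9.0-gpu", host_port=8080)
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Only the host side of the port mapping changes; the container side stays at 8888 because that is where the Jupyter session listens.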

For examples of how to train models with the deepcell-tf library, check out the example notebooks in the repository.

DeepCell Applications and DeepCell Datasets

deepcell-tf contains two modules that greatly simplify the development and usage of deep learning models for single-cell analysis. The first is deepcell.datasets, which is a collection of biological images that have single-cell annotations. These data include live-cell imaging movies of fluorescent nuclei (approximately 10,000 single-cell trajectories over 30 frames), as well as static images of whole cells (both phase and fluorescence images; approximately 75,000 single-cell annotations). The second is deepcell.applications, which contains pre-trained models (fluorescent nuclear and phase/fluorescent whole cell) for single-cell analysis. Provided the data is scaled so that the physical size of each pixel matches that of the training dataset, these models can be used out of the box on live-cell imaging data. We are currently working to expand these modules to include data and models for tissue images. Please note that they may be spun off into their own GitHub repositories in the near future.
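
The pixel-size requirement above amounts to a simple ratio. As a rough sketch, assuming both resolutions are expressed in microns per pixel, the resize factor and target image shape could be computed as follows. The function names are hypothetical, and the 0.65 value in the usage example is a placeholder rather than a documented training resolution.

```python
def scale_factor(image_mpp, model_mpp):
    """Factor by which to resize an image so its pixels match the model's.

    Both arguments are physical pixel sizes in microns per pixel. A factor
    above 1 means the image must be enlarged; below 1 means it must shrink.
    """
    if image_mpp <= 0 or model_mpp <= 0:
        raise ValueError("pixel sizes must be positive")
    return image_mpp / model_mpp


def rescaled_shape(shape, image_mpp, model_mpp):
    """Target (rows, cols) after resizing to the model's pixel size."""
    factor = scale_factor(image_mpp, model_mpp)
    rows, cols = shape
    return (max(1, round(rows * factor)), max(1, round(cols * factor)))


if __name__ == "__main__":
    # Placeholder values: an image at 1.3 um/pixel and a model assumed to be
    # trained at 0.65 um/pixel, so each axis needs twice as many pixels.
    print(rescaled_shape((512, 512), image_mpp=1.3, model_mpp=0.65))
```

An image with coarser pixels than the training data is enlarged, and one with finer pixels is shrunk, so that a cell spans roughly the same number of pixels the model saw during training.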

DeepCell-tf for Developers

deepcell-tf uses nvidia-docker and tensorflow to enable GPU processing. If using GCP, there are pre-built images which come with CUDA, docker, and nvidia-docker pre-installed. Otherwise, you will need to install docker, nvidia-docker, and CUDA separately. These instructions set up a docker container so that you can directly change the deepcell-tf library itself and have those changes reflected within the container.

Build a local docker container, specifying the tensorflow version with TF_VERSION

git clone https://github.com/vanvalenlab/deepcell-tf.git
cd deepcell-tf
docker build --build-arg TF_VERSION=2.3.1-gpu -t $USER/deepcell-tf .

Run the new docker image

# '"device=0"' refers to the specific GPU(s) to run DeepCell-tf on, and is not required
docker run --gpus '"device=0"' -it \
    -p 8888:8888 \
    $USER/deepcell-tf

It can also be helpful to mount the local copy of the repository and the notebooks to speed up local development. However, if you are going to mount a local version of the repository, you must first run the docker image without the local repository mounted so that the C extensions can be compiled and then copied over to your local version.

# First run the docker image without mounting externally
docker run --gpus '"device=0"' -it \
    -p 8888:8888 \
    $USER/deepcell-tf

# Use ctrl-p, ctrl-q (or ctrl+p+q) to exit the running docker image without shutting it down

# Then, get the container_id corresponding to the running image of DeepCell-tf
container_id=$(docker ps -q --filter ancestor="$USER/deepcell-tf")

# Copy the compiled C extensions into your local version of the codebase:
docker cp "$container_id:/usr/local/lib/python3.6/dist-packages/deepcell/utils/compute_overlap.cpython-36m-x86_64-linux-gnu.so" deepcell/utils/compute_overlap.cpython-36m-x86_64-linux-gnu.so

# Stop the running container
docker kill $container_id

# You can now start the docker image with the code mounted for easy editing
docker run --gpus '"device=0"' -it \
    -p 8888:8888 \
    -v $PWD/deepcell:/usr/local/lib/python3.6/dist-packages/deepcell/ \
    -v $PWD/notebooks:/notebooks \
    -v /$PWD:/data \
    $USER/deepcell-tf
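
The paths in the commands above hard-code Python 3.6 (python3.6 and cpython-36m). If the container is built on a different Python version, the compiled extension's file name will differ. As a hedged sketch, the snippet below asks the running interpreter for the suffix it expects using the standard-library sysconfig module; the helper name `extension_filename` is illustrative and not part of deepcell.

```python
import sysconfig


def extension_filename(module_name):
    """File name the running interpreter expects for a compiled extension."""
    suffix = sysconfig.get_config_var("EXT_SUFFIX")
    if suffix is None:
        raise RuntimeError("this interpreter does not report EXT_SUFFIX")
    return module_name + suffix


if __name__ == "__main__":
    # Run inside the container to learn which file to pass to `docker cp`.
    print(extension_filename("compute_overlap"))
```

Running this inside the container, before detaching from it, gives the file name to substitute into the docker cp command.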

How to Cite

Copyright

Copyright © 2016-2021 The Van Valen Lab at the California Institute of Technology (Caltech), with support from the Shurl and Kay Curci Foundation, Google Research Cloud, the Paul Allen Family Foundation, & National Institutes of Health (NIH) under Grant U24CA224309-01. All rights reserved.

License

This software is licensed under a modified Apache License 2.0. See LICENSE for full details.

Trademarks

All other trademarks referenced herein are the property of their respective owners.

Credits

Van Valen Lab, Caltech