
Unofficial genomics resources for Azure

MIT License

Genomics on Azure

Cromwell

TODO

Microsoft Genomics

TODO

Snakemake

Snakemake can be run as-is on a single large Azure VM or on an HPC cluster deployment such as Azure CycleCloud.

Commercial support

Snakemake is commercially supported through BizData. The service is called Genomics Pipeline Acceleration OnDemand and runs on Azure Batch Shipyard. It requires only minimal modification of your Snakefile. It was featured as a Microsoft customer story.

Azure Kubernetes Service (AKS)

Snakemake can be run on AKS without the need for a shared filesystem (persistent volume). Input and output data for each step are staged automatically in from and out to Azure Blob storage, which makes this approach cheap and resilient.

This has not yet been tested at scale or with low-priority VMs.
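As a rough illustration of the setup described above, the following sketch assembles a Snakemake command line for a Kubernetes run with Blob storage staging. It assumes a Snakemake release that still provides the `--kubernetes` and `--default-remote-provider AzBlob` options (newer releases may organise this differently), and it assumes the storage credentials are exported as `AZ_BLOB_ACCOUNT_URL` and `AZ_BLOB_CREDENTIAL`. The helper name `snakemake_aks_command` and the container name are hypothetical.

```python
import shlex


def snakemake_aks_command(container, jobs=3, use_conda=True):
    """Build an argument list for a Snakemake run on Kubernetes.

    `container` is the Blob storage container used as the remote prefix
    (a hypothetical name chosen by the user). The flags are assumptions
    based on Snakemake's Kubernetes and AzBlob remote provider options.
    """
    cmd = [
        "snakemake",
        "--kubernetes",
        # Stage all inputs and outputs through Azure Blob storage.
        "--default-remote-provider", "AzBlob",
        "--default-remote-prefix", container,
        # Forward the storage credentials to the worker pods.
        "--envvars", "AZ_BLOB_ACCOUNT_URL", "AZ_BLOB_CREDENTIAL",
        "--jobs", str(jobs),
    ]
    if use_conda:
        cmd.append("--use-conda")
    return cmd


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(snakemake_aks_command("snakemake-tutorial")))
```

The returned list can be passed to `subprocess.run` once `kubectl` points at the AKS cluster and the two environment variables are set.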

References:

Azure Batch

Azure Batch support is a work in progress. At the time of writing, use of spot instances, job placement, and autoscaling all still needed testing or implementation.

References:

Nextflow

Nextflow can be run as-is on a single large Azure VM or on an HPC cluster deployment such as Azure CycleCloud.

Commercial support

Nextflow is commercially supported through Seqera Labs, which was founded by Nextflow's authors. However, Azure support is not currently part of their offering.

Azure Kubernetes Service (AKS)

Nextflow can run on AKS using Nextflow's Kubernetes executor (kuberun). See this blog post for more info. This approach uses a persistent volume (Azure Files) as the shared filesystem. Note that the blog post is based on an older version of Nextflow, so some details might have changed in the meantime.
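A minimal sketch of what a `kuberun` invocation might look like, assuming the `nextflow kuberun <pipeline> -v <claim>:<mount path>` form and a persistent volume claim that already exists in the cluster and is backed by Azure Files. The helper name `nextflow_kuberun_command`, the claim name, and the mount path are hypothetical, and the exact options may differ between Nextflow versions.

```python
import shlex


def nextflow_kuberun_command(pipeline, volume_claim, mount_path="/workspace"):
    """Build an argument list for launching a pipeline with `nextflow kuberun`.

    `volume_claim` is the name of an existing persistent volume claim
    (hypothetical here) that serves as the shared filesystem, and
    `mount_path` is where it is mounted inside the pods.
    """
    return [
        "nextflow",
        "kuberun",
        pipeline,
        # Mount the shared volume claim at the given path.
        "-v", f"{volume_claim}:{mount_path}",
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(nextflow_kuberun_command("nextflow-io/hello", "azurefiles-pvc")))
```

The resulting list can be handed to `subprocess.run` from a machine whose `kubectl` context points at the AKS cluster.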

Azure Batch

Nextflow does not currently support Azure Batch. The blocker was a missing feature in the Azure Java SDK (an NIO FileSystemProvider), which has been implemented as of July 2020. See the references for more information and the current status.

References:

Apache Ignite

Nextflow can be run on Azure using Apache Ignite with this deployment template.

TODO: status?

References: