workflows-notes

Publications

User interface solutions for designing workflows

Workflow engines for Bioinformatics

Repositories

Workflow collections

Others

Tutorials

Why use a workflow manager?

  • Description of the input data
  • Parameters for each individual tool
  • Tracking of tool versions
  • Execution reports with detailed information
  • The execution environment (Conda environments, Singularity or Docker containers)
  • The software version of the workflow manager itself (which can also be provided in a container)
  • Resource usage information (amount of memory, execution time, number of CPUs)
  • Automatic visualization of the pipeline steps
  • Archiving: the workflow itself can be publicly archived and made citable by obtaining a version-specific digital object identifier (DOI) through Zenodo
  • Scalability: processing adapts to the size of the data and the resources available
  • Re-entrancy: "Workflow managers can handle such events by enabling re-entrancy. Re-entrancy allows users to run a pipeline from its last successfully executed step, rather than from the beginning, in the case of a disruption."
  • Incremental builds: only the steps whose inputs have changed are rerun
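
Re-entrancy and incremental builds can be illustrated with a minimal Python sketch. It assumes a hypothetical two-step pipeline in which every step reads files and writes one output file, and it uses file modification times to decide whether a step can be skipped. The function and file names (run_step, raw.txt, and so on) are invented for illustration and do not correspond to any particular workflow manager.

```python
import os
import tempfile


def is_up_to_date(target, sources):
    """Return True if target exists and is at least as new as every source."""
    if not os.path.exists(target):
        return False
    target_mtime = os.path.getmtime(target)
    return all(os.path.getmtime(src) <= target_mtime for src in sources)


def run_step(name, sources, target, action, log):
    """Run `action` only when `target` is missing or older than a source."""
    if is_up_to_date(target, sources):
        log.append(f"skip {name}")
        return
    action(sources, target)
    log.append(f"run {name}")


def uppercase(sources, target):
    # Hypothetical step 1: upper-case the input text.
    with open(sources[0]) as fin, open(target, "w") as fout:
        fout.write(fin.read().upper())


def count_chars(sources, target):
    # Hypothetical step 2: store the number of characters.
    with open(sources[0]) as fin, open(target, "w") as fout:
        fout.write(str(len(fin.read())))


def run_pipeline(workdir):
    """Run both steps in order and return a log of what was done."""
    raw = os.path.join(workdir, "raw.txt")
    upper = os.path.join(workdir, "upper.txt")
    count = os.path.join(workdir, "count.txt")
    log = []
    run_step("uppercase", [raw], upper, uppercase, log)
    run_step("count", [upper], count, count_chars, log)
    return log


workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "raw.txt"), "w") as fh:
    fh.write("acgt")

first_run = run_pipeline(workdir)   # nothing exists yet: both steps run
second_run = run_pipeline(workdir)  # outputs are current: both steps are skipped
```

If count.txt is lost (for example after an interrupted run), calling run_pipeline again skips the first step and reruns only the second. A production tool would typically also guard against half-written outputs, for instance by writing to a temporary file and renaming it on success; this sketch omits that.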

Non-DSL engines

"While graphical and DSL workflow managers are currently the most widely used frameworks for bioinformatics pipelines, there are some other types of workflow managers, such as programming-library-based tools. Programming-library-based workflow managers implement their pipeline management systems as a programming library for an existing, popular programming language." (ref)

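As a rough illustration of the programming-library-based approach, the sketch below defines a toy pipeline library in plain Python. The Pipeline class, its task decorator, and the three example tasks are all hypothetical and written only for this note; real library-based workflow managers offer far more (scheduling, caching, cluster execution).

```python
class Pipeline:
    """Hypothetical mini-library: tasks are ordinary Python functions."""

    def __init__(self):
        self._tasks = {}

    def task(self, *depends_on):
        """Decorator that registers a function and the tasks it depends on."""
        def register(func):
            self._tasks[func.__name__] = (func, depends_on)
            return func
        return register

    def run(self, target):
        """Run `target` after its dependencies, each task at most once."""
        results = {}

        def resolve(name):
            if name not in results:
                func, deps = self._tasks[name]
                # Results of the dependencies become the function arguments.
                results[name] = func(*(resolve(dep) for dep in deps))
            return results[name]

        return resolve(target)


pipeline = Pipeline()


@pipeline.task()
def load_reads():
    # Stand-in for reading sequences from a file.
    return ["ACGT", "GGCA", "TTAA"]


@pipeline.task("load_reads")
def gc_content(reads):
    # Fraction of G or C bases in each read.
    return [sum(base in "GC" for base in read) / len(read) for read in reads]


@pipeline.task("gc_content")
def mean_gc(values):
    return sum(values) / len(values)


result = pipeline.run("mean_gc")
```

The pipeline is ordinary Python code, so it can use the host language's functions, tests, and tooling directly. Note that this sketch does not detect circular dependencies.
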
"Workflow specifications provide a set of formalized rules for defining computational pipelines. This allows the separation of the pipeline definition from the execution environment, thereby adding another layer of abstraction. Workflow specifications enable the definition of pipelines that can be executed across workflow managers or execution environments that support the standard" (ref)

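The separation between pipeline definition and execution can be sketched as follows. The pipeline is described as plain data (SPEC) and consumed by two independent functions, one that executes it and one that only reports the plan. The dictionary layout, the operation names, and both functions are assumptions made up for this example; they do not follow the syntax of any existing workflow specification.

```python
# Hypothetical declarative description of a two-step pipeline.
SPEC = {
    "inputs": {"sequence": "acgtacgt"},
    "steps": [
        {"id": "upper", "op": "uppercase", "in": "sequence", "out": "upper_seq"},
        {"id": "length", "op": "length", "in": "upper_seq", "out": "n_bases"},
    ],
}

# Operations that an execution backend knows how to perform.
OPERATIONS = {"uppercase": str.upper, "length": len}


def run_local(spec):
    """Executor 1: run every step in-process and return all named values."""
    values = dict(spec["inputs"])
    for step in spec["steps"]:
        operation = OPERATIONS[step["op"]]
        values[step["out"]] = operation(values[step["in"]])
    return values


def dry_run(spec):
    """Executor 2: describe what would be run, without running anything."""
    return [
        f"{step['id']}: {step['op']}({step['in']}) -> {step['out']}"
        for step in spec["steps"]
    ]


outputs = run_local(SPEC)
plan = dry_run(SPEC)
```

Because SPEC contains no execution logic, the same description can be handed to either function unchanged, which is the layer of abstraction the quote refers to.
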
Companies proposing workflow services

Videos