APE (Automated Pipeline Explorer)

APE is a command line tool and Java API for the automated exploration of possible computational pipelines (scientific workflows) from large collections of computational tools. Generated workflows can be exported in CWL format, as well as in graphical (PNG, SVG) formats.

APE relies on a semantic domain model that includes tool and type taxonomies as controlled vocabularies for the description of computational tools, and functional tool annotations (inputs, outputs, operations performed) using terms from these taxonomies. Based on this domain model and a specification of the available workflow inputs, the intended workflow outputs and possibly additional constraints, APE then computes possible workflows.
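
To make the idea of a semantic domain model more concrete, here is a minimal sketch in plain Java. It is not APE's API: the classes `Taxonomy` and `ToolAnnotation`, the method `isApplicable`, and the example terms (`Data`, `Image`, `PNG`, `Text`, `blur`) are all hypothetical and exist only for illustration. The sketch shows a type taxonomy as a controlled vocabulary and a tool annotated with terms from it.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only: these classes are hypothetical and not part of the APE API. */
class DomainModelSketch {

    /** A taxonomy stored as child -> parent links (a controlled vocabulary of terms). */
    static class Taxonomy {
        private final Map<String, String> parentOf = new HashMap<>();

        void addTerm(String term, String parent) {
            parentOf.put(term, parent);
        }

        /** True if 'term' equals 'ancestor' or lies below it in the taxonomy. */
        boolean isSubsumedBy(String term, String ancestor) {
            for (String t = term; t != null; t = parentOf.get(t)) {
                if (t.equals(ancestor)) {
                    return true;
                }
            }
            return false;
        }
    }

    /** A functional tool annotation: operation performed, input types, output types. */
    static class ToolAnnotation {
        final String name;
        final String operation;
        final List<String> inputs;
        final List<String> outputs;

        ToolAnnotation(String name, String operation, List<String> inputs, List<String> outputs) {
            this.name = name;
            this.operation = operation;
            this.inputs = inputs;
            this.outputs = outputs;
        }
    }

    /** A tool is applicable if every input it requires is covered by some available type. */
    static boolean isApplicable(ToolAnnotation tool, List<String> available, Taxonomy types) {
        for (String required : tool.inputs) {
            boolean covered = false;
            for (String have : available) {
                if (types.isSubsumedBy(have, required)) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                return false;
            }
        }
        return true;
    }

    /** A made-up type taxonomy: Data is the root; PNG and JPG are kinds of Image. */
    static Taxonomy exampleTypes() {
        Taxonomy types = new Taxonomy();
        types.addTerm("Image", "Data");
        types.addTerm("Text", "Data");
        types.addTerm("PNG", "Image");
        types.addTerm("JPG", "Image");
        return types;
    }

    public static void main(String[] args) {
        Taxonomy types = exampleTypes();
        ToolAnnotation blur = new ToolAnnotation(
                "blur", "Filtering", Arrays.asList("Image"), Arrays.asList("Image"));
        System.out.println(isApplicable(blur, Arrays.asList("PNG"), types));  // true: PNG is an Image
        System.out.println(isApplicable(blur, Arrays.asList("Text"), types)); // false: Text is not an Image
    }
}
```

In APE itself the taxonomies come from an OWL ontology and the annotations from a JSON file; the in-memory maps above merely stand in for those files.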

Internally, APE uses a component-based program synthesis approach. It translates the domain knowledge and workflow specification into logical formulas that are then fed to a SAT solver to compute satisfying instances. These solutions are then translated into the actual candidate workflows. For a detailed description, we refer to [1].
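
The following toy example illustrates the general idea of synthesis via SAT; it is not APE's actual encoding, which is described in [1]. Under simplifying assumptions (a linear workflow of fixed length, one data item passed from step to step, three made-up tools), it translates a tiny domain into CNF clauses and enumerates the satisfying assignments by brute force, where a real implementation would call a SAT solver. All names (`ToySatSynthesis`, `convert`, `blur`, `annotate`, the type names) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration only; this is neither APE's encoding nor its SAT solver. */
class ToySatSynthesis {
    static final String[] TYPES = {"Raw", "Image", "Labelled"};
    static final String[] TOOLS = {"convert", "blur", "annotate"};
    static final int[] IN  = {0, 1, 1};  // input type of each tool (index into TYPES)
    static final int[] OUT = {1, 1, 2};  // output type of each tool (index into TYPES)

    // Variable numbering (1-based, DIMACS style): first the "tool t runs at step s"
    // variables, then the "the data has type y after s steps" variables.
    static int use(int s, int t) { return 1 + s * TOOLS.length + t; }
    static int state(int n, int s, int y) { return 1 + n * TOOLS.length + s * TYPES.length + y; }

    /** Adds clauses stating that exactly one of the given variables is true. */
    static void exactlyOne(List<int[]> cnf, int[] vars) {
        cnf.add(vars.clone());                                // at least one
        for (int i = 0; i < vars.length; i++) {
            for (int j = i + 1; j < vars.length; j++) {
                cnf.add(new int[] {-vars[i], -vars[j]});      // at most one
            }
        }
    }

    /** Encodes "reach type 'goal' from type 'start' in exactly n tool steps" as CNF. */
    static List<int[]> encode(int n, int start, int goal) {
        List<int[]> cnf = new ArrayList<>();
        for (int s = 0; s < n; s++) {
            int[] tools = new int[TOOLS.length];
            for (int t = 0; t < TOOLS.length; t++) {
                tools[t] = use(s, t);
                cnf.add(new int[] {-use(s, t), state(n, s, IN[t])});       // tool needs its input
                cnf.add(new int[] {-use(s, t), state(n, s + 1, OUT[t])});  // tool yields its output
            }
            exactlyOne(cnf, tools);
        }
        for (int s = 0; s <= n; s++) {
            int[] types = new int[TYPES.length];
            for (int y = 0; y < TYPES.length; y++) {
                types[y] = state(n, s, y);
            }
            exactlyOne(cnf, types);
        }
        cnf.add(new int[] {state(n, 0, start)});  // workflow input
        cnf.add(new int[] {state(n, n, goal)});   // intended workflow output
        return cnf;
    }

    /** True if the assignment (bit i-1 holds the value of variable i) satisfies every clause. */
    static boolean satisfies(List<int[]> cnf, long bits) {
        for (int[] clause : cnf) {
            boolean clauseHolds = false;
            for (int lit : clause) {
                boolean value = ((bits >> (Math.abs(lit) - 1)) & 1L) == 1L;
                if (value == (lit > 0)) {
                    clauseHolds = true;
                    break;
                }
            }
            if (!clauseHolds) {
                return false;
            }
        }
        return true;
    }

    /** Brute-force model enumeration; each model is translated back into a tool sequence. */
    static List<List<String>> solve(int n, int start, int goal) {
        List<int[]> cnf = encode(n, start, goal);
        int numVars = n * TOOLS.length + (n + 1) * TYPES.length;
        List<List<String>> workflows = new ArrayList<>();
        for (long bits = 0; bits < (1L << numVars); bits++) {
            if (!satisfies(cnf, bits)) {
                continue;
            }
            List<String> workflow = new ArrayList<>();
            for (int s = 0; s < n; s++) {
                for (int t = 0; t < TOOLS.length; t++) {
                    if (((bits >> (use(s, t) - 1)) & 1L) == 1L) {
                        workflow.add(TOOLS[t]);
                    }
                }
            }
            workflows.add(workflow);
        }
        return workflows;
    }

    public static void main(String[] args) {
        System.out.println(solve(2, 0, 2));  // [[convert, annotate]]
        System.out.println(solve(3, 0, 2));  // [[convert, blur, annotate]]
    }
}
```

The three stages mirror the description above: `encode` produces the formulas, the enumeration loop plays the role of the solver, and the decoding of the `use` variables turns each satisfying instance back into a candidate workflow.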

For our paper at ICCS 2020 [2] we created a video that explains APE in 5 minutes:

APE - Youtube video

For detailed information please visit our page.

Requirements

To run APE locally you need to have Java 1.8 (or higher) installed on your system (use the command $ java -version to check your local version).

To build APE from source, Maven 3.3+ has to be installed as well (use the command $ mvn -version to check your local version).

Note: Building APE from source is not required to run it, as the latest stable version is available from the Maven repository.

Releases

Add APE to your Maven project

To add a dependency on APE using Maven, use the following:

<!-- https://mvnrepository.com/artifact/io.github.sanctuuary/APE -->
<dependency>
    <groupId>io.github.sanctuuary</groupId>
    <artifactId>APE</artifactId>
    <version>2.x.x</version>
</dependency>

For information regarding Gradle, Ivy, etc. we refer to the APE mvn repository.

Manually download releases

Date Version Download
15-07-2020 1.0.1 jar, executable, javadoc, sources
02-05-2021 1.1.7 jar, executable, javadoc, sources
20-12-2021 1.1.12 jar, executable, javadoc, sources
17-05-2022 2.0.0 jar, executable, javadoc, sources
19-02-2024 2.3.0 jar, executable, javadoc, sources

Build APE from source (using Maven)

From the project root, simply launch

$ mvn -DskipTests=true install

to build the APE modules from the source tree. The built files are generated under the /target directory. Maven gathers all dependencies and generates the following stand-alone module: APE-[latest]-executable.jar

Using APE

Automated workflow composition with APE can be performed through its command line interface (CLI) or its application programming interface (API). While the CLI provides a simple means to interact and experiment with the system, the API provides more flexibility and control over the synthesis process. It can be used to integrate APE’s functionality into other systems.

How to run APE from the command line

APE-[latest]-executable.jar is available from the Maven repository.

When run from the command line, APE-[latest]-executable.jar requires a JSON configuration file as a parameter and executes the automated workflow composition process accordingly:

java -jar APE-[latest]-executable.jar [path-to-ape-configuration]

The configuration file (see APE configuration example and APE configuration documentation) references all the information required for this:

  1. Domain model - classification of the types and operations in the domain in the form of an ontology (see ontology example in OWL) and a tool annotation file (see tool annotations example in JSON).
  2. Workflow specification - including a list of workflow inputs/outputs and template-based (see constraint templates) workflow constraints (see workflow constraints example).
  3. Parameters for the synthesis execution, such as the number of desired solutions, output directory, system configurations, etc. (see APE configuration documentation).

My first APE run

git clone git@github.com:sanctuuary/APE_UseCases.git

or

git clone https://github.com/sanctuuary/APE_UseCases.git

Download the latest version of APE-[latest]-executable.jar and add it to the APE_UseCases directory (~/git/APE_UseCases).

cd ~/git/APE_UseCases
java -jar APE-[latest]-executable.jar ImageMagick/Example1/config.json

See ImageMagick: Example 1 for more information about the results and on how to execute the composed workflow.
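
The first run above can also be scripted from Java, for example in a test harness. The sketch below is a hypothetical helper, not part of APE: the class `FirstApeRun` and its methods are made up for illustration. It looks for a file matching `APE-*-executable.jar` in the use cases directory and assembles the same `java -jar` command, using only the JDK's `ProcessBuilder` and NIO file APIs.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper, not part of APE: scripts the "first run" steps from Java. */
class FirstApeRun {

    /** Returns the first file named like APE-<version>-executable.jar directly inside 'dir'. */
    static Optional<Path> findExecutableJar(Path dir) throws IOException {
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(dir, "APE-*-executable.jar")) {
            for (Path jar : jars) {
                return Optional.of(jar);
            }
        }
        return Optional.empty();
    }

    /** Builds the command line: java -jar <absolute path to jar> <config path>. */
    static List<String> buildCommand(Path jar, String configPath) {
        return Arrays.asList("java", "-jar", jar.toAbsolutePath().toString(), configPath);
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Directory holding the cloned use cases and the downloaded jar (default: current directory).
        Path useCases = Paths.get(args.length > 0 ? args[0] : ".");
        Optional<Path> jar = findExecutableJar(useCases);
        if (!jar.isPresent()) {
            System.out.println("No APE-*-executable.jar found in " + useCases.toAbsolutePath());
            return;
        }
        List<String> command = buildCommand(jar.get(), "ImageMagick/Example1/config.json");
        Process ape = new ProcessBuilder(command)
                .directory(useCases.toFile())  // the config path is relative to this directory
                .inheritIO()                   // forward APE's console output
                .start();
        System.out.println("APE exited with code " + ape.waitFor());
    }
}
```

The jar path is made absolute because the child process runs with the use cases directory as its working directory, so a relative jar path could otherwise resolve to the wrong location.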

More examples

The use cases repository contains all the details and examples regarding the composition setup and the composition execution for the existing use cases (such as the composition of ImageMagick operations).

How to use the APE API

Like the CLI, the APE API relies on a configuration file that references the domain ontology, tool annotations, workflow specification and execution parameters:

// set up the framework
APE ape = new APE("path/to/setup-configuration.json");

// run the synthesis
SATsolutionsList solutions = ape.runSynthesis("path/to/run-configuration.json");
// write the solutions to the file system
APE.writeSolutionToFile(solutions);
APE.writeDataFlowGraphs(solutions, RankDir.TOP_TO_BOTTOM);
APE.writeExecutableWorkflows(solutions);

However, the API also allows you to generate and edit the configuration programmatically:

// set up the framework
APECoreConfig coreConfig = new APECoreConfig(...);
APE ape = new APE(coreConfig);

// run the synthesis
APERunConfig runConfig = APERunConfig.builder().withSolutionMinLength(1).withSolutionMaxLength(10)
                                                .withMaxNoSolutions(100).withApeDomainSetup(ape.getDomainSetup())
                                                .build();
SATsolutionsList solutions1 = ape.runSynthesis(runConfig);

// run the synthesis again with altered parameters
runConfig.setUseWorkflowInput(ConfigEnum.ONE);
SATsolutionsList solutions2 = ape.runSynthesis(runConfig);

For more information, see the APE javadoc.io page.

APE v2 architecture

The architecture of the APE v2 library is presented in the following figure. Components coloured light blue extend existing components in the APE v1 framework; dark blue components are new modules.

APE 2.0 Architecture

APE Web

A graphical web interface for the APE library is available at APE Web.

Use Cases and Demos

Our use cases are motivated by practical problems in various domains (e.g. bioinformatics, GIS [3]). Different examples are available at the APE Use Cases Repository.

The APE team

  • Vedran Kasalica (v.kasalica[at]esciencecenter.nl), lead developer
  • Maurin Voshol, student developer
  • Koen Haverkort, student developer
  • Anna-Lena Lamprecht (anna-lena.lamprecht[at]uni-potsdam.de), project initiator and principal investigator

Contact

For any questions concerning APE, please get in touch with Vedran Kasalica (v.kasalica[at]esciencecenter.nl).

Contributions

We welcome all contributions (bug reports, bug fixes, feature requests, extensions, use cases, etc.) to APE. Please get in touch with Vedran Kasalica (v.kasalica[at]esciencecenter.nl) to coordinate your contribution. We expect all contributors to follow our Code of Conduct.

If you have a specific request, want to report a bug, or want to suggest a new constraint template, please open an issue here.

Credits

APE has been inspired by the Loose Programming framework PROPHETS. It uses similar mechanisms for semantic domain modelling, workflow specification and synthesis, but strives to provide the automated exploration and composition functionality independent from a concrete workflow system.

We thank our brave first-generation users for their patience and constructive feedback that helped us to get APE into shape.

License

APE is licensed under the Apache 2.0 license.

Maven dependencies

  1. OWL API - LGPL or Apache 2.0
  2. SAT4J - EPL or GNU LGPL
  3. apache-common-lang - Apache 2.0
  4. graphviz-java - Apache 2.0
  5. org.json - JSON license

References

[1] Kasalica, V., & Lamprecht, A.-L. (2020). Workflow Discovery with Semantic Constraints: The SAT-Based Implementation of APE. Electronic Communications of the EASST, 78(0). https://doi.org/10.14279/tuj.eceasst.78.1092

[2] Kasalica, V., & Lamprecht, A.-L. (2020). APE: A Command-Line Tool and API for Automated Workflow Composition. ICCS 2020. Lecture Notes in Computer Science, vol 12143. Springer. https://doi.org/10.1007/978-3-030-50436-6_34

[3] Kasalica, V., & Lamprecht, A.-L. (2019). Workflow discovery through semantic constraints: A geovisualization case study. In Computational science and its applications – ICCSA 2019 (pp. 473–488). Springer International Publishing. https://doi.org/10.1007/978-3-030-24302-9_53