Proxy service for injecting middleware into GA4GH TES requests

proTES

Synopsis

proTES is a robust and scalable Global Alliance for Genomics and Health (GA4GH) Task Execution Service (TES) API gateway. It can augment the capabilities of your GA4GH Cloud ecosystem through flexible middleware injection, which lets you federate atomic, containerized workloads across on-premises, hybrid and multi-cloud environments composed of GA4GH TES nodes.

Description

The proTES gateway may serve as a crucial component in federated compute networks based on the GA4GH Cloud ecosystem. Its primary purpose is to provide centralized features to a federated network of independently operated GA4GH TES instances. As such, it can serve, for example, as a compatibility layer, a load balancer or workload distribution layer, a public entry point to an enclave of independent compute nodes, or a means of collecting telemetry.

When TES requests are received, proTES applies the configured middlewares before forwarding the requests to appropriate TES instances in the network. A plugin system makes it easy to write and inject middlewares tailored to specific requirements, such as access control, request/response processing or validation, or the selection of suitable endpoints considering data use restrictions and client preferences.
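The idea of applying configured middlewares before forwarding can be sketched as a simple chain of callables. This is only an illustration under stated assumptions: the names `TesRequest`, `require_executor`, `add_endpoints` and `apply_middlewares` are hypothetical and do not reflect the actual proTES plugin interface.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# NOTE: every name in this sketch is hypothetical; it illustrates the concept
# of a middleware chain and is not the proTES plugin API.


@dataclass
class TesRequest:
    """Minimal stand-in for an incoming TES task request."""

    body: Dict
    tes_urls: List[str] = field(default_factory=list)


Middleware = Callable[[TesRequest], TesRequest]


def require_executor(request: TesRequest) -> TesRequest:
    """Validation middleware: reject tasks that define no executor."""
    if not request.body.get("executors"):
        raise ValueError("task must define at least one executor")
    return request


def add_endpoints(request: TesRequest) -> TesRequest:
    """Selection middleware: attach candidate TES endpoints (made-up URLs)."""
    request.tes_urls = ["https://tes-a.example.org", "https://tes-b.example.org"]
    return request


def apply_middlewares(request: TesRequest, middlewares: List[Middleware]) -> TesRequest:
    """Run the configured middlewares in order and return the final request."""
    for middleware in middlewares:
        request = middleware(request)
    return request


task = TesRequest(body={"executors": [{"image": "alpine", "command": ["echo", "hi"]}]})
processed = apply_middlewares(task, [require_executor, add_endpoints])
print(processed.tes_urls)
```

In this sketch the order of the list determines the order of execution, so a validation step can reject a request before any endpoint selection takes place.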

Built-in middleware plugins

Currently, there are two plugins shipped with proTES that each serve as proof-of-concept examples for different task distribution scenarios:

  • Load balancing: The pro_tes.middleware.task_distribution.random plugin distributes workloads randomly (and thus only approximately evenly) across a network of TES endpoints.
  • Bringing compute to the data: The pro_tes.middleware.task_distribution.distance plugin selects the TES endpoints to which incoming requests are relayed such that the distance the input data of a task has to travel across the network of TES endpoints is minimized.
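The two distribution strategies can be illustrated with a short sketch. It assumes that "distance" means geographic (great-circle) distance between known coordinates; the shipped plugins may determine locations and distances differently. All endpoint URLs, coordinates and function names below are hypothetical.

```python
import math
import random
from typing import Dict, List, Tuple

# NOTE: URLs, coordinates and function names are hypothetical. This is a
# conceptual sketch, not the code of the shipped plugins.

Coordinates = Tuple[float, float]  # (latitude, longitude) in degrees


def random_order(tes_urls: List[str], rng: random.Random) -> List[str]:
    """Return the endpoints in random order without modifying the input."""
    return rng.sample(tes_urls, k=len(tes_urls))


def haversine_km(a: Coordinates, b: Coordinates) -> float:
    """Great-circle distance between two points in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def order_by_distance(
    endpoints: Dict[str, Coordinates], inputs: List[Coordinates]
) -> List[str]:
    """Sort endpoints by the total distance to all input data locations."""

    def total_distance(url: str) -> float:
        return sum(haversine_km(endpoints[url], location) for location in inputs)

    return sorted(endpoints, key=total_distance)


endpoints = {
    "https://tes-helsinki.example.org": (60.17, 24.94),
    "https://tes-basel.example.org": (47.56, 7.59),
}
input_locations = [(46.52, 6.63)]  # input data assumed to be stored near Lausanne

print(random_order(list(endpoints), random.Random()))
print(order_by_distance(endpoints, input_locations))
```

Returning a ranked list rather than a single endpoint leaves room for falling back to the next candidate if the first one is unavailable; whether proTES does this is not stated here.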

Implementation notes

proTES is a Flask microservice that supports OAuth2-based authorization out of the box (bearer authentication) and stores information about incoming and outgoing tasks in a NoSQL database (MongoDB). Based on our FOCA microservice archetype, it is highly configurable in a declarative (YAML-based) manner. Forwarded tasks are tracked asynchronously via a RabbitMQ broker and Celery workers that can be easily scaled up. Both a Helm chart and a Docker Compose configuration are provided for easy deployment in cloud-native production and development environments, respectively.

Figure: proTES overview
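To give a rough idea of what tracking a forwarded task involves, the following sketch polls a task's state until it reaches a terminal state. It deliberately uses a plain loop instead of Celery and RabbitMQ, and `track_task` is a hypothetical helper, so it should be read as a conceptual stand-in rather than the actual implementation. The terminal states are those of the TES v1.0 specification.

```python
import time
from typing import Callable

# Terminal task states according to the TES v1.0 specification.
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def track_task(
    get_state: Callable[[], str], interval: float = 1.0, max_polls: int = 100
) -> str:
    """Poll a task until it reaches a terminal state; return the last state seen.

    `get_state` is a hypothetical callable that would query the remote TES
    instance (for example via GET /tasks/{id}) and return the task state.
    """
    state = "UNKNOWN"
    for _ in range(max_polls):
        state = get_state()
        if state in TERMINAL_STATES:
            break
        time.sleep(interval)
    return state


# Simulated remote TES instance that advances one state per poll.
states = iter(["QUEUED", "INITIALIZING", "RUNNING", "COMPLETE"])
final_state = track_task(lambda: next(states), interval=0.0)
print(final_state)
```

Running such loops in background workers rather than in the request handler is what allows the gateway to respond immediately while tasks continue on the remote TES instances.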

Installation

For production-grade Kubernetes-based deployment, see separate instructions. For testing/development purposes, you can use the instructions described below.

Requirements

Ensure you have the following software installed:

Note: These indicated versions are those that were used for developing/testing. Other versions may or may not work.

Prerequisites

Create data directory and required subdirectories

export PROTES_DATA_DIR=/path/to/data/directory
mkdir -p "$PROTES_DATA_DIR"/{db,specs}

Note: If the PROTES_DATA_DIR environment variable is not set, proTES will require the following default directories to be available:

  • ../data/pro_tes/db
  • ../data/pro_tes/specs
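The fallback described in the note can be checked before deployment with a few lines of Python. This is a minimal sketch: `resolve_data_dirs` is a hypothetical helper that is not part of proTES, and the relative default path is assumed to be resolved from the directory in which the script is run.

```python
import os
from pathlib import Path
from typing import Dict, Mapping


def resolve_data_dirs(env: Mapping[str, str]) -> Dict[str, Path]:
    """Return the expected db and specs directories.

    Uses PROTES_DATA_DIR if it is set, otherwise the default base directory
    ../data/pro_tes. (Hypothetical helper, not part of proTES.)
    """
    base = Path(env.get("PROTES_DATA_DIR", "../data/pro_tes"))
    return {name: base / name for name in ("db", "specs")}


# Report whether each required directory exists.
for name, path in resolve_data_dirs(os.environ).items():
    status = "ok" if path.is_dir() else "missing"
    print(f"{name}: {path} ({status})")
```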

Clone repository

git clone https://github.com/elixir-europe/proTES.git

Change to app directory

cd proTES

Configure (optional)

The following user-configurable files are available:

Deploy

Build/pull and run services

docker-compose up -d --build

Visit Swagger UI

firefox http://localhost:8080/ga4gh/tes/v1/ui

Note: Host and port may differ if you have changed the configuration or use an HTTP server to reroute calls to a different host.
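Besides the Swagger UI, the running service can be exercised programmatically. The sketch below builds a task submission request using only the Python standard library. It assumes that the API is served under the same base path as the Swagger UI (/ga4gh/tes/v1), that the task body follows the TES schema with a single executor, and that a bearer token is needed only if authorization is enabled; the token value and the helper `build_task_request` are placeholders.

```python
import json
import urllib.request

# Assumption: API base path matches the Swagger UI location shown above.
BASE_URL = "http://localhost:8080/ga4gh/tes/v1"


def build_task_request(base_url: str, token: str) -> urllib.request.Request:
    """Build a POST request that submits a minimal TES task (hypothetical helper)."""
    task = {
        "name": "hello-world",
        "executors": [{"image": "alpine:3", "command": ["echo", "hello"]}],
    }
    return urllib.request.Request(
        url=f"{base_url}/tasks",
        data=json.dumps(task).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Placeholder token; relevant only if authorization is enabled.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


request = build_task_request(BASE_URL, token="<access-token>")
print(request.get_method(), request.full_url)

# Uncomment once the services are up to actually submit the task:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The request is only sent once the last lines are uncommented, so the snippet can be run safely before the services are started.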

Contributing

This project is a community effort and lives off your contributions, be it in the form of bug reports, feature requests, discussions, ideas, fixes, or other code changes. Please read these guidelines if you want to contribute. And please mind the code of conduct for all interactions with the community.

Versioning

The project adopts the semantic versioning scheme. The service is currently in beta, so the API may change and even break without further notice. However, once we deem the service stable and "feature complete", the major, minor and patch version will shadow the supported TES version, with the build version representing proTES-internal updates.

License

This project is covered by the Apache License 2.0, which is also shipped with this repository.

Contact

proTES is part of ELIXIR Cloud & AAI, a multinational effort at establishing and implementing FAIR data sharing and promoting reproducible data analyses and responsible data handling in the life sciences.

If you have suggestions for or find an issue with this app, please use the issue tracker. If you would like to reach out to us for anything else, you can join our Slack workspace, start a thread in our Q&A forum, or send us an email.