
Spring Cloud Data Flow

The Spring Cloud Data Flow project provides orchestration for data microservices, including long-lived stream applications and short-lived task applications.

Components

The Core domain module defines two concepts. A stream is a composition of spring-cloud-stream modules arranged in a linear pipeline from a source to a sink, optionally with one or more processor modules in between. A task is any process that does not run indefinitely, including Spring Batch jobs.

The App Registry maintains the set of available apps and their mappings to URIs. For example, when relying on Maven coordinates, an app’s URI has the format maven://<groupId>:<artifactId>:<version>
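As a rough illustration of the URI format above, the following sketch builds a maven:// URI from coordinates and splits one back into its parts. The function names and the com.example coordinates are hypothetical and are not part of Spring Cloud Data Flow; the sketch only handles the three-part form shown above.

```shell
#!/bin/sh
# Build an app URI from Maven coordinates, using the format described above.
# maven_uri is a hypothetical helper, not a Data Flow command.
maven_uri() {
  printf 'maven://%s:%s:%s\n' "$1" "$2" "$3"
}

# Split a maven:// URI into groupId, artifactId and version, one per line.
# Returns non-zero unless the URI has exactly three colon-separated parts.
# The body runs in a subshell so the IFS and globbing changes do not leak.
parse_maven_uri() (
  case "$1" in
    maven://*) ;;
    *) return 1 ;;
  esac
  coords=${1#maven://}
  set -f            # disable globbing while splitting
  IFS=:
  set -- $coords    # split the coordinates on ':'
  [ "$#" -eq 3 ] || return 1
  printf '%s\n' "$1" "$2" "$3"
)

# Hypothetical coordinates, for illustration only.
maven_uri com.example demo-sink 1.0.0
```

Calling parse_maven_uri on the URI printed by the last line should give back the three coordinates, one per line.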

The Data Flow Server is a Spring Boot application that provides a common REST API and UI. Each runtime environment has its own version of the Data Flow Server, which depends on a deployer SPI implementation for that environment. Each of these Data Flow Servers is hosted in its own GitHub repository.

The deployer SPI mentioned above is defined in the Spring Cloud Deployer project, which provides an abstraction layer for deploying the apps of a given stream or task and managing their lifecycle. Each of the corresponding Spring Cloud Deployer SPI implementations is also hosted in its own GitHub repository.

The Shell connects to the Data Flow Server’s REST API and supports a DSL that simplifies the process of defining a stream or task and managing its lifecycle.

Instructions for running the Data Flow Server in each runtime environment can be found in the GitHub repository of the corresponding server.

Contributing

We love contributions. Follow this link for more information on how to contribute.

Building

Clone the repo and run:

$ ./mvnw clean install

If you are running behind a proxy, specify the proxy and repository settings through environment variables such as the following (choose the ones that are appropriate for your setup):

export MAVEN_LOCAL_REPOSITORY=mylocalMavenRepo
export MAVEN_REMOTE_REPOSITORIES=repo1,repo2
export MAVEN_OFFLINE=true
export MAVEN_PROXY_PROTOCOL=https
export MAVEN_PROXY_HOST=host1
export MAVEN_PROXY_PORT=8090
export MAVEN_PROXY_NON_PROXY_HOSTS='host2|host3'
export MAVEN_PROXY_AUTH_USERNAME=user1
export MAVEN_PROXY_AUTH_PASSWORD=passwd
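When a build behind a proxy misbehaves, it can help to confirm which of these variables are actually set in the current shell. The following is a minimal sketch of such a check; show_maven_env is a hypothetical helper and not part of the project's build, and it only knows about the variable names listed above.

```shell
#!/bin/sh
# Print the MAVEN_* settings from the list above that are currently set.
# The password is masked so the output is safe to share.
# The body runs in a subshell so the loop variables do not leak.
show_maven_env() (
  for name in MAVEN_LOCAL_REPOSITORY MAVEN_REMOTE_REPOSITORIES MAVEN_OFFLINE \
              MAVEN_PROXY_PROTOCOL MAVEN_PROXY_HOST MAVEN_PROXY_PORT \
              MAVEN_PROXY_NON_PROXY_HOSTS MAVEN_PROXY_AUTH_USERNAME \
              MAVEN_PROXY_AUTH_PASSWORD; do
    # Indirect lookup of the variable called $name (names are fixed above).
    eval "value=\${$name-}"
    [ -n "$value" ] || continue   # skip unset or empty variables
    case "$name" in
      *PASSWORD) value='****' ;;  # never echo the real password
    esac
    printf '%s=%s\n' "$name" "$value"
  done
)
```

Run show_maven_env after the export lines and before ./mvnw clean install; any variable missing from its output is not set in that shell.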

For more information on building, see this link.