In StarCraft, the Hyperion is a Behemoth-class battlecruiser. During the Second Great War, Raynor's Raiders made strategic decisions on the Hyperion's bridge -- the battlecruiser's command center.
A Scala library and set of abstractions for AWS Data Pipeline.
This project was migrated from https://github.com/krux/hyperion. Please refer to that repo for prior commit history.
This project aims to make it easy to define an AWS Data Pipeline using a clear, fluent Scala DSL.
Add the Sonatype.org Releases repo as a resolver in your build.sbt or Build.scala, as appropriate:
resolvers += Resolver.sonatypeRepo("releases")
Add Krux Hyperion as a dependency in your build.sbt or Build.scala, as appropriate:
libraryDependencies ++= Seq(
// Other dependencies ...
"com.krux" %% "hyperion" % "7.0.0"
)
This project is compiled, tested, and published for the following Scala versions:
- 2.12
- 2.13
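Putting the resolver and the dependency together, a minimal build.sbt might look like the sketch below. The project name and the exact Scala patch version are placeholders chosen for illustration and are not taken from this project; only the resolver and the dependency coordinates come from the instructions above.

```scala
// build.sbt: minimal sketch; `name` and the Scala patch version are assumed placeholders
name := "my-pipelines"

// Pick one of the supported binary versions (2.12 or 2.13)
scalaVersion := "2.13.12"

// Sonatype releases repository, as described above
resolvers += Resolver.sonatypeRepo("releases")

libraryDependencies ++= Seq(
  // Other dependencies ...
  "com.krux" %% "hyperion" % "7.0.0"
)
```

Because the dependency uses `%%`, sbt appends the Scala binary version to the artifact name, so the same line works for either supported Scala version.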
Some pipeline steps need supporting scripts for execution. These scripts need to be uploaded to an S3 bucket where AWS Data Pipeline can access them.
Configure an S3 bucket and upload the scripts to that bucket with the following command:
$ ./deploy-scripts.sh s3://your-bucket/scripts
In your pipeline configuration, be sure to set hyperion.script.uri = s3://your-bucket/scripts/
To create a new pipeline, create a Scala class in com.krux.datapipeline.pipelines.
Look at ExampleSpark for an example pipeline.
To generate a JSON file describing the pipeline, ensure you have created the assembly:
$ sbt assembly
Then, run Krux Hyperion with the class name (specify the external jar location if it's not in the classpath):
$ ./hyperion [-jar your-jar-implementing-pipelines.jar] your.pipelines.ThePipeline generate > ThePipeline.json
Then go to the AWS Data Pipeline Management Console, click Create new pipeline, enter the class name for Name, click Import a definition, and select Load local file. Finally, click Activate.
To create a pipeline automatically, ensure you have created the assembly:
$ sbt assembly
Then, run Krux Hyperion with create and the class name:
$ ./hyperion [-jar your-jar-implementing-pipelines.jar] your.pipelines.ThePipeline create
This uses the AWS Data Pipeline API to create the pipeline and put its definition.
You can activate a pipeline in the Data Pipeline Management Console, by passing the --activate option to the create command, or by using the activate command:
$ ./hyperion activate df-1234567890
Scaladoc API documentation is available for this project.
Due to an AWS DataPipeline bug, all schemas involving data pipelines need to be available in the default search_path.
For more details: https://forums.aws.amazon.com/thread.jspa?threadID=166340