Note: Limbo is a work in progress.
Limbo is a Scala API that lets you combine the strengths of different data processing frameworks by providing seamless transitions between framework-specific data structures.
- Integration between Scio and Spark
- Programmatic Spark job submission to an Apache YARN cluster
- Scala API for Google Dataproc cluster
// Start in Scio:
val (sc, args) = ContextAndArgs(argv)
val scol = sc.parallelize(1 to 10)

// Move to Spark realm:
scol.toRDD().map { rdd =>
  rdd
    .map(_ * 2)
    .saveAsTextFile(args("output"))
}
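The `toRDD()` call above reads like a method on the Scio collection itself. One common way to get that effect in Scala is an implicit enrichment class. The following is a minimal, self-contained sketch of that pattern only, not Limbo's actual implementation. `FakeSCollection`, `FakeRDD`, `Bridge` and `BridgeDemo` are hypothetical stand-ins invented for illustration. They are not Scio, Spark or Limbo types, and the sketch does not model whatever wrapper the real `toRDD()` returns.

```scala
// Hypothetical stand-ins: NOT the real Scio or Spark classes.
final case class FakeSCollection[T](items: Seq[T])

final case class FakeRDD[T](items: Seq[T]) {
  // Element-wise transformation, mirroring the shape of an RDD-style map.
  def map[U](f: T => U): FakeRDD[U] = FakeRDD(items.map(f))
}

object Bridge {
  // Enrichment that adds a toRDD() method to the first framework's
  // collection type without modifying that type.
  implicit class SCollectionOps[T](self: FakeSCollection[T]) {
    def toRDD(): FakeRDD[T] = FakeRDD(self.items)
  }
}

object BridgeDemo {
  import Bridge._

  // Start in one "framework", cross over, and keep transforming.
  def doubled(): Seq[Int] =
    FakeSCollection(1 to 3).toRDD().map(_ * 2).items
}
```

Importing `Bridge._` is what brings the conversion into scope, so the crossing stays explicit at the call site while the source type stays untouched.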
This project adheres to the Open Code of Conduct. By participating, you are expected to honor this code.
Copyright 2016 Spotify AB.
Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0