Add conversion methods to backend-specific distributed collections.
aalexandrov opened this issue · 1 comments
aalexandrov commented
I suggest the following design.
Extend the DataBag trait with the following method:
def as[Coll[_]](implicit converter: CollConverter[Coll]): Coll[A] =
  converter(this)
Define a type class CollConverter[Coll] in org.emmalanguage.api (in emma-language) to model support for converting a DataBag to a distributed collection Coll.
Define type class instances in org.emmalanguage.api in emma-spark and emma-flink to realize the conversion. Since the signature of the apply method will be
def apply[A: Meta](bag: DataBag[A]): Coll[A]
the implementations can do dynamic casting with pattern matching and throw an error if the input collection is not supported (e.g. if one attempts to call xs.as[DataSet] on a Spark-based DataBag).
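To make the proposal concrete, here is a minimal, self-contained sketch of how the pieces could fit together. It does not use the real Emma, Spark, or Flink types: Meta is reduced to an empty marker type class, FakeRDD and FakeDataSet stand in for the backend collections, and SparkLikeBag and FlinkLikeBag stand in for the backend-specific DataBag implementations. All of these names are hypothetical, and placing the instances in the companion objects of the stand-in collections is an assumption made only so the sketch resolves implicits without extra imports.

```scala
object ConversionSketch {
  // Simplified stand-in for Emma's Meta context bound (assumption).
  trait Meta[A]
  object Meta {
    implicit def anyMeta[A]: Meta[A] = new Meta[A] {}
  }

  // Type class modelling conversion from a DataBag to a collection Coll.
  trait CollConverter[Coll[_]] {
    def apply[A: Meta](bag: DataBag[A]): Coll[A]
  }

  // Reduced DataBag: only the proposed `as` method is shown.
  trait DataBag[A] {
    implicit def m: Meta[A]
    def as[Coll[_]](implicit converter: CollConverter[Coll]): Coll[A] =
      converter(this)
  }

  // Hypothetical stand-ins for the backend collections
  // (NOT the real Spark RDD or Flink DataSet).
  final case class FakeRDD[A](elems: Seq[A])
  final case class FakeDataSet[A](elems: Seq[A])

  // Hypothetical backend-specific DataBag implementations.
  final class SparkLikeBag[A](val rdd: FakeRDD[A])(implicit val m: Meta[A])
    extends DataBag[A]
  final class FlinkLikeBag[A](val ds: FakeDataSet[A])(implicit val m: Meta[A])
    extends DataBag[A]

  object FakeRDD {
    // Instance for the Spark-like backend: dynamic cast via pattern matching.
    implicit val converter: CollConverter[FakeRDD] = new CollConverter[FakeRDD] {
      def apply[A: Meta](bag: DataBag[A]): FakeRDD[A] = bag match {
        case b: SparkLikeBag[A] => b.rdd
        case other => throw new UnsupportedOperationException(
          s"Cannot convert ${other.getClass.getSimpleName} to FakeRDD")
      }
    }
  }

  object FakeDataSet {
    // Instance for the Flink-like backend.
    implicit val converter: CollConverter[FakeDataSet] = new CollConverter[FakeDataSet] {
      def apply[A: Meta](bag: DataBag[A]): FakeDataSet[A] = bag match {
        case b: FlinkLikeBag[A] => b.ds
        case other => throw new UnsupportedOperationException(
          s"Cannot convert ${other.getClass.getSimpleName} to FakeDataSet")
      }
    }
  }
}
```

With these definitions, calling xs.as[FakeRDD] on a SparkLikeBag returns the wrapped collection, while xs.as[FakeDataSet] on the same bag compiles but fails at runtime with an UnsupportedOperationException. That is the trade-off of the dynamic-cast approach: an unsupported conversion is reported when the program runs, not rejected by the compiler.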
aalexandrov commented
I took a stab at that in #298.