emmalanguage/emma

Add conversion methods to backend-specific distributed collections.

aalexandrov opened this issue · 1 comments

I suggest the following design.

Extend the DataBag trait with the following method:

def as[Coll[_]](implicit converter: CollConverter[Coll]): Coll[A] =
  converter(this)

Define a type class CollConverter[Coll] in org.emmalanguage.api in emma-language to model support for converting a DataBag to a distributed collection Coll.

Define type class instances in org.emmalanguage.api in emma-spark and emma-flink to realize the conversion. Since the signature of the apply method will be

def apply[A: Meta](bag: DataBag[A]): Coll[A]

the implementations can do dynamic casting with pattern matching and throw an error if the input collection is not supported (e.g. if one attempts to do xs.as[DataSet] on a Spark-based DataBag).
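
To make the proposal concrete, here is a minimal, self-contained sketch of how the pieces could fit together. It is only an illustration under stated assumptions. FakeRDD and FakeDataSet are hypothetical stand-ins for the Spark and Flink collections, SparkBag and FlinkBag are hypothetical backend-specific DataBag implementations, and Meta is reduced to an empty marker trait. None of these reflect the actual Emma, Spark, or Flink classes.

```scala
object ConversionSketch {
  // Stand-in for Emma's Meta context bound (assumption: an empty marker is enough here).
  trait Meta[A]
  object Meta { implicit def anyMeta[A]: Meta[A] = new Meta[A] {} }

  // Hypothetical stand-ins for the backend collections (think RDD and DataSet).
  final case class FakeRDD[A](elems: Seq[A])
  final case class FakeDataSet[A](elems: Seq[A])

  // The proposed type class: converts a DataBag into a collection Coll.
  trait CollConverter[Coll[_]] {
    def apply[A: Meta](bag: DataBag[A]): Coll[A]
  }

  // DataBag extended with the proposed `as` method.
  trait DataBag[A] {
    implicit def m: Meta[A]
    def as[Coll[_]](implicit converter: CollConverter[Coll]): Coll[A] =
      converter(this)
  }

  // Hypothetical backend-specific DataBag implementations wrapping a native collection.
  final case class SparkBag[A](rep: FakeRDD[A])(implicit val m: Meta[A]) extends DataBag[A]
  final case class FlinkBag[A](rep: FakeDataSet[A])(implicit val m: Meta[A]) extends DataBag[A]

  // Instance that would live in emma-spark: unwraps Spark-based bags, rejects all others.
  implicit object RddConverter extends CollConverter[FakeRDD] {
    def apply[A: Meta](bag: DataBag[A]): FakeRDD[A] = bag match {
      case SparkBag(rep) => rep
      case other => throw new UnsupportedOperationException(
        s"Cannot convert ${other.getClass.getName} to FakeRDD")
    }
  }

  // Instance that would live in emma-flink: unwraps Flink-based bags, rejects all others.
  implicit object DataSetConverter extends CollConverter[FakeDataSet] {
    def apply[A: Meta](bag: DataBag[A]): FakeDataSet[A] = bag match {
      case FlinkBag(rep) => rep
      case other => throw new UnsupportedOperationException(
        s"Cannot convert ${other.getClass.getName} to FakeDataSet")
    }
  }

  def main(args: Array[String]): Unit = {
    val xs: DataBag[Int] = SparkBag(FakeRDD(Seq(1, 2, 3)))
    val rdd: FakeRDD[Int] = xs.as[FakeRDD]             // matching backend: unwraps the bag
    val failed = scala.util.Try(xs.as[FakeDataSet])    // wrong backend: throws at runtime
    println(rdd)
    println(failed.isFailure)
  }
}
```

Note that in this sketch an unsupported conversion compiles whenever both converters are in implicit scope and only fails at runtime, which is the trade-off of the dynamic-casting approach.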

I took a stab at that in #298.