harsha2010/magellan

From Spark DataFrame to Polygon?

laurikoobas opened this issue · 3 comments

How is that supposed to work? For Points there's the point($"x", $"y") way, but how do I do the same thing with Polygons?

I have a spark dataframe that has an array of Points in one column and I'd like to turn that into a dataframe that has Polygons in a column that are based on those arrays of Points.

@laurikoobas you can create a user-defined function (UDF) to do this: invoke Polygon(Array(0), points), where points is the array of points representing the polygon. We expect this array to be a closed loop, i.e. the first and last point in the array should be the same.
the UDF will look something like
val toPolygon = udf{(points: Array[Point]) => Polygon(Array(0), points)}
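Since the ring has to be closed, a small helper that appends the starting point when needed may be useful. This is a hedged sketch using a stand-in Point case class (magellan's Point(x, y) is assumed to compare the same way); the helper name closeRing is illustrative, not part of magellan:

```scala
// Stand-in for magellan.Point, used only to illustrate the closed-loop rule.
case class Point(x: Double, y: Double)

// Append the starting point if the ring is not already closed,
// so the sequence satisfies Polygon's first-point == last-point expectation.
def closeRing(points: Seq[Point]): Seq[Point] =
  if (points.nonEmpty && points.head != points.last) points :+ points.head
  else points

val open = Seq(Point(0, 0), Point(1, 0), Point(1, 1))
val ring = closeRing(open)
assert(ring.head == ring.last) // the ring is now a loop
```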

Right, that makes sense. And I apologize for continuing to post in this issue, but I don't know of a better avenue for asking for help on this.

I have this:

scala> p.printSchema
root
 |-- area_id: string (nullable = true)
 |-- line_index: integer (nullable = false)
 |-- points: array (nullable = true)
 |    |-- element: point (containsNull = true)

And do this:

val toPolygon = udf{(points: Array[Point]) => Polygon(Array(0), points)}
val a = p.select($"area_id", $"line_index", toPolygon($"points"))
a.show

And the result was this (after a few pages of stack trace):
scala.collection.mutable.WrappedArray$ofRef cannot be cast to [Lmagellan.Point

I figured it out after a while. The UDF should take a Seq instead of an Array, because Spark passes array columns to UDFs as WrappedArray:
val toPolygon = udf{(points: Seq[Point]) => Polygon(Array(0), points.toArray)}
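The ClassCastException above can be reproduced in plain Scala, without Spark or magellan. A WrappedArray is a Seq but not an Array, so a UDF parameter typed Array[Point] forces an invalid cast, while a Seq[Point] parameter plus .toArray works. A minimal sketch (stand-in Point case class; magellan's Point is assumed):

```scala
// Stand-in for magellan.Point.
case class Point(x: Double, y: Double)

// Array.toSeq wraps the array the same way Spark hands array columns to a
// UDF: as a Seq-typed wrapper (WrappedArray on Scala 2.12), not an Array.
val fromSpark: Seq[Point] =
  Array(Point(0, 0), Point(1, 0), Point(1, 1), Point(0, 0)).toSeq

// fromSpark.asInstanceOf[Array[Point]] would throw
// "WrappedArray$ofRef cannot be cast to [Lmagellan.Point",
// but an explicit conversion is safe:
val arr: Array[Point] = fromSpark.toArray
assert(arr.length == 4)
```

This is why the corrected UDF declares Seq[Point] and converts with .toArray before calling Polygon.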