Issues Using Indexed Columns While Doing a Spatial Join
mdbuck opened this issue · 3 comments
mdbuck commented
Attached is a driver application that demonstrates several failures when performing a spatial join between a point DataFrame and a polygon DataFrame using the new indexing feature in Magellan 1.0.5. Any of the following may occur:
- a NullPointerException is thrown;
- an OutOfMemoryError is thrown;
- the JVM crashes;
- the application finishes without error, but contains.show() prints gibberish to the console.
harsha2010 commented
How many nodes are you using? A single driver and no workers? How big is your driver node? And how much data are we talking about (polygons and points)?
mdbuck commented
The driver is a command-line application that starts Spark with spark.master set to local[1], so there is a single local driver and no workers.
The data is small: the polygon table contains 5 rows, and the largest polygon has 9 vertices; the point table contains 8 rows.
I have simplified the driver application. Please see attached.
mdbuck commented
Is there any news on this?
Thanks.