Outputting incorrect data
steveshi7 opened this issue · 2 comments
I’ve successfully gotten the library to output data after taking in an RDD of points and an RDD of polygons, but after manually testing the results of the join operations (contains and intersects), the results don’t seem to be accurate when plotted on a map.
The RDDs are of the format (STObject(WKTstring), (arr(id).toLong, WKTstring)). The Point RDD has 10,000 items, while the Polygon RDD has 500,000+. My join command is
polygonsRDDA.join(pointsRDDA, JoinPredicate.CONTAINS)
I'm fairly certain the format is correct, as are the WKT strings, since I'm getting a valid [(polygon_id, WKT)(point_id, WKT)] output RDD with a substantial amount of data.
Here is one row of the output:
[7968,POINT (77.2221885273425 28.5089347347766)]|
|[929587445047033467,POLYGON ((77.24398775026202 28.61936221830547, 77.24380536004901 28.61944234929979, 77.24360687658191 28.61956941895187, 77.24423987790942 28.620445327833295, 77.24442763254046 28.62033703364432, 77.24459392949939 28.620238127186894, 77.2441808693111 28.61965893767774, 77.24398775026202 28.61936221830547))]
Plugging both geometries into a WKT visualizer shows that the polygon and the point are in fact far away from each other.
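The same check can be done without a visualizer. The following is a minimal sketch in plain Scala, assuming nothing about STARK's internals: it applies a standard ray-casting point-in-polygon test to the coordinates of the row above. The names `ContainsCheck` and `contains` are made up for this illustration and are not part of STARK, Spark, or JTS.

```scala
// Ray-casting point-in-polygon test; no Spark, STARK, or JTS required.
// All names here are hypothetical helpers for illustration only.
object ContainsCheck {
  type Coord = (Double, Double) // (x = longitude, y = latitude), as in the WKT

  /** True if p lies inside the ring. Points exactly on the boundary are not treated specially. */
  def contains(ring: Seq[Coord], p: Coord): Boolean = {
    val (px, py) = p
    // Pair each vertex with its successor; a closed WKT ring repeats its first vertex at the end.
    val edges = ring.zip(ring.tail)
    val crossings = edges.count { case ((x1, y1), (x2, y2)) =>
      // The edge straddles the horizontal line through p and crosses it to the right of p.
      (y1 > py) != (y2 > py) &&
        px < x1 + (py - y1) * (x2 - x1) / (y2 - y1)
    }
    crossings % 2 == 1 // an odd number of crossings means the point is inside
  }

  // Coordinates copied from the output row above.
  val polygon: Seq[Coord] = Seq(
    (77.24398775026202, 28.61936221830547),
    (77.24380536004901, 28.61944234929979),
    (77.24360687658191, 28.61956941895187),
    (77.24423987790942, 28.620445327833295),
    (77.24442763254046, 28.62033703364432),
    (77.24459392949939, 28.620238127186894),
    (77.2441808693111, 28.61965893767774),
    (77.24398775026202, 28.61936221830547)
  )
  val point: Coord = (77.2221885273425, 28.5089347347766)

  def main(args: Array[String]): Unit =
    println(contains(polygon, point)) // prints "false"
}
```

Every polygon vertex has a latitude above 28.619 while the point sits at about 28.509, so no edge crosses the horizontal line through the point and the test returns false. This only confirms that the two geometries printed in this row do not satisfy a contains relation; it does not say whether the row was actually produced by the join as a matching pair.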
Any help would be appreciated!
Hi,
thanks for the report! I will look into this and provide a fix as soon as possible.
It turned out to be a confusion in the output, not a bug in STARK. Closing.