Increasing Performance with Convex Hull Approach
Opened this issue · 13 comments
Repository works great with basic polygons, but when vertices reach large numbers it takes forever to build the graph. Using convex hull approach will reduce the number of vertices dramatically and improve the performance. In some of my tests, performance improved by 5000%.
This may be a good approach to consider.
@bulutkartal as discussed by mail I would definitely look into this. Would be great if you could provide a dataset example of the graph you are aiming to build.
I used random shape file from NOAA. Then used DotSpatial to read the shapefile and DotSpatial.Topology to get Convex Hull point coordinates.
I guess the best way to reduce the workload;
-Create a direct line between start and end points,
-If line crosses the polygon; get convex hull points and create a new polygon out of this;
-Line close to end point may not cross the whole polygon; so we can clip the polygon and rebuild last polygon.
This will reduce the number of edges and vertices. So will increase the graph performance.
I am working on it by using external libraries; whenever I have a time I can implement this in c# and make a pull request :)
For Graph.cs
In the code lines starting with if (vertexCount > 2), you test the vertext count, but the count did not change since the last test if (vertexCount >= 3). Drop this second if. Then you create a new polygon with gPolygon gPol = new gPolygon();. This polygon gets replace immediately afterwards by the out parameter in polygons.TryGetValue(newId, out gPol). Either TryGetValue yields true, then gPol become the polygon found in the collection, or TryGetValue yields false and gPol becomes null. Do not assign gPol.
Since newId gets created before the for-loop, you can move the code finding or creating the polyon out of the loop
int vertexCount = vertices.Count();
if (vertexCount >= 3)
{
int newId = GetNextId();
if (!polygons.TryGetValue(newId, out Polygon gPol)) {
gPol = new Polygon { id = newId };
polygons.Add(newId, gPol);
}
for (var j = 0; j < vertexCount; j++)
{
int next_index = (j + 1) % vertexCount;
Vertex vertex = vertices[j];
Vertex next_vertex = vertices[next_index];
Edge edge = new Edge(vertex, next_vertex);
vertex.polygonId = newId;
next_vertex.polygonId = newId;
gPol.edges.Add(edge);
AddEdge(edge);
}
}
In AddEdge you are repeating the same error of overwriting the lists just created by TryGetValue.
The out parameter replaces the value of startEdgesList and endEdgesList either with an existing list or with null, so the two new List() are useless
I guess this will increase the performance.
We can do the same improvements on visibility graph as well to speed it up.
@bulutkartal thanks for looking at this. To be honest I didn't have the time to look at this just yet. In any case it should be an improvement, but it still looks at every vertex so it will still scale badly on huge datasets.
@bulutkartal it has been a long time but I've finally sat down and try to make this better. I checked the ConvexGraph method and looks promising. I've just implemented a basic ConvexHull method. Hopefully I can make a good progress!
@alvpickmans Can't wait to check the performance. Hopefully I can test it with 2M nodes next week :)
@alvpickmans Looks good but definitely needs some improvements. I saw that you already opened a Issue for Edge-Polygon intersection algorithm. The best in C# so far :)
yes, until now I've just been refactoring the project :)
I admit there's a lot going on at the office so little time, but yes I'm looking into the Edge-Polygon intersection atm. A naive method is easy enough but I'm on the look for a faster algorithm. Feel free to comment on the issue if you have any reference/suggestion!
So far I tested it with a geojson file. I downloaded the Mercator Large simplified polygons not split from OSM.
I picked one of the largest polygons which has 189,593 vertices :)
` public virtual List CreateVerticesList()
{
var vertexlist = new List();
var geojsonobj = new JObject();
string fileName = "WorldLandPolygons.geojson";
string path = Path.Combine(Environment.CurrentDirectory, @"wwwroot\files\", fileName);
using (StreamReader file = File.OpenText(path))
using (JsonTextReader reader = new JsonTextReader(file))
{
JObject o2 = (JObject)JToken.ReadFrom(reader);
geojsonobj = o2;
}
foreach (var x in geojsonobj.SelectTokens("features[*]"))
{
var fidpath = x.Path + ".properties.FID";
var fid = geojsonobj.SelectToken(fidpath).ToString();
var pathd = x.Last().Last().Last().Last().Last().First();
if(fid == "40674")
{
foreach (var y in pathd)
{
vertexlist.Add(Vertex.ByCoordinates(Convert.ToDouble(y.Last()), Convert.ToDouble(y.First())));
}
}
}
return vertexlist;
}`
then
`var vList = CreateVerticesList();
vList.RemoveAt(vList.Count() - 1);
var testpolygon = new Polygon(vList);
var testpolylist = new List
{
testpolygon
};
var fVGraph = VisibilityGraph.ByBaseGraph(new Graph(testpolylist));`
but on DistanceToIntersection method in EdgeKey.cs it returns NaaN and cause System.NullReferenceException in ArcRadAngle in Vertex Js.
So far creating List from json took 11s, which is not bad. Creating polygon 122ms. but visibility graph takes forever and couldn't complete the process bc of null exception.
Saving the graph into db, or json would be a great option too. This will save a lot time for runs after initial.