toruseo/UXsim

Speed up `route_search_all` by using priority queue optimized Dijkstra's

EwoutH opened this issue · 6 comments

After some benchmarks, I discovered the route_search_all takes up a majority (~80% of runtime) in my simulation. This feature request proposes optimizing the route search algorithm, specifically by replacing the current implementation with a more efficient algorithm suited for sparse graphs that are often encountered in real-world road networks.

Issue:
The Floyd-Warshall algorithm, while comprehensive in determining the shortest distances between all pairs of vertices, does not inherently provide the immediate next step in the shortest path from one node to another. The manual computation of this next matrix, especially in a graph of this scale and sparsity, is inefficient and time-consuming, as indicated by execution timings: next matrix computed in 19.861 sec.

UXsim/uxsim/uxsim.py

Lines 992 to 999 in 584b5d7

for i in range(n_vertices):
for j in range(n_vertices):
# iからjへの最短経路を逆にたどる... -> todo: 起終点を逆にした最短経路探索にすればよい
if i != j:
prev = j
while s.pred[i, prev] != i and s.pred[i, prev] != -9999:
prev = s.pred[i, prev]
s.next[i, j] = prev

Proposed Solution:
Adopt Dijkstra's algorithm, with a priority queue, for the computation of shortest paths. Dijkstra's algorithm is more suited to sparse graphs and inherently provides the immediate next node on the shortest path as part of its output, thus eliminating the need for the currently cumbersome post-processing step.

Technical Justification:

  • Efficiency in Sparse Graphs: Dijkstra's algorithm scales better with sparse graphs, operating at (O((V + E) \log V)) complexity, where (V) is the number of vertices and (E) is the number of edges. This is a significant improvement over the Floyd-Warshall’s (O(V^3)) complexity, making Dijkstra's algorithm particularly suitable for our graph's sparse nature.
  • Direct Path Reconstruction: Unlike Floyd-Warshall, Dijkstra’s algorithm directly provides the next node in the shortest path from the source to every other node. This feature simplifies the routing module by eliminating the need for additional calculations to determine the next steps in paths.
  • Optimization Potential: Shifting to Dijkstra's algorithm with a priority queue not only addresses the current bottleneck but also opens up opportunities for further optimizations, including potentially leveraging parallel computations for running Dijkstra's from multiple sources simultaneously.

Thanks. Yes, I completely agree that the current implementation is very awkward and inefficient. I will consider this solution later.

One thing that might be intereresting, is that while the weights (travel times) are different continiously, the graph connections themselves are static (we can assume that, right?). Therefore only the edge weights need to be updated, which again might allow to simplify the route search.

I assume the graph connections are static.

I updated this route_search_all function. It now uses numpy/scipy functions instead of the redundant for-loops. In a medium scale scenario, the computation time of route_search_all function was reduced from 1 sec to almost zero.

The solution is extremely simple. Just transposed the matrices twice. Sorry for being late.

Thanks a lot, actually this is exactly in time for some of my research!

Will post some benchmark numbers on my own models in the coming days.

On this network:

 number of vehicles:	 107325 veh
 total road length:	 2166278.9831619784 m
 time discret. width:	 25 s
 platoon size:		 25 veh
 number of timesteps:	 288
 number of platoons:	 4293
 number of links:	 3272
 number of nodes:	 1545

Performance with UXsim 1.3.0:

      time| # of vehicles| ave speed| computation time
       0 s|        0 vehs|   0.0 m/s|     0.00 s
     300 s|     8600 vehs|   6.1 m/s|    97.05 s
     600 s|    17150 vehs|   5.8 m/s|   171.31 s

Performance with UXsim 1.3.1:

      time| # of vehicles| ave speed| computation time
       0 s|        0 vehs|   0.0 m/s|     0.01 s
     300 s|     8625 vehs|   6.1 m/s|     7.93 s
     600 s|    17300 vehs|   5.7 m/s|    14.13 s

Over an 10x speedup in general runtime. That's insane. Really awesome!