This project uses networkx to estimate drive times from source nodes to destination nodes on the UK road network.
You may install this project directly with pip (or similar) using:
pip install git+https://github.com/cjber/ukroutes
Alternatively, you can install the project locally using the following steps:
- Clone the repository:
git clone https://github.com/cjber/ukroutes.git
cd ukroutes
- Set up a Python virtual environment:
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
- Install the required dependencies:
pip install -r requirements.lock
This project requires several data files that cannot be redistributed.
The Python script scripts/demo.py gives a simple overview of this library:
import pandas as pd
from tqdm import tqdm

from ukroutes.common.utils import Paths
from ukroutes.routing import Route


def process_dentists():
    # Combine the English and Scottish dentist lists and attach postcode coordinates
    dentists_eng = pd.read_csv(Paths.RAW / "dentists_england.csv")
    dentists_scot = pd.read_csv(Paths.RAW / "dentists_scotland.csv")
    postcodes = pd.read_parquet(Paths.PROCESSED / "onspd" / "postcodes.parquet")

    # Strip spaces so postcodes match the lookup table
    dentists_eng["postcode"] = dentists_eng["postcode"].str.replace(" ", "")
    dentists_scot["postcode"] = dentists_scot["postcode"].str.replace(" ", "")
    dentists = pd.concat([dentists_eng, dentists_scot])
    dentists = dentists.merge(postcodes, on="postcode")
    dentists.drop(columns="postcode").to_parquet(Paths.PROCESSED / "dentists.parquet")


process_dentists()

postcodes = pd.read_parquet(Paths.PROCESSED / "postcodes.parquet")
nodes = pd.read_parquet(Paths.PROCESSED / "oproad" / "nodes.parquet")
edges = pd.read_parquet(Paths.PROCESSED / "oproad" / "edges.parquet")

# Route from every processed point-of-interest file to all postcodes
pq_files = list(Paths.PROCESSED.glob("*.parquet"))
for file in tqdm(pq_files):
    source = pd.read_parquet(file).dropna(subset=["easting", "northing"])
    route = Route(source=source, target=postcodes, nodes=nodes, edges=edges)
    distances = route.route()
    distances.to_parquet(Paths.OUT / f"{file.stem}_distances.parquet")
The primary goal of this project is to determine the distance from each postcode in Great Britain to points of interest. Given there are over 1.7 million postcodes, the processing is inverted: instead of routing from each postcode to each point of interest, routes are calculated from points of interest to all nodes in the graph, and these nodes are then filtered to find postcodes. The following gives an overview of the sequential processing steps involved.
- Process the OS Open Road Network
Ordnance Survey publish road speed estimates alongside their road network documentation. These estimates are used to assign an average speed to each road, and drive-time estimates are then derived from the length of each road's linestring geometry. For example, the road speed estimate for all motorways is 67mph, while for single carriageway A and B roads the estimate is 25mph. These speeds are converted to drive-time in minutes using the road length.
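The conversion from speed and length to drive-time can be sketched as follows. This is a minimal illustration, assuming road lengths are in metres; the function name and dictionary keys are hypothetical and not taken from the library, and only the two speed figures come from the text above.

```python
METRES_PER_MILE = 1609.344

# Speed estimates in mph. The keys are illustrative labels, not the
# road classification names used by OS Open Roads.
SPEED_MPH = {
    "motorway": 67,
    "single_carriageway_a_b": 25,
}


def drive_time_minutes(length_m: float, speed_mph: float) -> float:
    """Return the minutes needed to travel length_m metres at speed_mph."""
    metres_per_minute = speed_mph * METRES_PER_MILE / 60
    return length_m / metres_per_minute


# A one-mile stretch of motorway at 67mph takes 60 / 67 minutes.
one_mile = drive_time_minutes(METRES_PER_MILE, SPEED_MPH["motorway"])
```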
OS Open Roads does not include ferry routes. These were therefore taken from OpenStreetMap (OSM), using the Overpass API (http://overpass-turbo.eu) with the query found here. KDTree from scipy.spatial is then used to determine the nearest road node to the start and end location of each route, allowing the routes to be added directly to the road network. The speed estimate for these routes is 25mph, around the speed of an average ferry.
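The snapping step described above could look something like the sketch below. The coordinates are made up for illustration, and the variable names are assumptions; the project's actual ferry processing may differ.

```python
import numpy as np
from scipy.spatial import KDTree

# Hypothetical road node coordinates (easting, northing) in metres.
road_nodes = np.array([
    [0.0, 0.0],
    [1000.0, 0.0],
    [0.0, 1000.0],
    [5000.0, 5000.0],
])

# Hypothetical ferry route endpoints, one row per route.
ferry_starts = np.array([[900.0, 100.0]])
ferry_ends = np.array([[4800.0, 5100.0]])

tree = KDTree(road_nodes)
# query returns (distances, indices) of the nearest road node to each point.
_, start_idx = tree.query(ferry_starts)
_, end_idx = tree.query(ferry_ends)

# Each ferry route becomes an edge between its two snapped road nodes.
ferry_edges = list(zip(start_idx.tolist(), end_idx.tolist()))
```

Each pair in `ferry_edges` holds the row indices of the road nodes nearest to a ferry route's start and end, which can then be added to the network as a new edge.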
Despite the addition of ferry routes connecting isolated road networks on islands to the mainland, there were still road nodes that did not connect to the main road network. These did not appear to follow any pattern and were distributed evenly across GB. They were therefore removed after being identified using the nx.connected_components() function.
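One way to do this with `nx.connected_components` is to keep only the largest component, as in the sketch below. Keeping the largest component is an assumption about how the disconnected nodes are chosen for removal, and the toy graph is purely illustrative.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([(1, 2), (2, 3), (3, 4)])  # main road network
G.add_edges_from([(10, 11)])  # small disconnected fragment

# connected_components yields one set of nodes per component.
largest = max(nx.connected_components(G), key=len)

# Keep only the nodes and edges that belong to the largest component.
G_main = G.subgraph(largest).copy()
```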
- Add Postcodes and POIs to the road network
The add_to_graph method creates new nodes at the location of a collection of easting and northing coordinates. These nodes are then added to the road network by generating a new edge between each point and its nearest k road nodes using a KDTree, with a speed estimate of 25mph.
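The idea can be illustrated with the standalone sketch below. It is not the library's `add_to_graph` implementation: the function `attach_points`, its parameters, and the `minutes` edge attribute are hypothetical, and the edge weight assumes straight-line distance in metres travelled at 25mph.

```python
import networkx as nx
import numpy as np
from scipy.spatial import KDTree

METRES_PER_MILE = 1609.344


def attach_points(G, node_xy, points, k=2, speed_mph=25):
    """Connect each new point to its k nearest road nodes.

    node_xy maps road node id -> (easting, northing) and points maps
    new node id -> (easting, northing). Edge weights are minutes at
    speed_mph over the straight-line distance (an assumption here).
    """
    ids = list(node_xy)
    tree = KDTree(np.array([node_xy[i] for i in ids]))
    metres_per_minute = speed_mph * METRES_PER_MILE / 60
    for pid, xy in points.items():
        dists, idxs = tree.query(xy, k=k)
        # atleast_1d handles k=1, where query returns scalars.
        for dist, idx in zip(np.atleast_1d(dists), np.atleast_1d(idxs)):
            G.add_edge(pid, ids[idx], minutes=float(dist) / metres_per_minute)
    return G


# Usage: attach one hypothetical postcode to a three-node road network.
node_xy = {"a": (0.0, 0.0), "b": (1000.0, 0.0), "c": (10000.0, 0.0)}
G = nx.Graph()
G.add_edge("a", "b", minutes=1.5)
G.add_edge("b", "c", minutes=13.0)
G = attach_points(G, node_xy, {"postcode_1": (400.0, 0.0)}, k=2)
```

Connecting to more than one nearby node reduces the chance that a point is attached only to a road node that is poorly placed for routing.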
- Routing from POIs to postcodes
While the interest is in determining the distance from postcodes to POIs, the previous processing allows for a large speed-up by considering the reverse of this task. The Route class in routing.py primarily routes using the multi-source shortest path algorithm nx.multi_source_dijkstra, which allows for weighted routing from points of interest to all other nodes in a graph. This approach means that for each node associated with a postcode, the minimum returned distance indicates the nearest POI by drive-time.
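The toy example below shows the inverted routing on a small graph. It calls `nx.multi_source_dijkstra` directly rather than going through the `Route` class, and the node names and the `minutes` edge attribute are illustrative assumptions.

```python
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from(
    [
        ("poi_a", "n1", 2.0),
        ("n1", "n2", 3.0),
        ("n2", "poi_b", 1.0),
        ("n1", "pc_1", 1.0),
        ("n2", "pc_2", 0.5),
    ],
    weight="minutes",
)

# One pass from all POIs gives every node's time to its nearest POI.
distances, paths = nx.multi_source_dijkstra(
    G, {"poi_a", "poi_b"}, weight="minutes"
)

# Filter the results down to the postcode nodes.
postcode_nodes = ("pc_1", "pc_2")
postcode_times = {n: distances[n] for n in postcode_nodes}
# Each returned path starts at the source (POI) it was reached from.
nearest_poi = {n: paths[n][0] for n in postcode_nodes}
```

A single multi-source run replaces one shortest-path search per postcode, which is where the speed-up comes from when postcodes far outnumber points of interest.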