This repository includes instructions and sample queries to access Overture Maps Data.
We also welcome feedback about Overture Maps data in the Discussions. Feedback on the data schema is best provided in the discussions in the schema repository.
Overture Maps data is available in cloud-native Parquet format.
There is no single Overture "entire planet" file to be downloaded. Instead, we
have organized the data for the Overture 2023-10-19-alpha.0
release by theme and type at the following locations:
Theme | Location |
---|---|
Admins | `s3://overturemaps-us-west-2/release/2023-10-19-alpha.0/theme=admins/` |
Buildings | `s3://overturemaps-us-west-2/release/2023-10-19-alpha.0/theme=buildings/` |
Places | `s3://overturemaps-us-west-2/release/2023-10-19-alpha.0/theme=places/` |
Transportation | `s3://overturemaps-us-west-2/release/2023-10-19-alpha.0/theme=transportation/` |
Base | `s3://overturemaps-us-west-2/release/2023-10-19-alpha.0/theme=base/` |

The same `theme=`/`type=` layout is mirrored in Azure Blob Storage under `https://overturemapswestus2.blob.core.windows.net/release/2023-10-19-alpha.0/`.
The Parquet files match the Overture Data Schema for each theme with the following enhancements:
- The `id` column contains temporary IDs that are not yet part of the Global Entity Reference System (GERS). These IDs are not yet stable and are likely to change significantly up to the point that GERS is released.
- The `bbox` column is a `struct` with the following attributes: `minX`, `maxX`, `minY`, `maxY`. This column allows you to craft more efficient spatial queries when running SQL against the cloud.
- The `geometry` column is encoded as WKB.
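Because `geometry` is WKB, most tools need one decode step before running geometric operations. Here is a minimal sketch, assuming locally downloaded data and the `pyarrow` and `shapely` packages (neither of which this repository prescribes; any Parquet reader plus WKB parser works):

```python
# Sketch: read one downloaded Parquet file and decode its WKB geometry.
# "places-00000.parquet" is a hypothetical local file name.
import pyarrow.parquet as pq
from shapely import wkb

table = pq.read_table("places-00000.parquet", columns=["bbox", "geometry"])
row = table.slice(0, 1).to_pylist()[0]

geom = wkb.loads(row["geometry"])  # geometry column holds WKB bytes
print(geom.geom_type, geom.centroid)

# The bbox struct (minX, maxX, minY, maxY) supports coarse spatial
# filtering without decoding any WKB.
print(row["bbox"])
```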
You can access Overture Parquet data files directly from the cloud, copy them to your preferred destination, or download them locally. We encourage you to fetch the data directly from the cloud using one of the SQL query options documented below.
- To use Amazon Athena, you will need an AWS account.
- Ensure that you are operating in the us-west-2 region.
- In the Amazon Athena console on AWS:
  - Run `CREATE EXTERNAL TABLE` queries to set up your view of the tables: click for queries (a minimal sketch follows this list).
  - Load the partitions by running `MSCK REPAIR TABLE <tablename>;` or choosing "Load Partitions" from the table options menu.
Example Athena SQL query to download a CSV of places in Seattle:
```sql
SELECT
    CAST(names AS JSON),
    geometry -- WKB
FROM
    places
WHERE
        bbox.minX > -122.4447744
    AND bbox.maxX < -122.2477071
    AND bbox.minY >   47.5621587
    AND bbox.maxY <   47.7120663
```
More information on using Athena is available in the Amazon Athena User Guide.
- You will need an Azure account.
- Create a Synapse workspace.
Example SQL query to read places in Seattle:
```sql
SELECT TOP 10 *
FROM
    OPENROWSET(
        BULK 'https://overturemapswestus2.blob.core.windows.net/release/2023-10-19-alpha.0/theme=places/type=place/',
        FORMAT = 'PARQUET'
    )
WITH
    (
        names      VARCHAR(MAX),
        categories VARCHAR(MAX),
        websites   VARCHAR(MAX),
        phones     VARCHAR(MAX),
        bbox       VARCHAR(200),
        geometry   VARBINARY(MAX)
    )
    AS [result]
WHERE
        TRY_CONVERT(FLOAT, JSON_VALUE(bbox, '$.minx')) > -122.4447744
    AND TRY_CONVERT(FLOAT, JSON_VALUE(bbox, '$.maxx')) < -122.2477071
    AND TRY_CONVERT(FLOAT, JSON_VALUE(bbox, '$.miny')) > 47.5621587
    AND TRY_CONVERT(FLOAT, JSON_VALUE(bbox, '$.maxy')) < 47.7120663
```
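Because `names` comes back as a JSON string in this query, `JSON_VALUE` can extract individual fields from it too. The sketch below is our own variation, and the `$.common[0].value` path is an assumption about the names layout in this release's schema; adjust it to the schema version you are querying:

```sql
-- Sketch: pull a display name out of the names JSON string.
-- The '$.common[0].value' path is an assumed layout, not confirmed here.
SELECT TOP 10
    JSON_VALUE(names, '$.common[0].value') AS primary_name
FROM
    OPENROWSET(
        BULK 'https://overturemapswestus2.blob.core.windows.net/release/2023-10-19-alpha.0/theme=places/type=place/',
        FORMAT = 'PARQUET'
    )
WITH (names VARCHAR(MAX)) AS [result]
```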
More information is available at Query files using a serverless SQL pool - Training | Microsoft Learn.
DuckDB is an analytics tool you can install locally that can efficiently query remote Parquet files using SQL. It will only download the subset of files it needs to fulfil your queries.
Ensure you are using DuckDB >= 0.9.1 to support the `SRS` parameter.
If, for example, you wanted to download the administrative boundaries for all `adminLevel=2` features, you could run:
```sql
COPY (
    SELECT
        type,
        subType,
        localityType,
        adminLevel,
        isoCountryCodeAlpha2,
        JSON(names) AS names,
        JSON(sources) AS sources,
        ST_GeomFromWKB(geometry) AS geometry
    FROM read_parquet('s3://overturemaps-us-west-2/release/2023-10-19-alpha.0/theme=admins/type=*/*', filename=true, hive_partitioning=1)
    WHERE adminLevel = 2
      AND ST_GeometryType(ST_GeomFromWKB(geometry)) IN ('POLYGON', 'MULTIPOLYGON')
) TO 'countries.geojson'
WITH (FORMAT GDAL, DRIVER 'GeoJSON');
```
This will create a `countries.geojson` file containing 265 country polygons and multipolygons.
To make this query work in DuckDB, you may need a couple of one-time setup items to install the duckdb_spatial and httpfs extensions:

```sql
INSTALL spatial;
INSTALL httpfs;
```
And a couple of per-session items to load the extensions and tell DuckDB which S3 region to find Overture's data bucket in:

```sql
LOAD spatial;
LOAD httpfs;
SET s3_region='us-west-2';
```
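Before running the larger export, a quick sanity check of our own (not from the Overture docs) confirms connectivity and the partition layout:

```sql
-- Sketch: count rows per type in the admins theme, straight from S3.
SELECT type, count(*) AS n
FROM read_parquet('s3://overturemaps-us-west-2/release/2023-10-19-alpha.0/theme=admins/type=*/*', hive_partitioning=1)
GROUP BY type;
```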
To get the same query working against Azure Blob Storage, you need to install and load the Azure extension and set the connection string:
```sql
INSTALL azure;
LOAD azure;
SET azure_storage_connection_string = 'DefaultEndpointsProtocol=https;AccountName=overturemapswestus2;AccountKey=;EndpointSuffix=core.windows.net';
```
Here is an example path to pass to the `read_parquet` function: `azure://release/2023-10-19-alpha.0/theme=admins/type=*/*`
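For example, the countries export above should work against Azure by swapping in the `azure://` path. This is our adaptation, assuming both mirrors share the same partition layout:

```sql
-- Sketch: the same adminLevel=2 export, reading from Azure Blob Storage.
COPY (
    SELECT
        adminLevel,
        isoCountryCodeAlpha2,
        JSON(names) AS names,
        ST_GeomFromWKB(geometry) AS geometry
    FROM read_parquet('azure://release/2023-10-19-alpha.0/theme=admins/type=*/*', hive_partitioning=1)
    WHERE adminLevel = 2
      AND ST_GeometryType(ST_GeomFromWKB(geometry)) IN ('POLYGON', 'MULTIPOLYGON')
) TO 'countries-azure.geojson'
WITH (FORMAT GDAL, DRIVER 'GeoJSON');
```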
You can get a single-node Sedona Docker image from the Apache Software Foundation DockerHub and run `docker run -p 8888:8888 apache/sedona:latest`. A Jupyter Lab instance and example notebooks will be available at http://localhost:8888/. You can also install Sedona on Databricks, AWS EMR, and Snowflake using Wherobots.
The following Python + Spatial SQL code reads the Places dataset and runs a spatial filter query on it.
```python
from sedona.spark import *

# Configure anonymous access to the public Overture S3 bucket.
config = (
    SedonaContext.builder()
    .config("fs.s3a.aws.credentials.provider",
            "org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider")
    .getOrCreate()
)
sedona = SedonaContext.create(config)

# Read the places theme directly from S3.
df = sedona.read.format("parquet").load(
    "s3a://overturemaps-us-west-2/release/2023-10-19-alpha.0/theme=places/type=place")

# Keep only places that fall inside a triangle covering Seattle.
df.filter(
    "ST_Contains("
    "ST_GeomFromWKT('POLYGON((-122.48 47.43,-122.20 47.75,-121.92 47.37,-122.48 47.43))'), "
    "ST_GeomFromWKB(geometry)) = true"
).show()
```
For more examples from Wherobots, check out their Overture-related notebook examples.
You can download the Parquet files from either Azure Blob Storage or Amazon S3 at the locations given in the table at the top of the page.
After installing the AWS CLI, you can download the files from S3 using the command below. Set `<DESTINATION>` to a local directory path to download the files, or to an `s3://` path you control to copy them into your S3 bucket.
```bash
aws s3 cp --region us-west-2 --no-sign-request --recursive s3://overturemaps-us-west-2/release/2023-10-19-alpha.0/ <DESTINATION>
```
The total size of all of the files is a little over 200 GB.
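If you only need a single theme, you can copy just that prefix rather than the whole release. For example, a smaller variant of the command above, using the places path from the table at the top of the page:

```bash
# Sketch: copy only the places theme instead of the full ~200 GB release.
aws s3 cp --region us-west-2 --no-sign-request --recursive s3://overturemaps-us-west-2/release/2023-10-19-alpha.0/theme=places/ <DESTINATION>
```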
You can download the files from Azure Blob Storage using Azure Storage Explorer or the AzCopy command. An example `azcopy` command is given below.
azcopy copy "https://overturemapswestus2.dfs.core.windows.net/release/2023-10-19-alpha.0/" "<<local directory path>>" --recursive```
We are very interested in feedback on the Overture data. Please use the Discussions section of this repo to comment. Tagging your post with the relevant theme name (Places, Transportation, etc.) will help direct your ideas.
Category selection
- Click HERE to submit your feedback
- Select the layer discussion category:
  - Administrative Boundaries
  - Transportation
  - Places
  - Buildings
Discussion outline
- Add a title
- Outline your feedback with as much detail as possible
- Click [Start Discussion]
The associated Task Force will carefully review each submission and offer feedback where required.