Test code: how to download data for a bounding box
cbeddow opened this issue · 7 comments
Please take a look at the below Python code and ask any questions or make suggestions to improve the code. This is not our real code, but we will want to make a function that appears like:
get_data(bbox,type,filter)
In this case bbox is the geographic bounding area, type is the API endpoint (points, traffic signs, images, sequences), and filter would likely want to use a format with an operator (>=, ==, <, contains, etc) for the properties of the returned data, for example a map feature in the example code matching the value for a street light, but we could filter also by date > than a given input.
We also want to consider a function that would save the data to file, maybe like saveGeoJSON(path)
.
import mercantile, mapbox_vector_tile, requests, json
from vt2geojson.tools import vt_bytes_to_geojson
# define an empty geojson as output
output= { "type": "FeatureCollection", "features": [] }
# vector tile endpoints -- change this in the API request to reference the correct endpoint
tile_points = 'mly_map_feature_point'
tile_traffic_signs = 'mly_map_feature_traffic_sign'
# Mapillary access token -- user should provide their own
access_token = 'MLY|XXX'
# a bounding box in [east_lng,_south_lat,west_lng,north_lat] format
east, south, west, north = [-80.13423442840576,25.77376933762778,-80.1264238357544,25.788608487732198]
# list of values to filter for and keep -- update this if changing to traffic signs
filter_values = ['object--support--utility-pole','object--street-light']
# get the tiles with x and y coordinates which intersect our bounding box, MUST be at zoom level 14 where the data is available
tiles = list(mercantile.tiles(east, south, west, north, 14))
# loop through all tiles to get IDs of Mapillary data
for tile in tiles:
tile_url = 'https://tiles.mapillary.com/maps/vtp/{}/2/{}/{}/{}?access_token={}'.format(tile_points,tile.z,tile.x,tile.y,access_token)
response = requests.get(tile_url)
data = vt_bytes_to_geojson(response.content, tile.x, tile.y, tile.z)
# filter for only features matching the values in filter list above
filtered_data = [feature for feature in data['features'] if feature['properties']['value'] in filter_values]
# check if features are inside bbox, and push to output geojson object if yes
for feature in filtered_data:
if (feature['geometry']['coordinates'][0] > east and feature['geometry']['coordinates'][0] < west)\
and (feature['geometry']['coordinates'][1] > south and feature['geometry']['coordinates'][1] < north):
output['features'].append(feature)
# save a local geojson with the filtered data
with open('mydata.geojson', 'w') as f:
json.dump(output, f)
Hey @cbeddow! I have a couple of questions:
- Could you please elaborate on the
filter
parameter of theget_data
function? And can you please explain the filtering criteria in the following snippet?
# list of values to filter for and keep -- update this if changing to traffic signs
filter_values = ['object--support--utility-pole','object--street-light']
- What does zoom level mean? and why does it have to be 14?
- If we already get the bounded tiles here
# get the tiles with x and y coordinates which intersect our bounding box, MUST be at zoom level 14 where the data is available
tiles = list(mercantile.tiles(east, south, west, north, 14))
Why do we need to check if the features exist within the bbox again here?
# check if features are inside bbox, and push to output geojson object if yes
for feature in filtered_data:
if (feature['geometry']['coordinates'][0] > east and feature['geometry']['coordinates'][0] < west)\
and (feature['geometry']['coordinates'][1] > south and feature['geometry']['coordinates'][1] < north):
output['features'].append(feature)
-
in a
get_data
function, a request will be made to the Mapillary API seeking all data within a specified boundary. If this is sent to the map feature tile endpoint, then the data will include the following: street lamps, utility poles, crosswalks, trash bins, shop signs, fire hydrants. Others will be included also. These each have a technical tag name such asobject--street-light
. I will get a list of these for us. So the use case here is a user withfilter_values = ['object--support--utility-pole','object--street-light']
in a list, could pass this to the function, something likeget_data(filter=filter_values)
and return only utility poles and street lights, rather than returning all classes of data if left unspecified. -
Zoom level refers to the zoom level of a tile service. I recommend watching this video to take a look: https://www.youtube.com/watch?v=savQWL0kq_g
So there are 22 zoom levels on a web map. Zoom level=1 (or maybe 0) is a world map scale, which zoom level 14 is roughly a city-level. Check out this link: https://mapsam.com/map/#14/41.0016/29.0348
In this URL, the format is map/#ZOOM/LONGITUDE/LATITUDE. When you load it, the zoom is at level 14. See the red squares on the map, each of these is a level 14 tile. If you zoom in or out it changes, so test how it looks with a level 5 tile, and a level 18 tile.
When we access the raw data of the tile for this Python package, it will always be at level 14 (or the server may reject the call and say no data exists). So If we want to download all data for Istanbul city, we need to probably download data from multiple tiles, each at level 14 size.
For example, we can download the geojson here, which has a polygon representing Istanbul: https://spelunker.whosonfirst.org/id/85679237/
Or we can use the bounding box of Istanbul: [27.972623,40.801277,29.915492,41.585109]
Either of this should be given to the function, so we can say:
`get_data(location=[27.972623,40.801277,29.915492,41.585109],filter=['object--street-light'])`
Then we would have a returned function that is a geojson point data format.
- In the first function,
mercantile.tiles()
will get all the zoom 14 tiles that intersect the input bounding box. So if our bounding box covers Istanbul, some of the zoom 14 tiles returned may only partially intersect our bounding box. So when we do the check to see if the features exist, we want to eliminate the features that are part of a tile which intersects the box, but where the actual feature lies outside of the box.
For example, in this image if the red is out bounding box, then the tiles with color are the ones which intersect it. We only want to keep point features that fall inside the yellow area, and not the orange area despite those tiles intersecting the bbox, so this second check would filter the partially intersecting tile features.
Acknowledged and read. I watched the lecture you mentioned above and got through that as well. For now, all this makes sense to me, but I'll definitely ask if I have more questions. Thank you!
@cbeddow All clear now. Thank you for the thorough answers!
@OmarMuhammedAli should I close this?
@OmarMuhammedAli should I close this?
Closing.