rnckp/starter-code_opendataswiss

Baumkataster dataset is not properly downloaded

schorschie opened this issue · 5 comments

Out of curiosity, I wanted to explore the Baumkataster dataset (5b1d787e-b8cc-4a94-bdd9-8d834fb9c087.ipynb) and noticed that it cannot be downloaded directly from https://www.stadt-zuerich.ch/geodaten/download/Baumkataster?format=10008, since the link does not point to a CSV file.

It is an HTML page where some additional buttons have to be clicked before the CSV file is generated.

Is there maybe a different API at stadt-zuerich.ch?

I found that there is a GET API which returns a JSON file: https://www.ogd.stadt-zuerich.ch/wfs/geoportal/Baumkataster?service=WFS&version=1.1.0&request=GetFeature&outputFormat=GeoJSON&typename=baumkataster_baumstandorte
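For illustration, here is a minimal sketch of how that WFS link could be built and read with the Python standard library. The helper names `build_wfs_url` and `fetch_features` are made up for this example, and the code assumes the endpoint answers with a standard GeoJSON FeatureCollection (a top-level `features` list), which I have not verified beyond the URL above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the link above.
WFS_BASE = "https://www.ogd.stadt-zuerich.ch/wfs/geoportal/Baumkataster"


def build_wfs_url(typename: str, base: str = WFS_BASE) -> str:
    """Assemble a WFS GetFeature URL with the same parameters as the link above."""
    params = {
        "service": "WFS",
        "version": "1.1.0",
        "request": "GetFeature",
        "outputFormat": "GeoJSON",
        "typename": typename,
    }
    return f"{base}?{urlencode(params)}"


def fetch_features(url: str, timeout: float = 60.0) -> list:
    """Download the response and return its features.

    Assumption: the body is a GeoJSON FeatureCollection. If the key is
    missing, an empty list is returned instead of raising.
    """
    with urlopen(url, timeout=timeout) as response:
        collection = json.load(response)
    return collection.get("features", [])


if __name__ == "__main__":
    url = build_wfs_url("baumkataster_baumstandorte")
    features = fetch_features(url)
    print(f"Fetched {len(features)} features from {url}")
```

The download can be large, so a generous timeout is used here. Something like `pandas.json_normalize` could flatten the features into a table afterwards.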

Wouldn't that be a better source for some of the data?

rnckp commented

Hi! Thanks for pointing this out! 👍

The script that generates the starter code notebooks reads the metadata JSON from opendata.swiss and parses the links to the resource(s) of each data set. This is the «ground truth» from the publisher that I adhere to.

Like you, I also found a couple of data set resources that unfortunately do not allow a direct download of the CSV files but instead lead to a website, as in your example.

At the moment I do not see any automated way to handle this. Just from the link or the metadata itself, I cannot reliably tell which links lead to websites and which lead directly to CSV files. I also want to avoid a list of exceptions and manual intervention because this would be too time-consuming.
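One heuristic that might help, sketched here under assumptions: since the link alone does not tell the two cases apart, the response itself could be probed. The sketch below reads the `Content-Type` header and the first bytes of the body and flags anything that looks like an HTML page. The names `is_direct_download` and `probe` are hypothetical and not part of the generator script. Servers can mislabel content, so this would reduce the problem rather than solve it reliably.

```python
from urllib.request import Request, urlopen

HTML_MEDIA_TYPES = {"text/html", "application/xhtml+xml"}


def is_direct_download(content_type: str, head: bytes) -> bool:
    """Return False if the response looks like an HTML page, True otherwise.

    content_type: raw Content-Type header value (may be empty).
    head: the first bytes of the response body.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in HTML_MEDIA_TYPES:
        return False
    # Fall back to sniffing the body in case the header is missing or wrong.
    # Strip a UTF-8 byte order mark and leading whitespace first.
    snippet = head.lstrip(b"\xef\xbb\xbf \t\r\n").lower()
    if snippet.startswith((b"<!doctype html", b"<html")):
        return False
    return True


def probe(url: str, timeout: float = 30.0) -> bool:
    """Fetch only the start of a resource and classify it (needs network)."""
    request = Request(url, headers={"User-Agent": "starter-code-probe"})
    with urlopen(request, timeout=timeout) as response:
        content_type = response.headers.get("Content-Type", "")
        head = response.read(512)
    return is_direct_download(content_type, head)
```

A notebook generator could call `probe` once per resource link and skip or flag the ones that return `False`, at the cost of one extra request per resource.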

If you see any way to solve this (without manual intervention) let me know. I'd be very interested.

Again – thanks for taking the time to create the issue.

What do you think about the second link? Do you see any chance of generating that automatically?

rnckp commented

Unfortunately, this link is not included in the metadata of the resource. Have a look here.

I can only use what I get from this endpoint. From this data I retrieve the links to the individual metadata of each data set.