GIScience/ohsome-py

Put OSM tags into a dict column

SlowMo24 opened this issue · 2 comments

elements/geometry: the response is currently converted from GeoJSON to a GeoDataFrame with the standard GeoDataFrame method, which puts each tag in a separate column. This can produce a GeoDataFrame with a very large number of columns (as with many OSM transformations). Can we provide an option (e.g. a flag) that returns all tags in a single dict column named tags or similar?

Here is some code that achieves this. It is possibly not very performant, especially with regard to memory:

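import pandas as pd

# one dict per row from all non-geometry columns, then drop missing values
data["tags"] = data.drop("geometry", axis=1).to_dict(orient="records")
data["tags"] = data.tags.apply(lambda d: {k: v for k, v in d.items() if not pd.isna(v)})
data = data.drop(data.columns.difference(["geometry", "tags"]), axis=1)  # keep only geometry and tags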

We could also offer the option to extract "primary keys" as separate columns. Let's also have a look at the GDAL configuration, which allows some customization of the OSM transformation.
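
A minimal sketch of how both ideas could be combined, assuming a hypothetical helper `collapse_tags` with hypothetical parameters `keep` and `tags_column` (none of these exist in ohsome-py). Columns listed in `keep` (the geometry plus any "primary keys") stay as regular columns; everything else goes into the dict column. The example uses a plain pandas DataFrame with placeholder strings as geometries so that it runs without geopandas; the same column operations should apply to a GeoDataFrame. It assumes scalar tag values.

```python
import pandas as pd


def collapse_tags(data, keep=("geometry",), tags_column="tags"):
    """Hypothetical helper: collapse all columns not in `keep` into one dict column."""
    # Only keep columns that are actually present in the frame.
    keep = [column for column in keep if column in data.columns]
    tag_columns = data.columns.difference(keep)
    # Build one dict per row and skip missing values,
    # i.e. tags that the feature does not carry.
    tags = [
        {key: value for key, value in zip(tag_columns, row) if not pd.isna(value)}
        for row in data[tag_columns].itertuples(index=False, name=None)
    ]
    result = data[keep].copy()
    result[tags_column] = tags
    return result


# Toy data; the geometry values are placeholder strings, not real geometries.
df = pd.DataFrame(
    {
        "geometry": ["POINT (8.68 49.41)", "POINT (8.69 49.40)"],
        "amenity": ["cafe", None],
        "highway": [None, "bus_stop"],
        "name": ["Cafe A", None],
    }
)

# Keep "amenity" as a regular column, collapse the remaining tags.
out = collapse_tags(df, keep=("geometry", "amenity"))
```

Building the dicts directly from the tag columns avoids the intermediate list of full records (including missing values) that `to_dict(orient="records")` creates, which may help with memory on wide frames, though this has not been measured here.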