schema mismatch in glob: column `"class"`
Closed this issue · 3 comments
Youssef-Harby commented
This issue appears when I try to download buildings for a certain area in Cairo, Egypt, with a bbox filter using DuckDB. It's not a big deal for me because most or all of that data is null, but I wanted to mention it for upcoming releases.
root@overturemaps:~# ./duckdb
v0.9.2 3c695d7ba9
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D SET memory_limit = '32GB';
D SET threads TO 16;
D SET enable_progress_bar = true;
D SET enable_progress_bar_print = true;
D INSTALL httpfs;
D INSTALL spatial;
D LOAD httpfs;
D LOAD spatial;
D
D COPY (
> SELECT
> type,
> version,
> CAST(updatetime as varchar) as updateTime,
> height,
> numfloors as numFloors,
> level,
> class,
> JSON(names) as names,
> JSON(sources) as sources,
> ST_GeomFromWKB(geometry) as geometry
> FROM read_parquet('s3://overturemaps-us-west-2/release/2023-12-14-alpha.0/theme=buildings/type=*/*', hive_partitioning=1)
> WHERE
> bbox.minx > 31.26500
> AND bbox.maxx < 31.29643
> AND bbox.miny > 30.07066
> AND bbox.maxy < 30.10207
> ) TO 'egypt_cairo_hadaiq_el_qubbah_yharby_buildings.gpkg'
> WITH (FORMAT GDAL, DRIVER 'GPKG', SRS 'EPSG:4326');
100% ▕████████████████████████████████████████████████████████████▏
Error: IO Error: Failed to read file "s3://overturemaps-us-west-2/release/2023-12-14-alpha.0/theme=buildings/type=part/part-00000-a0ead583-abfd-4f33-969d-124a48bc3031-c000.zstd.parquet": schema mismatch in glob: column "class" was read from the original file "s3://overturemaps-us-west-2/release/2023-12-14-alpha.0/theme=buildings/type=building/part-00000-431912fa-aa4a-434d-9706-e2c921dffc76-c000.zstd.parquet", but could not be found in file "s3://overturemaps-us-west-2/release/2023-12-14-alpha.0/theme=buildings/type=part/part-00000-a0ead583-abfd-4f33-969d-124a48bc3031-c000.zstd.parquet".
Candidate names: id, geometry, bbox, names, version, updateTime, sources, height, numFloors, minHeight, facadeColor, facadeMaterial, roofMaterial, roofShape, roofDirection, roofOrientation, roofColor, eaveHeight, level, buildingId
If you are trying to read files with different schemas, try setting union_by_name=True
jwass commented
There is now a part type partition under buildings (type=part), which is what's causing the different schemas. So you probably want:
...
FROM read_parquet('s3://overturemaps-us-west-2/release/2023-12-14-alpha.0/theme=buildings/type=building/*
Or, if you want both the parts and the footprints (type=building), try setting union_by_name=True in the read_parquet() call, as the error message suggests.
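For reference, the second option might look roughly like the query below. This is a minimal sketch, not tested against this release: it reuses the glob and bbox filter from the original query, trims the column list for brevity, and assumes httpfs and spatial are already loaded as in the session above.

```sql
-- Sketch: read both type=building and type=part files in one glob.
-- union_by_name=true matches columns by name across files; a column
-- missing from a file (such as class in type=part) is filled with NULL.
SELECT
    type,
    height,
    numfloors AS numFloors,
    level,
    class,
    ST_GeomFromWKB(geometry) AS geometry
FROM read_parquet(
    's3://overturemaps-us-west-2/release/2023-12-14-alpha.0/theme=buildings/type=*/*',
    hive_partitioning=1,
    union_by_name=true
)
WHERE
    bbox.minx > 31.26500
    AND bbox.maxx < 31.29643
    AND bbox.miny > 30.07066
    AND bbox.maxy < 30.10207;
```

With this option, rows coming from the type=part files should simply have a NULL class. If only footprints are needed, narrowing the glob to type=building avoids the schema difference altogether.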
jwass commented
@Youssef-Harby Were you able to get past the error you were seeing?
Youssef-Harby commented
Yes, please close the issue. Thank you, @jwass.