OvertureMaps/data

With DuckDB impossible to create geometry column from wkb

Closed this issue · 7 comments

In this example, the geometry column is created using ST_Geomfromwkb().

But this operation causes a crash, so I assume the problem is on the duckdb side, but I'm posting a ticket here anyway, because my problem comes from the query indicated.

With python :

import duckdb 

con = duckdb.connect("test1.db")
con.sql(" INSTALL spatial ; INSTALL httpfs  ; LOAD spatial ; LOAD httpfs ; ")

sql_query = ("SELECT id, "
"geometry "
"FROM read_parquet('s3://overturemaps-us-west-2/release/2024-04-16-beta.0/theme=buildings/type=*/*', "
"filename=true, "
"hive_partitioning=1) "
"LIMIT 5; ")

con.sql(sql_query).show()

No problem, we have a result: the geometry is a binary object. As indicated in the documentation, use ST_GeomFromWKB() to create the geometry column.

Résult :

image

import duckdb 

con = duckdb.connect("test2.db")
con.sql(" INSTALL spatial ; INSTALL httpfs  ; LOAD spatial ; LOAD httpfs ; ")

sql_query = ("SELECT id, "
"ST_GeomFromWKB(geometry) as geom "
"FROM read_parquet('s3://overturemaps-us-west-2/release/2024-04-16-beta.0/theme=buildings/type=*/*', "
"filename=true, "
"hive_partitioning=1) "
"LIMIT 5; ")

con.sql(sql_query).show()

Résult :

Segmentation fault (core dumped)

image


I have also open a ticket in duckdb_spatial : duckdb/duckdb_spatial#305

Hi @florentfgrs
What version of duckdb are you using?

This just worked for me on v0.10.2 on the cli

D select id, st_geomfromwkb(geometry) from read_parquet('s3://overturemaps-us-west-2/release/2024-04-16-beta.0/theme=buildings/type=*/*', filename=true, hive_partitioning=true) limit 5;
100% ▕████████████████████████████████████████████████████████████▏
┌──────────────────────┬─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│          id          │                                                                              st_geomfromwkb(geometry)                                                                               │
│       varchar        │                                                                                      geometry                                                                                       │
├──────────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ 08bf2a40219b0fff02…  │ POLYGON ((-167.3999539 -83.6500135, -167.3999133 -83.6500116, -167.3998804 -83.6500083, -167.3998582 -83.6500038, -167.3998506 -83.6499988, -167.3998584 -83.6499938, -167.399880…  │
│ 08bf35ad6a05afff02…  │ POLYGON ((-136.8028948 -74.7669439, -136.8030749 -74.7670273, -136.8020661 -74.7671777, -136.801886 -74.7670942, -136.8028948 -74.7669439))                                         │
│ 08bf35ad6a058fff02…  │ POLYGON ((-136.8033409 -74.7667304, -136.802516 -74.7668605, -136.8023483 -74.7667871, -136.8031732 -74.766657, -136.8033409 -74.7667304))                                          │
│ 08bf35ad6a04efff02…  │ POLYGON ((-136.8030898 -74.7660076, -136.8025614 -74.7659821, -136.8026186 -74.7659004, -136.8031471 -74.765926, -136.8030898 -74.7660076))                                         │
│ 08bf35ad6a04afff02…  │ POLYGON ((-136.8020881 -74.7661436, -136.8014847 -74.7660121, -136.8017343 -74.765933, -136.8023377 -74.7660645, -136.8020881 -74.7661436))                                         │
└──────────────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
D

Hi @jwass

I use 0.10.2

It also works in the cli for me, the
problem with python.

And even when I do a create table with python, then a select in the cli. I have a core dumped on the cli.

Interesting... it works for me in Python

>>> import sys
>>> sys.version
'3.9.6 (default, Feb  3 2024, 15:58:28) \n[Clang 15.0.0 (clang-1500.3.9.4)]'
>>> import duckdb
>>> con = duckdb.connect("test2.db")
>>> con.sql(" INSTALL spatial ; INSTALL httpfs  ; LOAD spatial ; LOAD httpfs ; ")
>>>
>>> sql_query = ("SELECT id, "
... "ST_GeomFromWKB(geometry) as geom "
... "FROM read_parquet('s3://overturemaps-us-west-2/release/2024-04-16-beta.0/theme=buildings/type=*/*', "
... "filename=true, "
... "hive_partitioning=1) "
... "LIMIT 5; ")
>>>
>>> con.sql(sql_query).show()
100% ▕████████████████████████████████████████████████████████████▏
┌──────────────────────┬──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│          id          │                                                               geom                                                               │
│       varchar        │                                                             geometry                                                             │
├──────────────────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ 08bf2a40219b0fff02…  │ POLYGON ((-167.3999539 -83.6500135, -167.3999133 -83.6500116, -167.3998804 -83.6500083, -167.3998582 -83.6500038, -167.3998506…  │
│ 08bf35ad6a05afff02…  │ POLYGON ((-136.8028948 -74.7669439, -136.8030749 -74.7670273, -136.8020661 -74.7671777, -136.801886 -74.7670942, -136.8028948 …  │
│ 08bf35ad6a058fff02…  │ POLYGON ((-136.8033409 -74.7667304, -136.802516 -74.7668605, -136.8023483 -74.7667871, -136.8031732 -74.766657, -136.8033409 -…  │
│ 08bf35ad6a04efff02…  │ POLYGON ((-136.8030898 -74.7660076, -136.8025614 -74.7659821, -136.8026186 -74.7659004, -136.8031471 -74.765926, -136.8030898 …  │
│ 08bf35ad6a04afff02…  │ POLYGON ((-136.8020881 -74.7661436, -136.8014847 -74.7660121, -136.8017343 -74.765933, -136.8023377 -74.7660645, -136.8020881 …  │
└──────────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

very strange..

Python 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.9.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import sys

In [2]: sys.version
Out[2]: '3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]'

In [3]: import duckdb

In [4]: con = duckdb.connect("test2.db")

In [5]: con.sql(" INSTALL spatial ; INSTALL httpfs  ; LOAD spatial ; LOAD httpfs ; ")

In [6]: sql_query = ("SELECT id, "
   ...: "ST_GeomFromWKB(geometry) as geom "
   ...: "FROM read_parquet('s3://overturemaps-us-west-2/release/2024-04-16-beta.0/theme=buildings/type=*/*', "
   ...: "filename=true, "
   ...: "hive_partitioning=1) "
   ...: "LIMIT 5; ")

In [7]: con.sql(sql_query).show()
100% ▕████████████████████████████████████████████████████████████▏ 
Segmentation fault (core dumped)

@jwass I tested with another computer and I don't have the problem. The problem comes from my environment, I think we can close...

Thanks for your help.

Hello @jwass, has the correction been made?

Hi @florentfgrs - From your last comment, it seemed like the issue had been fixed? (#144 (comment))