BUG: failure if manually specifying engine="pyarrow" in to_parquet
jorisvandenbossche opened this issue · 1 comment
jorisvandenbossche commented
I just noticed that when the argument engine="pyarrow" is passed to to_parquet(), the write still fails with the same error.
import numpy as np
import pandas as pd
import geopandas as gpd
import dask_geopandas as dgpd
# pd.util.testing.makeDataFrame() is deprecated (removed in pandas 2.x);
# build an equivalent random frame directly
dft = pd.DataFrame(np.random.randn(30, 4), columns=list("ABCD"))
dft["geometry"] = gpd.points_from_xy(dft.A, dft.B)
df = gpd.GeoDataFrame(dft)
df = dgpd.from_geopandas(df, npartitions=1)
df.to_parquet("mydf.parquet", engine="pyarrow")
Originally posted by @FlorisCalkoen in #198 (comment)
jorisvandenbossche commented
Ah, that is "expected", because you are then using dask's built-in "pyarrow" engine, and we actually extend that engine to handle the geometry dtype properly.
But of course, we should prevent people from accidentally passing engine="pyarrow"
and thus silently overriding our own engine. It seems we need something more elaborate than the simple partial to do that:
dask-geopandas/dask_geopandas/io/parquet.py
Lines 97 to 98 in 2fd1646
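To illustrate why the simple partial is not enough: keyword arguments passed at call time take precedence over those baked into a functools.partial, so a user-supplied engine= silently replaces the geometry-aware one. Below is a minimal, self-contained sketch of the problem and one possible guard. GeoArrowEngine and _base_to_parquet are stand-ins for dask-geopandas' engine class and dask.dataframe.to_parquet, not the actual implementation.

```python
from functools import partial


class GeoArrowEngine:
    """Hypothetical stand-in for dask-geopandas' geometry-aware parquet engine."""


def _base_to_parquet(df, path, engine="auto", **kwargs):
    # Stand-in for dask.dataframe.to_parquet; returns the engine it would use
    # so the override behaviour is observable.
    return engine


# The simple-partial approach: the default engine is baked in, but a
# caller-supplied engine= keyword overrides it.
to_parquet_partial = partial(_base_to_parquet, engine=GeoArrowEngine)


def to_parquet(df, path, engine=None, **kwargs):
    """A more elaborate wrapper: accept engine=None or "pyarrow" (mapping both
    to the geo-aware engine), and reject anything else explicitly."""
    if engine is not None and engine != "pyarrow" and engine is not GeoArrowEngine:
        raise ValueError(
            f"dask-geopandas requires its geometry-aware engine; got engine={engine!r}"
        )
    return _base_to_parquet(df, path, engine=GeoArrowEngine, **kwargs)
```

With the partial, to_parquet_partial(df, path, engine="pyarrow") ends up using the plain "pyarrow" engine (the reported bug), whereas the wrapper either redirects or raises a clear error instead of silently dropping the geometry handling.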