Can `GeoDataFrame.crs` be set to `None`?
amano-takahisa opened this issue · 5 comments
I would like to use `dask-geopandas.GeoDataFrame` for non-geospatial data as well.
Therefore, I tried to drop the CRS by assigning `None` to `GeoDataFrame.crs` as follows, which worked in `geopandas`.
import dask_geopandas as dask_gpd
import geopandas as gpd
from shapely import Point
d = {
    'col1': ['name1', 'name2'],
    'geometry': [Point(1, 2), Point(2, 1)],
}
gdf = gpd.GeoDataFrame(d, crs='EPSG:4326')
dask_gdf = dask_gpd.from_geopandas(gdf)
dask_gdf.crs = None
The above raised the following.
Traceback (most recent call last):
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask/dataframe/utils.py", line 195, in raise_on_meta_error
yield
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask_expr/_expr.py", line 3987, in _emulate
return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask_geopandas/expr.py", line 104, in _set_crs
return df.set_crs(crs, allow_override=allow_override)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/geopandas/geodataframe.py", line 1325, in set_crs
df.geometry = df.geometry.set_crs(
^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/geopandas/geoseries.py", line 1080, in set_crs
raise ValueError("Must pass either crs or epsg.")
ValueError: Must pass either crs or epsg.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask_expr/_collection.py", line 3029, in __setattr__
object.__setattr__(self, key, value)
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask_geopandas/expr.py", line 267, in crs
new = self.set_crs(value, allow_override=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask_geopandas/expr.py", line 273, in set_crs
new = self.map_partitions(
^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask_expr/_collection.py", line 1090, in map_partitions
return map_partitions(
^^^^^^^^^^^^^^^
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask_expr/_collection.py", line 6106, in map_partitions
return new_collection(new_expr)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask_expr/_collection.py", line 4764, in new_collection
meta = expr._meta
^^^^^^^^^^
File "/usr/lib/python3.12/functools.py", line 995, in __get__
val = self.func(instance)
^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask_expr/_expr.py", line 630, in _meta
return _get_meta_map_partitions(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask_expr/_expr.py", line 4001, in _get_meta_map_partitions
meta = _emulate(func, *a, udf=True, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask_expr/_expr.py", line 3986, in _emulate
with raise_on_meta_error(funcname(func), udf=udf):
File "/usr/lib/python3.12/contextlib.py", line 158, in __exit__
self.gen.throw(value)
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask/dataframe/utils.py", line 216, in raise_on_meta_error
raise ValueError(msg) from e
ValueError: Metadata inference failed in `_set_crs`.
You have supplied a custom function and Dask is unable to
determine the type of output that that function returns.
To resolve this please provide a meta= keyword.
The docstring of the Dask function you ran should have more information.
Original error is below:
------------------------
ValueError('Must pass either crs or epsg.')
Traceback:
---------
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask/dataframe/utils.py", line 195, in raise_on_meta_error
yield
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask_expr/_expr.py", line 3987, in _emulate
return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/dask_geopandas/expr.py", line 104, in _set_crs
return df.set_crs(crs, allow_override=allow_override)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/geopandas/geodataframe.py", line 1325, in set_crs
df.geometry = df.geometry.set_crs(
^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/work/.venv/lib/python3.12/site-packages/geopandas/geoseries.py", line 1080, in set_crs
raise ValueError("Must pass either crs or epsg.")
Is there a way to remove a CRS that has already been set?
My environment was as follows.
$ pip list
Package Version
---------------- -----------------------
attrs 23.2.0
certifi 2024.6.2
click 8.1.7
click-plugins 1.1.1
cligj 0.7.2
cloudpickle 3.0.0
dask 2024.5.2
dask-expr 1.1.2
dask-geopandas 0+untagged.162.gaa1b52f
distributed 2024.5.2
fiona 1.9.6
fsspec 2024.6.0
geopandas 0.14.4
Jinja2 3.1.4
locket 1.0.0
MarkupSafe 2.1.5
msgpack 1.0.8
numpy 1.26.4
packaging 24.1
pandas 2.2.2
partd 1.4.2
pip 24.0
psutil 5.9.8
pyarrow 16.1.0
pyproj 3.6.1
python-dateutil 2.9.0.post0
pytz 2024.1
PyYAML 6.0.1
shapely 2.0.4
six 1.16.0
sortedcontainers 2.4.0
tblib 3.0.0
toolz 0.12.1
tornado 6.4.1
tzdata 2024.1
urllib3 2.2.1
zict 3.0.0
$ python -V
Python 3.12.3
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 24.04 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo
Bump, I am having the same problem. The issue I have does not need to be resolved using a CRS, as it is a purely planar calculation within an image where the projection doesn't matter.
I believe that the snippet above no longer raises an error with the latest versions of dask-geopandas and geopandas, though I would generally suggest using `set_crs` instead.
gdf = gpd.GeoDataFrame(d, crs='EPSG:4326')
dask_gdf = dask_gpd.from_geopandas(gdf).set_crs(None, allow_override=True)
@toihr are you using the latest geopandas and dask-geopandas?
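A quick way to check which versions are actually installed in the active environment, sketched with the standard library only (the helper name `installed_version` is just illustrative):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Distribution names as they appear in `pip list`.
for name in ("geopandas", "dask-geopandas"):
    print(name, installed_version(name))
```

This reports the versions seen by the interpreter that runs it, which can differ from what another environment's `pip list` shows.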
I am on dask-geopandas 0.4.1, and for some reason I am on geopandas 0.14.3, which seems odd. Thanks for pointing that out.
Yeah, I believe that you will need the changes we made to `set_crs` in geopandas 1.0 to make this work.
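For plain geopandas, the difference between versions can be sketched roughly as below. `drop_crs` is a hypothetical helper, not part of either library. It assumes that older geopandas raises the `ValueError` shown in the traceback above when `crs=None` is passed to `set_crs`, and it falls back to attribute assignment, which the original report says worked in geopandas 0.14.4.

```python
import geopandas as gpd
from shapely import Point


def drop_crs(gdf):
    """Return a copy of `gdf` without a CRS (hypothetical helper)."""
    try:
        # Expected to work with geopandas >= 1.0.
        return gdf.set_crs(None, allow_override=True)
    except ValueError:
        # Older geopandas rejects crs=None ("Must pass either crs or epsg."),
        # so assign the attribute on a copy instead.
        out = gdf.copy()
        out.crs = None
        return out


gdf = gpd.GeoDataFrame(
    {"col1": ["name1", "name2"], "geometry": [Point(1, 2), Point(2, 1)]},
    crs="EPSG:4326",
)
planar = drop_crs(gdf)
print(planar.crs)
```

The fallback only covers an in-memory `geopandas.GeoDataFrame`. For a dask-geopandas collection, upgrading geopandas and calling `set_crs(None, allow_override=True)` as suggested above is likely the simpler route.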
Yeah, I think that might help. Thank you very much; I think you can close this issue then.