Geospatial DataFrames for Rust and Python
Update (September 2023): Preparatory work for managing geospatial data in Apache Arrow memory is ongoing in kylebarron/geoarrow-rs. GeoPolars itself is likely to see development again in early 2024.
GeoPolars extends the Polars DataFrame library for use with geospatial data.
- Uses GeoArrow as the internal memory model.
- Written in Rust
- Bindings to Python (and WebAssembly in the future)
- Multithreading capable
At this point, GeoPolars is a prototype and should not be considered production-ready.
GeoPolars is published to crates.io under the name geopolars.
Documentation is available at docs.rs/geopolars.
An early alpha (v0.1.0-alpha.4) is published to PyPI:
pip install --pre geopolars
The publishing process includes binary wheels for many platforms, so it should be easy to install without compiling the underlying Rust code from source.
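One quick way to confirm that the pre-release was picked up is to query the installed distribution metadata. The sketch below uses only the standard library's importlib.metadata and assumes the distribution name is geopolars, as on PyPI. The helper name installed_version is made up for illustration and is not part of GeoPolars.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Assumption: the PyPI distribution is named "geopolars".
found = installed_version("geopolars")
print(found if found is not None else "geopolars is not installed in this environment")
```

Because the release is an alpha, a plain pip install without --pre would normally skip it, so a missing version here is often a sign that the flag was left out.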
Polars itself does not yet exist in WebAssembly, though there has been discussion about adding bindings for it. The long-term goal of GeoPolars is to have a WebAssembly API as well.
Imitation is the sincerest form of flattery! GeoPandas, along with its underlying libraries shapely and GEOS, is an incredible production-ready tool.
GeoPolars is nowhere near the functionality or stability of GeoPandas, but competition is good and, due to its pure-Rust core, GeoPolars will be much easier to use in WebAssembly.
The biggest pieces of future work are:
- Store geometries in the efficient Arrow-native format, rather than as WKB buffers (as the prototype currently does). This is blocked on Polars, which doesn't currently support Arrow FixedSizeList data types, but they've recently expressed openness to adding minimal FixedSizeList support.
- Enable georust/geo algorithms to access Arrow data with zero copy. The prototype currently copies WKB geometries into geo structs on each geometry operation, which is expensive. This is blocked on adding support to the geo library for geometry access traits, which is a large undertaking. See georust/geo/discussions/838. I've started exploration on this.
- Implement GeoArrow extension types for seamless handling of CRS metadata in Rust, rather than in the Python wrapper.
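To make the first two items more concrete, here is a minimal sketch in plain Python (standard library only, not GeoPolars, GeoArrow, or geo code) that contrasts per-geometry WKB buffers with a single contiguous coordinate buffer. The sample points and the helper names point_to_wkb, wkb_to_point, sum_x_wkb and sum_x_flat are assumptions made up for illustration. The flat array of interleaved doubles only mimics the idea behind a fixed-size-list coordinate layout.

```python
import struct
from array import array

# Hypothetical sample data: three (x, y) points.
points = [(1.0, 2.0), (3.5, -4.25), (10.0, 20.0)]


def point_to_wkb(x, y):
    """Encode a 2D point as little-endian WKB: byte order (1), geometry type (1 = Point), x, y."""
    return struct.pack("<BIdd", 1, 1, x, y)


def wkb_to_point(buf):
    """Decode a little-endian WKB point; each call re-parses the header and copies the values."""
    byte_order, geom_type, x, y = struct.unpack("<BIdd", buf)
    if byte_order != 1 or geom_type != 1:
        raise ValueError("expected a little-endian WKB Point")
    return x, y


# Layout 1: one WKB buffer per geometry, as the prototype is described as storing them.
wkb_column = [point_to_wkb(x, y) for x, y in points]

# Layout 2: one contiguous buffer of interleaved doubles (x0, y0, x1, y1, ...).
flat = array("d", [c for xy in points for c in xy])
coords = memoryview(flat)  # a view over the same memory, no copy is made


def sum_x_wkb(column):
    """Sum x coordinates by decoding every WKB buffer first."""
    return sum(wkb_to_point(buf)[0] for buf in column)


def sum_x_flat(view):
    """Sum x coordinates by indexing directly into the shared buffer."""
    return sum(view[2 * i] for i in range(len(view) // 2))


print(len(wkb_column[0]), "bytes per WKB point;", coords.nbytes, "bytes for the whole flat column")
print(sum_x_wkb(wkb_column), sum_x_flat(coords))
```

Both functions compute the same result, but the WKB path has to unpack a 21-byte record (5 bytes of header plus two doubles) for every geometry, while the flat path reads coordinates in place. That per-operation decoding is the kind of overhead the Arrow-native layout and zero-copy access are meant to avoid.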