Geospatial extensions for Polars

Primary language: Rust. License: MIT.

GeoPolars

Geospatial DataFrames for Rust and Python

Update (September 2023): Preparatory work for managing geospatial data in Apache Arrow memory is ongoing in kylebarron/geoarrow-rs. GeoPolars itself is likely to see development again in early 2024.

Summary

GeoPolars extends the Polars DataFrame library for use with geospatial data.

  • Uses GeoArrow as the internal memory model
  • Written in Rust
  • Bindings to Python (and WebAssembly in the future)
  • Multithreading capable

At this point, GeoPolars is a prototype and should not be considered production-ready.

Use from...

Rust

GeoPolars is published to crates.io under the name geopolars.

Documentation is available at docs.rs/geopolars.

Python

An early alpha (v0.1.0-alpha.4) is published to PyPI:

pip install --pre geopolars

The publishing process includes binary wheels for many platforms, so installation should be easy and should not require compiling the underlying Rust code from source.

WebAssembly

Polars itself does not yet exist in WebAssembly, though there has been discussion about adding bindings for it. The long-term goal of GeoPolars is to have a WebAssembly API as well.

Comparison with GeoPandas

Imitation is the sincerest form of flattery! GeoPandas — and its underlying libraries of shapely and GEOS — is an incredible production-ready tool.

GeoPolars is nowhere near the functionality or stability of GeoPandas, but competition is good and, due to its pure-Rust core, GeoPolars will be much easier to use in WebAssembly.

Future work

The biggest pieces of future work are:

  • Store geometries in the efficient Arrow-native format, rather than as WKB buffers (as the prototype currently does). This is blocked on Polars, which doesn't currently support Arrow FixedSizeList data types, though the Polars maintainers have recently expressed openness to adding minimal FixedSizeList support.

  • Enable georust/geo algorithms to access Arrow data with zero copy. The prototype currently copies WKB geometries into geo structs on each geometry operation, which is expensive.

    This is blocked on adding support to the geo library for geometry access traits, which is a large undertaking. See georust/geo/discussions/838. I've started exploration on this.

  • Implement GeoArrow extension types for seamless handling of CRS metadata in Rust, rather than in the Python wrapper.
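
To make the first two items above more concrete, here is a minimal sketch of the difference between a WKB-per-geometry column and a flat, interleaved coordinate buffer (the kind of layout a FixedSizeList of two float64 values would give for points). It uses only the Python standard library and does not call any GeoPolars, GeoArrow, or Polars API; every function name in it is hypothetical and chosen for illustration only.

```python
import struct
from array import array

WKB_POINT = 1  # geometry type code for Point in the WKB encoding


def encode_wkb_point(x, y):
    """Encode one 2D point as little-endian WKB (21 bytes)."""
    # 1-byte order flag (1 = little-endian), uint32 geometry type, two float64s
    return struct.pack("<BIdd", 1, WKB_POINT, x, y)


def decode_wkb_point(buf):
    """Parse one WKB point back into an (x, y) tuple."""
    endian = "<" if buf[0] == 1 else ">"
    geom_type, x, y = struct.unpack_from(endian + "Idd", buf, 1)
    if geom_type != WKB_POINT:
        raise ValueError(f"expected a WKB Point, got geometry type {geom_type}")
    return x, y


def bounds_from_wkb(wkb_points):
    """Bounding box over WKB buffers: every call re-parses every geometry."""
    decoded = [decode_wkb_point(buf) for buf in wkb_points]
    xs = [x for x, _ in decoded]
    ys = [y for _, y in decoded]
    return min(xs), min(ys), max(xs), max(ys)


def bounds_from_interleaved(coords):
    """Bounding box over a flat [x0, y0, x1, y1, ...] float64 buffer.

    No per-geometry parsing is needed. Note that these Python slices still
    copy; an Arrow-based implementation could read the buffer in place.
    """
    xs, ys = coords[0::2], coords[1::2]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical sample data, stored both ways
points = [(0.0, 1.0), (2.5, -3.0), (-1.0, 4.0)]
wkb_column = [encode_wkb_point(x, y) for x, y in points]
flat_column = array("d", [c for point in points for c in point])

print(bounds_from_wkb(wkb_column))
print(bounds_from_interleaved(flat_column))
```

Both functions compute the same bounding box, but the WKB version has to decode each geometry before it can do any work, which is the per-operation cost described above. The flat buffer can be scanned directly as numbers.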