Polars Expression plugins for R
eitsupi opened this issue · 5 comments
We need:
- A mechanism for registering subnamespaces from outside the package, similar to https://docs.pola.rs/py-polars/html/reference/api.html
- A Rust crate similar to https://github.com/pola-rs/pyo3-polars
Note: Serialization and deserialization of R objects, which may be needed, are already defined here (I don't know whether this is sufficient):
r-polars/src/rust/src/rbackground.rs
Lines 77 to 131 in 1ea820b
- Mechanism for registering subnamespaces from outside the package something like docs.pola.rs/py-polars/html/reference/api.html
I was able to make this work in an implementation that I am rewriting from scratch using py-polars as a reference.
https://github.com/eitsupi/neo-r-polars/blob/afac2ae8020e4dbe3d02f7515653a574283b577a/man/polars_api_register_series_namespace.Rd#L20-L44
# s: polars series
math_shortcuts <- function(s) {
  # Create a new environment to store the methods
  self <- new.env(parent = emptyenv())

  # Store the series
  self$`_s` <- s

  # Add methods
  self$square <- function() self$`_s` * self$`_s`
  self$cube <- function() self$`_s` * self$`_s` * self$`_s`

  # Set the class
  class(self) <- "polars_namespace_series"

  # Return the environment
  self
}

polars_api_register_series_namespace("math", math_shortcuts)

s <- as_polars_series(c(1.5, 31, 42, 64.5))
s$math$square()$rename("s^2")

s <- as_polars_series(1:5)
s$math$cube()$rename("s^3")
The current concern is performance degradation from frequent for loops, which run on essentially every call to a single method.
I believe the current implementation of r-polars registers all active bindings and methods when the package is installed, whereas this rewrite registers methods each time an R class instance is built, which would degrade performance. (Of course, if that is acceptable, there is no problem.)
https://github.com/eitsupi/neo-r-polars/blob/afac2ae8020e4dbe3d02f7515653a574283b577a/R/series-series.R#L7-L31
I have looked into how expression plugins work, and it appears that Polars connects to the plugin's dynamic library via the libloading crate.
https://docs.rs/libloading/latest/libloading/
https://github.com/pola-rs/polars/blob/5cad69e5d4af47e75ae0abbf88dc2bafbc8f66d2/crates/polars-plan/src/dsl/function_expr/plugin.rs#L5
In the case of R packages, rustc builds the static library, not the dynamic library; the dynamic library is built by R.
We need to find a way to expose the expected C ABI on the plugin side, but this is beyond my knowledge.
The recent libr crate might be of use here: https://github.com/posit-dev/ark/tree/main/crates#readme
My understanding is that the dynamic library is built by R, so it doesn't matter which Rust crate is used to build the static library.
The open question is how to expose a proper C ABI from the dynamic library that R creates; I don't know how to do that.
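As a starting point for the C ABI question, here is a minimal Rust sketch of what an exported, unmangled C ABI symbol looks like on the plugin side. Everything in it is an assumption for illustration: the name `hypothetical_plugin_square`, its signature, and the `SquareFn` type are made up and are not the ABI that Polars expects from expression plugins. It also does not address whether the symbol stays visible after R links the static library into the shared object, which probably depends on the linker flags R uses.

```rust
use std::slice;

/// Hypothetical plugin entry point (NOT the real Polars plugin ABI).
/// Squares `len` doubles from `input` into `out`.
/// Returns 0 on success and 1 if either pointer is null.
///
/// # Safety
/// `input` and `out` must each point to at least `len` valid,
/// non-overlapping `f64` values.
// On toolchains older than Rust 1.82, write `#[no_mangle]` instead.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn hypothetical_plugin_square(
    input: *const f64,
    len: usize,
    out: *mut f64,
) -> i32 {
    if input.is_null() || out.is_null() {
        return 1;
    }
    // View the raw buffers as slices for the duration of the call.
    let src = unsafe { slice::from_raw_parts(input, len) };
    let dst = unsafe { slice::from_raw_parts_mut(out, len) };
    for (d, s) in dst.iter_mut().zip(src) {
        *d = s * s;
    }
    0
}

/// Function-pointer type a host would hold after resolving the symbol by name
/// (for example with libloading, as mentioned above). Hypothetical.
pub type SquareFn = unsafe extern "C" fn(*const f64, usize, *mut f64) -> i32;

/// Safe wrapper that calls through the C ABI pointer, the way a host would.
pub fn call_square(f: SquareFn, values: &[f64]) -> Vec<f64> {
    let mut out = vec![0.0; values.len()];
    // Safety: both buffers are valid for `values.len()` elements.
    let status = unsafe { f(values.as_ptr(), values.len(), out.as_mut_ptr()) };
    assert_eq!(status, 0, "plugin reported an error");
    out
}
```

Calling `call_square(hypothetical_plugin_square, &[1.5, 31.0])` goes through the same `extern "C"` calling convention a dynamically loaded symbol would use, so the open part is only how to make R's link step keep and export such a symbol.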