predict-idlab/tsdownsample

Port to UDF/SQLAlchemy

jayceslesar opened this issue · 5 comments

Definitely a non-trivial task, but there is a growing library, Ibis, which aims to compile pandas-like Python syntax to many different SQL backends. UDFs are supported for many of those backends, so it would be super interesting to somehow port the functionality of this library to SQLAlchemy so that it can be used with any database. There are some existing resources on how this was done, but moving to something like Ibis/SQLAlchemy would be awesome, as right now this is only really available in TimescaleDB.
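To make the idea a bit more concrete, here is a rough sketch of what a min-max style downsampler could look like when written as plain SQL. It uses the standard-library `sqlite3` module only as a stand-in for "any database". The table name `series`, the columns `idx` and `y`, and the helpers `load_series` and `minmax_indices_sql` are all hypothetical names made up for this illustration. This is not how tsdownsample or TimescaleDB implement it, and it says nothing about performance.

```python
import sqlite3

# Hypothetical schema: one row per sample, idx is the 0-based position.
# Rows are split into equally sized buckets by position; for every bucket
# the first index of the minimum and of the maximum y value is kept.
QUERY = """
WITH bucketed AS (
    SELECT idx, y, (idx * :n_buckets) / :n_rows AS bucket
    FROM series
),
extremes AS (
    SELECT bucket, MIN(y) AS y_min, MAX(y) AS y_max
    FROM bucketed
    GROUP BY bucket
)
SELECT MIN(b.idx) FROM bucketed AS b
JOIN extremes AS e ON b.bucket = e.bucket
WHERE b.y = e.y_min
GROUP BY b.bucket
UNION
SELECT MIN(b.idx) FROM bucketed AS b
JOIN extremes AS e ON b.bucket = e.bucket
WHERE b.y = e.y_max
GROUP BY b.bucket
ORDER BY 1
"""


def load_series(values):
    """Create an in-memory database holding the samples (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE series (idx INTEGER PRIMARY KEY, y REAL)")
    conn.executemany("INSERT INTO series VALUES (?, ?)", enumerate(values))
    return conn


def minmax_indices_sql(conn, n_buckets):
    """Return the sorted row indices selected by the bucketed min-max query."""
    n_rows = conn.execute("SELECT COUNT(*) FROM series").fetchone()[0]
    if n_rows == 0:
        return []
    params = {"n_buckets": n_buckets, "n_rows": n_rows}
    return [row[0] for row in conn.execute(QUERY, params)]


conn = load_series([1, 5, 0, 2, 3, 3, 7, 6])
selected = minmax_indices_sql(conn, n_buckets=2)
```

The query returns at most two indices per bucket (fewer when the minimum and maximum fall on the same row), so the number of buckets is roughly half the number of output points.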

jvdd commented

(quote from #29)
Would also be interesting to integrate this into polars using the UDF/pipe methodology, but we might want a separation of concerns there: this library does what it needs to really well, polars does what it needs to really well, and ideally all the magic happens on the Rust side of things anyway.

I am considering updating this library to make it more flexible. The argminmax Rust crate is the beating heart of this library, and I recently updated it to support slices, vec, and comply with Apache Arrow while also adding nan-handling capabilities. I plan to propagate these changes as soon as I find the time. Unfortunately, my spare time is limited for the next 1.5 months due to some paper deadlines (where we - including @jonasvdd - will present some exciting new findings on time series downsampling 😉).
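For readers unfamiliar with the operation: an argminmax finds the index of the minimum and the index of the maximum in a single pass. Below is a minimal pure-Python sketch of that idea. The function name `argminmax_skip_nan` is hypothetical, and skipping NaN values is just one possible NaN policy chosen for this illustration. It is not the crate's API and not a description of how the Rust implementation works.

```python
import math
from typing import Optional, Sequence, Tuple


def argminmax_skip_nan(values: Sequence[float]) -> Optional[Tuple[int, int]]:
    """Return (argmin, argmax) in one pass, skipping NaN values.

    Returns None when there is no non-NaN value. On ties, the first
    occurrence wins. Assumption: NaNs are ignored (one possible policy).
    """
    min_idx = max_idx = -1
    for i, v in enumerate(values):
        if math.isnan(v):
            continue  # NaN compares false with everything, so skip it explicitly
        if min_idx < 0:
            min_idx = max_idx = i  # first usable value seeds both indices
        elif v < values[min_idx]:
            min_idx = i
        elif v > values[max_idx]:
            max_idx = i
    if min_idx < 0:
        return None
    return min_idx, max_idx


result = argminmax_skip_nan([3.0, float("nan"), 1.0, 7.0, 1.0])
```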

I am not acquainted with the UDF/pipe methodology, but I'll certainly look into it! Thank you for bringing this to my attention.

P.S.: the polars author also expressed his interest in integrating the argminmax project into polars: jvdd/argminmax#22

(quote from jvdd)
I am considering updating this library to make it more flexible. The argminmax Rust crate is the beating heart of this library, and I recently updated it to support slices, vec, and comply with Apache Arrow while also adding nan-handling capabilities. I plan to propagate these changes as soon as I find the time. Unfortunately, my spare time is limited for the next 1.5 months due to some paper deadlines (where we - including @jonasvdd - will present some exciting new findings on time series downsampling 😉).

I am in a similar boat where I need to also crank out a few papers for my coursework haha! Need to stop procrastinating but hoping to use this library as an example for one of the papers :D

jvdd commented

When working on predict-idlab/plotly-resampler#154, I realized that instead of pandas, we are now coupling our downsampling approach to numpy... Implementing this on top of something like Ibis would offer great flexibility & allow out-of-core support! I am just not 100% sure whether this can achieve the same runtime performance as the current numpy implementation 🤔
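As a rough illustration of what "out-of-core" could mean here, the sketch below runs a min-max selection over an arbitrary iterator, holding only one bucket in memory at a time. The function name `minmax_stream` and the fixed `bucket_size` parameter are hypothetical choices for this example. It is plain Python rather than Ibis or numpy, so it only shows the data flow.

```python
from itertools import islice
from typing import Iterable, Iterator, Tuple


def minmax_stream(values: Iterable[float], bucket_size: int) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) for the min and max of each consecutive bucket.

    Only one bucket is materialized at a time, so the input can be a
    generator reading from disk or a database cursor.
    """
    if bucket_size < 1:
        raise ValueError("bucket_size must be at least 1")
    it = iter(values)
    offset = 0  # global index of the first element of the current bucket
    while True:
        bucket = list(islice(it, bucket_size))
        if not bucket:
            return
        lo = min(range(len(bucket)), key=bucket.__getitem__)
        hi = max(range(len(bucket)), key=bucket.__getitem__)
        # Emit in index order; a set avoids a duplicate when lo == hi.
        for i in sorted({lo, hi}):
            yield offset + i, bucket[i]
        offset += len(bucket)


points = list(minmax_stream(iter([1, 5, 0, 2, 3, 3, 7, 6]), bucket_size=4))
```

Whether a backend-compiled version of this could match the current numpy implementation is exactly the open question raised above.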

This is definitely on my radar to investigate in the near future :)

jvdd commented

On another note, if you have finished any visualization-related papers that you are able to share, @jonasvdd & I would love to read & learn from them :)

P.S.: our 1st paper is just submitted 🚀
-> preprint: https://arxiv.org/abs/2304.00900