Request: Implement `_dump_data` so that `Rails.cache` can wrap a DataFrame
DeflateAwning opened this issue · 3 comments
DeflateAwning commented
Request: Implement `_dump_data` so that `Rails.cache` can wrap a DataFrame
Currently, the following error is raised:
TypeError (no _dump_data is defined for class Polars::RbDataFrame):
Test case (rough, may require some rework):
    def call
      cache_key = "expensive_operation_1"
      Rails.cache.fetch(cache_key, expires_in: 6.hour) do
        call_no_cache_expensive_operation_1 # Returns a dataframe
      end
    end
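For context, a minimal sketch of where this class of error comes from, assuming the cache store serializes entries with `Marshal` (which, as far as I know, most Rails cache stores do by default). `Marshal` raises the same `no _dump_data` message for an object backed by native data that defines no dump hook. A `Proc` stands in here for the native `Polars::RbDataFrame`, and `marshal_error_for` is a hypothetical helper name, not part of any library.

```ruby
# Returns the TypeError message Marshal raises for obj, or nil if the dump succeeds.
# (Hypothetical helper, for illustration only.)
def marshal_error_for(obj)
  Marshal.dump(obj)
  nil
rescue TypeError => e
  e.message
end

# A Proc wraps native data and has no dump hook, much like the wrapped native frame.
puts marshal_error_for(proc { 1 })

# Plain Ruby values dump fine, so no error message is returned.
puts marshal_error_for([1, 2, 3]).inspect
```

If the message printed for the `Proc` matches the shape of the one above, the failure is happening inside `Marshal.dump` rather than in the cache store itself.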
DeflateAwning commented
Here's my current workaround, which isn't ideal:
    def call
      # Cache the result as a base64-encoded Arrow IPC payload.
      df_as_ipc_base64 = Rails.cache.fetch("cache_key", expires_in: 6.hour) do
        df = call_no_cache
        # Serialize the DataFrame to IPC bytes, then base64. Sorta a long-winded way for now, but it works.
        # TODO: Try this method: https://github.com/ankane/ruby-polars/issues/79
        ipc_buffer = StringIO.new
        df.write_ipc(ipc_buffer)
        ipc_buffer.rewind
        # Read, and encode in base64.
        Base64.encode64(ipc_buffer.read)
      end
      # Decode the cached string and rebuild the DataFrame.
      Polars.read_ipc(StringIO.new(Base64.decode64(df_as_ipc_base64)))
    end
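The encode/decode half of that workaround can be pulled into two small helpers. This is only a sketch: `pack_bytes` and `unpack_bytes` are hypothetical names, and the byte string below is a made-up stand-in for whatever `write_ipc` produces. `Base64.strict_encode64` is used instead of `encode64` so the cached string contains no embedded newlines.

```ruby
require "base64"

# Encode arbitrary binary data as a newline-free base64 string that is safe to cache.
def pack_bytes(bytes)
  Base64.strict_encode64(bytes)
end

# Reverse of pack_bytes; returns a binary (ASCII-8BIT) string.
def unpack_bytes(encoded)
  Base64.strict_decode64(encoded)
end

# Stand-in binary payload (assumption: real IPC output is also an opaque binary string).
payload = "\x00\xFFARROW1\x00".b
encoded = pack_bytes(payload)
puts encoded
puts unpack_bytes(encoded) == payload
```

In the workaround above, `pack_bytes` would wrap the bytes read from the buffer, and `unpack_bytes` would feed the `StringIO` passed to `Polars.read_ipc`.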
ankane commented
Hi @DeflateAwning, I'm not sure I'd like to support marshal serialization (as it's not secure for untrusted data), but you could add it to your own application with:
    class Polars::DataFrame
      def _dump(level)
        write_ipc(nil)
      end

      def self._load(bin)
        Polars.read_ipc(StringIO.new(bin))
      end
    end
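To illustrate how these two hooks are used by `Marshal`, here is a self-contained sketch with a stand-in class. `FakeFrame` and its comma-separated payload are hypothetical and have nothing to do with ruby-polars; the point is only the `_dump`/`_load` contract: `_dump` returns a String, and the class-level `_load` rebuilds the object from that String.

```ruby
# Hypothetical stand-in for a DataFrame; not part of ruby-polars.
class FakeFrame
  attr_reader :rows

  def initialize(rows)
    @rows = rows
  end

  # Called by Marshal.dump; must return a String.
  def _dump(_level)
    rows.map { |row| row.join(",") }.join("\n")
  end

  # Called by Marshal.load with the String that _dump returned.
  def self._load(payload)
    new(payload.split("\n").map { |line| line.split(",") })
  end
end

original = FakeFrame.new([["a", "1"], ["b", "2"]])

# Direct round trip.
restored = Marshal.load(Marshal.dump(original))
puts restored.rows.inspect

# The hooks also apply when the object is nested inside another structure.
nested = Marshal.load(Marshal.dump({ frame: original }))
puts nested[:frame].rows.inspect
```

Note that the hooks only apply to instances of the class that defines them, and only once the class has been reopened, so the patch has to be loaded before the first dump.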
DeflateAwning commented
Hmm, even with that, I'm still getting `TypeError (no _dump_data is defined for class Polars::RbDataFrame)`.