statnett/data_cache

Feature request: support for static frames

Issue closed · 1 comment

The static-frame Python package works like pandas but enforces immutability on its frames. It would benefit greatly from an integration with data_cache (assuming a functional approach to data analysis).
https://static-frame.readthedocs.io/en/latest/index.html

Hey! I think this is a bit out of scope for this project, given the current adoption of static-frame. However, it should not be too hard to create your own store function, like:

    def store_func(
        func_key: str,
        arg_key: str,
        func: cache_able_function,
        f_args: Tuple[Any],
        f_kwargs: Dict[str, Any],
        metadata: Dict[str, str] = None,
    ) -> cached_data_type:
        """Retrieves stored data if key exists in stored data if the key is new, retrieves data from
        decorated function & stores the result with the given key.
        Args:
            func_key: unique key generated from function source
            arg_key: unique key generated from function arguments
            func: original cached function
            f_args: args to pass to the function
            f_kwargs: kwargs to pass to the function
            metadata: dictionary of metadata data to store alongside the data
        Returns:
            Data retrieved from the store if existing else from function
        """
        file_path = get_path() / "data.h5"
        path = f"/{func_key}/{arg_key}"
        suffix = "/array" if issubclass(data_storer, h5py.File) else ""
        with data_storer(file_path, mode="a") as store:
            if store.__contains__(path):
                if isinstance(store[path], h5py.Group) and "array" not in store[path].keys():
                    return tuple(
                        [store[f"{path}/{data_idx}{suffix}"][:] for data_idx in store[path].keys()]
                    )
                return store[f"{path}{suffix}"][:]
        data = func(*f_args, **f_kwargs)
        with data_storer(file_path, mode="a") as store:
            if isinstance(data, tuple):
                for i, data_ in enumerate(data):
                    store.create_dataset(f"{path}/data{i}{suffix}", data=data_)
                    add_metadata(store[f"{path}/data{i}"], func, metadata)
            else:
                store.create_dataset(f"{path}{suffix}", data=data)
                add_metadata(store[path], func, metadata)
        return data

You can use a function like this together with data_cache and, for example, to_hdf5/from_hdf5 in static-frame to achieve the result you want.

Something like:

    def static_frames_store(
        func_key: str,
        arg_key: str,
        func: cache_able_function,
        f_args: Tuple[Any],
        f_kwargs: Dict[str, Any],
        metadata: Dict[str, str] = None,
    ):
        file_path = data_cache.get_path() / "static-frames.h5"
        path = f"/{func_key}/{arg_key}"

        # If this key is already stored, load the data with from_hdf5 and return it

        data = func(*f_args, **f_kwargs)

        # Store the newly computed data with to_hdf5

        return data
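One way the two placeholders above could be filled in is sketched below. To stay runnable without static-frame or PyTables installed, `pickle` stands in for the `to_hdf5`/`from_hdf5` calls. `CACHE_DIR`, `load_frame` and `save_frame` are hypothetical names, not part of data_cache or static-frame. The sketch also writes one file per key instead of one shared HDF5 file, which is a simplification.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict, Optional, Tuple

# Hypothetical cache directory; data_cache.get_path() plays this role in the reply.
CACHE_DIR = Path(tempfile.mkdtemp())


def load_frame(file_path: Path) -> Any:
    # Stand-in (assumption) for a static-frame reader such as from_hdf5.
    with open(file_path, "rb") as fh:
        return pickle.load(fh)


def save_frame(data: Any, file_path: Path) -> None:
    # Stand-in (assumption) for a static-frame writer such as to_hdf5.
    with open(file_path, "wb") as fh:
        pickle.dump(data, fh)


def static_frames_store(
    func_key: str,
    arg_key: str,
    func: Callable[..., Any],
    f_args: Tuple[Any, ...],
    f_kwargs: Dict[str, Any],
    metadata: Optional[Dict[str, str]] = None,  # accepted but ignored in this sketch
) -> Any:
    # One file per (func_key, arg_key) pair keeps the lookup a simple existence check.
    file_path = CACHE_DIR / func_key / f"{arg_key}.pkl"

    # Cache hit: return the stored result without calling func.
    if file_path.exists():
        return load_frame(file_path)

    # Cache miss: compute the result, persist it, then return it.
    data = func(*f_args, **f_kwargs)
    file_path.parent.mkdir(parents=True, exist_ok=True)
    save_frame(data, file_path)
    return data
```

Replacing the bodies of `load_frame` and `save_frame` with the static-frame reader and writer should be the main change needed, assuming they accept a file path in a similar way.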

And then create your decorator:

    static_frames_cache = cache_decorator_factory(static_frames_store)

    @static_frames_cache
    def your_function():
        ...
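To make the wiring between decorator and store function concrete, here is a toy sketch of what such a factory might do. `toy_cache_decorator_factory` and `in_memory_store` are hypothetical stand-ins written for illustration, not data_cache's actual code. The sketch hashes the function's bytecode rather than its source, and builds the argument key from `repr()`, which is only adequate for simple arguments.

```python
import functools
import hashlib
from typing import Any, Callable, Dict, List, Tuple


def toy_cache_decorator_factory(store_func: Callable[..., Any]) -> Callable:
    """Hypothetical stand-in for cache_decorator_factory (not data_cache's code)."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        # func_key identifies the function; here: a hash of its name and bytecode.
        func_key = hashlib.sha1(
            func.__qualname__.encode() + func.__code__.co_code
        ).hexdigest()

        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # arg_key identifies this particular call.
            arg_key = hashlib.sha1(
                repr((args, sorted(kwargs.items()))).encode()
            ).hexdigest()
            # The store function decides whether to load or to compute.
            return store_func(func_key, arg_key, func, args, kwargs)

        return wrapper

    return decorator


_memory: Dict[Tuple[str, str], Any] = {}


def in_memory_store(func_key, arg_key, func, f_args, f_kwargs, metadata=None):
    # Simplest possible store function: a dict instead of a file on disk.
    key = (func_key, arg_key)
    if key not in _memory:
        _memory[key] = func(*f_args, **f_kwargs)
    return _memory[key]


in_memory_cache = toy_cache_decorator_factory(in_memory_store)

calls: List[int] = []  # records every real invocation of your_function


@in_memory_cache
def your_function(n: int) -> int:
    calls.append(n)
    return n * n
```

Calling `your_function(4)` twice runs the body only once; the second call is served by the store function.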