nilshg/TreatmentPanels.jl

Building a Generic Panel Data Interface

Some things that might be necessary for a generic panel data interface:

  • functions returning different views of a panel dataset, so that the panel can be viewed as either a vector of time series (one per unit) or a time series of vectors (one cross-section per date), similar to obsview in MLUtils. Perhaps cross_section and time_series?
  • a function for indexing/viewing/subsetting a time series given a date+group id
  • some basic statistics, like estimators of mean, variance, covariance, and autocorrelation
  • unsure if this should be included, but maybe a way to indicate whether the data are assumed stationary and/or ergodic, since this affects many calculations, including the ones above.
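To make the list above concrete, here is a minimal sketch on a balanced panel stored as a plain matrix (rows are dates, columns are units). The names cross_section, time_series, lookup, unit_means and sample_autocov are hypothetical, chosen only for illustration; none of them is an existing TreatmentPanels.jl or DataFrames.jl function.

```julia
using Dates, Statistics

# Hypothetical balanced panel: 4 monthly dates (rows) by 3 units (columns).
dates = [Date(2020, m, 1) for m in 1:4]
ids = [:a, :b, :c]
X = [1.0 10.0 100.0;
     2.0 20.0 200.0;
     3.0 30.0 300.0;
     4.0 40.0 400.0]

# Two views of the same data; both iterate over views, not copies.
time_series(X) = eachcol(X)     # a vector of time series, one per unit
cross_section(X) = eachrow(X)   # a time series of vectors, one per date

# Look up a single observation given a date and a unit id.
function lookup(X, dates, ids, date, id)
    t = findfirst(==(date), dates)
    i = findfirst(==(id), ids)
    (t === nothing || i === nothing) && throw(KeyError((date, id)))
    return X[t, i]
end

# Per-unit sample mean over time.
unit_means(X) = map(mean, time_series(X))

# Lag-k sample autocovariance of one series (divides by n, not n - k).
function sample_autocov(x, k)
    m = mean(x)
    n = length(x)
    return sum((x[t] - m) * (x[t + k] - m) for t in 1:(n - k)) / n
end
```

Note that unit_means and sample_autocov average over time, so they are only meaningful estimators if each series is assumed stationary, which is why some way of recording that assumption might belong in the interface.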

@bkamins

All of this functionality seems better placed in TreatmentPanels.jl, as (at least currently) I do not see anything that requires changes in DataFrames.jl (but I might be wrong; let us see how the thread evolves).

I actually agree. In fact, I think DataFrames.jl already mostly implements everything we'd need. The only thing I think would be good to add in DataFrames.jl is a constructor taking a grouped DataFrame where one of the columns is a date.

"a constructor taking a grouped DataFrame where one of the columns is a date."

What output would you expect then? The DataFrame constructor already accepts a GroupedDataFrame as an argument.
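For reference, a small sketch of the existing behaviour mentioned here, using made-up data: groupby produces a GroupedDataFrame whose groups are views into the parent, and passing it to DataFrame materializes the groups back into a single new DataFrame.

```julia
using DataFrames, Dates

# Made-up long-format panel: two units observed on two dates.
df = DataFrame(id = [1, 1, 2, 2],
               date = repeat([Date(2020, 1, 1), Date(2020, 2, 1)], 2),
               y = [1.0, 2.0, 3.0, 4.0])

gdf = groupby(df, :id)    # one group per unit; each group is a view into df
unit2 = gdf[(id = 2,)]    # look up a group by its key

flat = DataFrame(gdf)     # builds a new DataFrame from all groups
```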

It would return a Panel/TreatmentPanel, preferably without copying the underlying data. I assume it should be possible to support an arbitrary underlying data source through the Tables.jl interface and something like MLUtils.
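A rough sketch of what such a constructor could look like, assuming the panel simply wraps the GroupedDataFrame. The Panel type below and its time_series method are hypothetical and are not the TreatmentPanels.jl API; the sketch also assumes a DataFrames.jl source rather than an arbitrary Tables.jl one.

```julia
using DataFrames, Dates

# Hypothetical wrapper type, for illustration only.
struct Panel
    groups::GroupedDataFrame   # one group per unit, each a view into the parent
    timecol::Symbol            # name of the date column

    function Panel(groups::GroupedDataFrame, timecol::Symbol)
        df = parent(groups)
        timecol in propertynames(df) ||
            throw(ArgumentError("no column named $timecol"))
        eltype(df[!, timecol]) <: TimeType ||
            throw(ArgumentError("column $timecol does not hold dates"))
        return new(groups, timecol)
    end
end

# The time series of one unit, returned as a view (SubDataFrame) into the parent.
time_series(p::Panel, id) = p.groups[(id,)]

# Made-up data to exercise the sketch.
df = DataFrame(id = [1, 1, 2, 2],
               date = repeat([Date(2020, 1, 1), Date(2020, 2, 1)], 2),
               y = [1.0, 2.0, 3.0, 4.0])

p = Panel(groupby(df, :id), :date)
parent(p.groups) === df   # true: the panel still refers to the original table
```

Because the wrapper only stores the grouping, no data is copied when the panel is built, and writes through a unit's view are visible in the original table.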