Building a Generic Panel Data Interface
Opened this issue · 4 comments
Some things that might be necessary for a generic panel data interface:
- functions returning different views of a panel dataset, so that time series can be viewed as either a vector of time series or a time series of vectors, similar to `obsview` in MLUtils. Perhaps `cross_section` and `time_series`?
- a function for indexing/viewing/subsetting a time series given a date + group id
- some basic statistics, like estimators of mean, variance, covariance, and autocorrelation
- unsure if this should be included, but maybe ways to indicate whether the data are assumed stationary and/or ergodic? (Since this will affect many calculations, including the ones above).
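As a rough illustration of the first three points, here is a minimal sketch built only on DataFrames.jl grouping and views. The helper names `time_series` and `cross_section` are taken from the suggestion above, while `observation` and `autocor1` are hypothetical names made up for this example; none of them exist in any package, and the column names `id`, `date`, and `y` are assumptions.

```julia
using DataFrames, Dates, Statistics

# Toy long-format panel: one row per (unit, date) pair.
df = DataFrame(
    id   = repeat(["a", "b"], inner = 3),
    date = repeat(Date(2020, 1, 1):Month(1):Date(2020, 3, 1), outer = 2),
    y    = [1.0, 2.0, 4.0, 10.0, 20.0, 40.0],
)

# Hypothetical helpers; the names come from the proposal, not from a package.
# "Vector of time series": one group (a SubDataFrame view) per unit.
time_series(df, id = :id) = groupby(df, id)
# "Time series of vectors": one group per date.
cross_section(df, date = :date) = groupby(df, date)

# Hypothetical lookup: the rows for one unit on one date, returned as a view.
# Assumes the columns are literally named `id` and `date`.
observation(df, id, date) = @view df[(df.id .== id) .& (df.date .== date), :]

ts = time_series(df)
cs = cross_section(df)

# Per-unit sample mean and variance of y.
stats = combine(ts, :y => mean => :mean_y, :y => var => :var_y)

# Lag-1 sample autocorrelation of one series (assumes regularly spaced dates).
autocor1(x) = cor(x[1:end-1], x[2:end])
```

Because `groupby` and `@view` return views, none of these helpers copy the underlying columns. Whether the estimators are meaningful still depends on the stationarity/ergodicity assumptions raised in the last bullet, which this sketch does not model.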
All these functionalities seem to be better placed in TreatmentPanels.jl as (at least currently) I do not see anything that requires changes in DataFrames.jl (but I might be wrong - let us see how the thread evolves).
I actually agree. In fact, I think DataFrames.jl already mostly implements everything we'd need. The only thing I think would be good to add in DataFrames.jl is a constructor taking a grouped DataFrame where one of the columns is a date.
> a constructor taking a grouped `DataFrame` where one of the columns is a date.

What output would you expect then? The `DataFrame` constructor already accepts a `GroupedDataFrame` as an argument.
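For reference, the existing behaviour mentioned here looks roughly like this. The data and column names are made up for the example.

```julia
using DataFrames, Dates

df = DataFrame(
    id   = ["a", "a", "b", "b"],
    date = [Date(2020, 1, 1), Date(2020, 2, 1), Date(2020, 1, 1), Date(2020, 2, 1)],
    y    = [1.0, 2.0, 3.0, 4.0],
)

gdf  = groupby(df, :id)   # GroupedDataFrame; each group is a view into df
flat = DataFrame(gdf)     # collects the groups back into a single DataFrame

# Each group is a SubDataFrame, so groupby itself did not copy any rows.
first_group_is_view = gdf[1] isa SubDataFrame
```

So the round trip from a grouped to a plain `DataFrame` already exists; what it does not do is attach any panel semantics (which column is the time index, which is the unit id).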
It'd output a `Panel`/`TreatmentPanel`, preferably without copying the underlying data -- I assume it should be possible to allow for an arbitrary underlying data source using the `Table` interface and something like MLUtils.
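A minimal sketch of what such a constructor could look like, assuming a DataFrames.jl backing only. The `Panel` type, its fields, and the `series` accessor are all hypothetical and are not taken from DataFrames.jl or TreatmentPanels.jl; support for arbitrary table sources is not attempted here.

```julia
using DataFrames, Dates

# Hypothetical type: not part of DataFrames.jl or TreatmentPanels.jl.
struct Panel
    groups::GroupedDataFrame   # one group per unit; groups are views of the parent
    timecol::Symbol            # name of the date column
end

# Hypothetical constructor from a grouped DataFrame that checks the date column.
function Panel(gdf::GroupedDataFrame; timecol::Symbol = :date)
    col = parent(gdf)[!, timecol]
    nonmissingtype(eltype(col)) <: Dates.TimeType ||
        throw(ArgumentError("column $timecol does not hold dates"))
    return Panel(gdf, timecol)
end

# Hypothetical accessor: the series for one unit, as a SubDataFrame view.
series(p::Panel, key) = p.groups[(key,)]

df = DataFrame(
    id   = ["a", "a", "b", "b"],
    date = [Date(2020, 1, 1), Date(2020, 2, 1), Date(2020, 1, 1), Date(2020, 2, 1)],
    y    = [1.0, 2.0, 3.0, 4.0],
)

p = Panel(groupby(df, :id))

# Writing through the view changes the original data, so nothing was copied.
series(p, "a").y[1] = 99.0
```

The wrapper only stores the `GroupedDataFrame` and the name of the time column, so construction costs no more than the `groupby` call itself.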