Dimension of source embedding must be 1 to be applicable with surrogate methods?
Datseris opened this issue · 2 comments
I am confused about the surrogate significance tests...
I get the error "Dimension of source embedding must be 1 to be applicable with surrogate methods"
if I do something like
embedding = EmbeddingTE(; dS = 3, dT = 3, dC = 3)
estimator = FPVP()
test = SurrogateTest(TEShannon(; embedding), estimator; nshuffles = 100)
and then call independence. How does this limitation make sense from a scientific perspective? Surely estimating the transfer entropy with only 1 dimension (and hence, not going into the past at all) from the source doesn't really make sense given the definition of transfer entropy, right?
How does this limitation come about? Why is it not possible to first shuffle the source timeseries and then embed it (which is what I would expect would happen)?
This is also related to JuliaDynamics/TimeseriesSurrogates.jl#136.
Surely estimating the transfer entropy with only 1 dimension from the source doesn't really make sense given the definition of transfer entropy, right?
It does, because TE(x → y) = CMI(y(t+1), x(t)^(-) | y(t)^(-)), where ^(-) indicates an embedding with negative lags. The embedding may be 1-dimensional, meaning that you just use the raw time series. This is perfectly valid, and for very short time series, a 1-dimensional embedding for each marginal is the only reasonable thing to do.
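To make the 1-dimensional case concrete, the definition above can be written out as follows. This is only a sketch in the notation of this comment; the expansion into joint entropies H is the standard identity for conditional mutual information and is not specific to this package.

```latex
\begin{align}
\mathrm{TE}(x \to y)
  &= I\big(y(t+1);\, x(t)^{(-)} \,\big|\, y(t)^{(-)}\big) \\
  &= H\big(y(t+1),\, y(t)^{(-)}\big) + H\big(x(t)^{(-)},\, y(t)^{(-)}\big)
     - H\big(y(t+1),\, x(t)^{(-)},\, y(t)^{(-)}\big) - H\big(y(t)^{(-)}\big)
\end{align}
% With 1-dimensional embeddings, x(t)^{(-)} = x(t) and y(t)^{(-)} = y(t), so
\begin{equation}
\mathrm{TE}(x \to y) = I\big(y(t+1);\, x(t) \,\big|\, y(t)\big)
\end{equation}
```

Even in the 1-dimensional case, the source x(t) is still one step in the past relative to the predicted value y(t+1).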
Why is it not possible to first shuffle the source timeseries and then embed it (which is what I would expect would happen)?
This is a design choice. In the current implementation, I decided to shuffle the relevant marginal after embedding (I think because I figured it would save some allocations of new StateSpaceSets for all marginals for every surrogate realization). It is also, of course, possible to shuffle before embedding. I think we should enable both approaches.
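The two orderings can be sketched in plain Julia as below. This is a minimal illustration using only the Random standard library, not the package's implementation: `delay_embed`, `shuffle_then_embed` and `embed_then_shuffle` are hypothetical helper names, a plain matrix stands in for a StateSpaceSet, and random shuffling stands in for whatever surrogate method is actually used.

```julia
using Random

# Hypothetical helper: delay-embed a vector `x` into `d` columns with lags
# 0, -1, ..., -(d-1). Row i holds (x[t], x[t-1], ..., x[t-d+1]), t = i + d - 1.
function delay_embed(x::AbstractVector, d::Integer)
    n = length(x) - d + 1
    return [x[i + d - 1 - k] for i in 1:n, k in 0:d-1]
end

# Approach A: shuffle the raw source series first, then embed it.
# The rows of the result no longer contain consecutive samples of `x`.
shuffle_then_embed(rng, x, d) = delay_embed(shuffle(rng, x), d)

# Approach B: embed first, then permute whole rows (embedded points).
# Each row keeps its internal temporal structure; only the alignment
# of the rows with the other marginals is destroyed.
function embed_then_shuffle(rng, x, d)
    E = delay_embed(x, d)
    return E[randperm(rng, size(E, 1)), :]
end

rng = MersenneTwister(1234)
x = collect(1.0:10.0)
A = shuffle_then_embed(rng, x, 3)   # 8×3 matrix built from shuffled values
B = embed_then_shuffle(rng, x, 3)   # 8×3 matrix with intact rows, permuted
```

In this sketch the two approaches coincide when `d = 1`, since permuting the rows of a one-column embedding is the same as shuffling the series itself. For `d > 1` they differ: approach A applies the surrogate to a scalar series, while approach B has to permute multidimensional points.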
In fact, we have to enable both approaches to avoid being too restrictive here. It may happen that one wants to consider multiple timeseries together as the source, and then one would have to use multidimensional surrogates, as described in JuliaDynamics/TimeseriesSurrogates.jl#136.
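As a rough illustration of what a multidimensional surrogate could look like, the sketch below applies one shared random permutation to all source timeseries at once. It is only one simple possibility and not necessarily what the linked issue proposes; `joint_shuffle` is a hypothetical name, and the columns of a plain matrix stand in for the individual source timeseries.

```julia
using Random

# Hypothetical helper: permute the time indices of all source series with
# the SAME permutation, so relations between the series at each time
# point are preserved while the temporal order is destroyed.
function joint_shuffle(rng::AbstractRNG, X::AbstractMatrix)
    perm = randperm(rng, size(X, 1))
    return X[perm, :]
end

rng = MersenneTwister(42)
# Two toy source series stored as columns; the second is 10x the first.
X = hcat(collect(1.0:6.0), 10 .* collect(1.0:6.0))
S = joint_shuffle(rng, X)
```

Shuffling each column independently would instead destroy the relation between the source series themselves, which is usually not what a test of source-to-target dependence is meant to break.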