0-distances error when using PECUZAL embedding
xrisk opened this issue · 4 comments
Hi, we are trying to perform an embedding with some trace data we have from a simulation. I have attached the file consisting of the data points.
However, PECUZAL gives us:
Initializing PECUZAL algorithm for univariate input...
Starting 1-th embedding cycle...
Computed 0-distances. You might use model-data, thus try to add minimal additive noise to the signal you wish to embed and try again.
whereas optimal_traditional_de gives us:
Algorithm stopped due to convergence of E₁-statistic. Valid embedding achieved ✓.
Stochastic signal, valid embedding NOT achieved ⨉.
We are not sure what could be the reason for this. Is it that our trace is too noisy, or that our system exhibits essentially random behavior?
We are new to the analysis of nonlinear systems, therefore any links to relevant literature / material would be appreciated. Thank you!
There are duplicate data points in your data, which typically happens when sensor output is rounded. Try doing precisely what the error message says: add some small random noise to each data point. @hkraemer the error message is confusing with this "You might use model-data". Why don't we just say precisely what happens, "There are duplicate data points in your data", instead?
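As a rough illustration of the advice above, here is a minimal sketch that counts repeated values in a signal and then adds small random noise. The signal is a synthetic stand-in (rounded Gaussian samples), not the data attached to this issue, and the names `n_dup`, `noisy` and the noise amplitude `1e-11` are illustrative assumptions. It only uses the Julia standard library.

```julia
using Random

Random.seed!(42)   # only to make the sketch reproducible

# Stand-in for the real trace (assumption): rounding creates repeated values,
# similar to what a sensor with limited resolution would produce.
data = round.(randn(1000), digits = 1)

# Number of samples whose value already occurred elsewhere in the signal.
n_dup = length(data) - length(unique(data))
println("repeated values before adding noise: ", n_dup)

# Add noise far below the resolution of the data (0.1 here), so the signal
# is practically unchanged while exact ties are broken.
noisy = data .+ 1e-11 .* randn(length(data))

n_dup_after = length(noisy) - length(unique(noisy))
println("repeated values after adding noise:  ", n_dup_after)
```

The amplitude should stay well below the smallest meaningful difference in your data. If many repeated values remain afterwards, a somewhat larger amplitude may be needed.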
For optimal_traditional_de, there is nothing more I can tell you: the method says that it detected your signal to be similar to stochastic noise. You should analyze the output of delay_afnn directly.
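A minimal sketch of what inspecting that output could look like, assuming a recent DelayEmbeddings.jl in which `delay_afnn(s, τ, ds)` and `stochastic_indicator(s, τ, ds)` accept a signal, a delay and a range of dimensions. The signal below is pure noise used as a stand-in, not the data from this issue, and the range `ds = 2:7` is an arbitrary choice.

```julia
using DelayEmbeddings, Random

Random.seed!(1)
s = randn(2000)    # stand-in for the real trace (assumption): pure noise

τ  = estimate_delay(s, "mi_min")      # delay from the first minimum of the mutual information
ds = 2:7                              # embedding dimensions to inspect (arbitrary choice)

E1 = delay_afnn(s, τ, ds)             # Cao's E₁ statistic, one value per dimension
E2 = stochastic_indicator(s, τ, ds)   # Cao's E₂ statistic, one value per dimension

for (d, e1, e2) in zip(ds, E1, E2)
    println("d = ", d, "   E₁ = ", round(e1, digits = 3), "   E₂ = ", round(e2, digits = 3))
end
```

In Cao's method, E₁ levelling off with growing dimension suggests a sufficient embedding dimension, while E₂ staying close to 1 for all dimensions is the usual sign of a stochastic signal.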
Yepp, you are right. I was initially "designing" this error message, because I had these issues only with artificial data, like @xrisk . I'll make a PR.
@xrisk, I looked at the data and it does indeed look very noisy. The auto-mutual information indicates very very low deterministic structure. This is also why optimal_traditional_de raises the "stochastic" alert. If I do
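using DelimitedFiles, DelayEmbeddings
data = readdlm("points.txt")
data = vec(data)                          # flatten the loaded matrix into a vector
data .+= 1e-11 .* randn(length(data))     # tiny noise to break exact duplicates
theiler = estimate_delay(data, "mi_min")  # Theiler window from the first minimum of the mutual information
Y, τ_pec, ts_pec, L, _ = pecuzal_embedding(data; τs = 0:100, w = theiler)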
then PECUZAL executes but, as expected, returns the single input vector without any embedding, because the signal seems to be too stochastic. May I ask what kind of model your data comes from? Maybe increasing the sampling time would help to "smooth" it?
@hkraemer thank you for your explanation. We are just doing some experimentation with implementations of some computer algorithms. We guessed that they may exhibit dynamical behavior; however, it seems that either this is not true, or we are not capturing the correct metric from the system (or maybe there is just too much noise in our computer).
The auto-mutual information indicates very very low deterministic structure.
Can you please tell us how you measure this?
Anyway, thanks a lot for your help and for this great library 😄
Hey @xrisk ,
you can measure the auto-mutual information with the method selfmutualinfo(). Simply type ? selfmutualinfo in your REPL for more information. A convenient way to estimate the decorrelation time is the first local minimum of the auto-mutual information (there are other "methods", however, e.g. the first zero-crossing of the auto-correlation). DelayEmbeddings.jl therefore has the function estimate_delay(), which automatically gives you this estimate. You can then play around with it using different methods, e.g. the first zero-crossing of the AC (type ? estimate_delay in your REPL to see what is going on).
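A short sketch of the two calls mentioned above, assuming the signatures `selfmutualinfo(s, τs)` and `estimate_delay(s, method, τs)` with the method strings "mi_min" and "ac_zero". The signal is a noisy sine used as a stand-in, not the data from this issue, and the lag range `1:100` is an arbitrary choice.

```julia
using DelayEmbeddings, Random

Random.seed!(1)

# Stand-in signal (assumption): a noisy sine with some deterministic structure.
t = 0:0.05:100
s = sin.(t) .+ 0.1 .* randn(length(t))

τs = 1:100                    # lags to inspect (arbitrary choice)
mi = selfmutualinfo(s, τs)    # auto-mutual information at each lag in τs

τ_mi = estimate_delay(s, "mi_min", τs)    # first local minimum of the mutual information
τ_ac = estimate_delay(s, "ac_zero", τs)   # first zero-crossing of the auto-correlation

println("mi_min delay: ", τ_mi, "   ac_zero delay: ", τ_ac)
```

Plotting `mi` against `τs` is a quick way to judge the structure by eye: a clear decay with a pronounced first minimum hints at deterministic structure, whereas values that are flat and close to zero from the first lags on are typical for noise-like signals.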