Usage of HawkesCumulantMatching
trouleau opened this issue · 3 comments
Hello,
I tried simulating long realizations of a multivariate Hawkes process with SimuHawkesExpKernels
and learning the parameters with HawkesCumulantMatching
but the estimation is not consistent (model.adjacency even has small negative values), as can be seen in the small toy example below.
import numpy as np

from tick.hawkes.simulation import SimuHawkesExpKernels
from tick.hawkes import HawkesCumulantMatching

np.random.seed(123)

# Generate model parameters
n_nodes = 5
adjacency = np.random.binomial(n=1, p=np.log(n_nodes) / n_nodes,
                               size=(n_nodes, n_nodes)) / 2
baseline = np.random.uniform(0.0, 0.1, size=n_nodes)
decays = 1.0

# Simulate the process
hawkes_simu = SimuHawkesExpKernels(adjacency=adjacency, decays=decays,
                                   baseline=baseline, max_jumps=3e5,
                                   verbose=False, seed=123)
hawkes_simu.simulate()

# Fit the cumulant-matching estimator
model = HawkesCumulantMatching(
    integration_support=200.0,
    C=100.0, penalty='elasticnet', elastic_net_ratio=0.95,
    solver='adam', step=0.01, tol=1e-08, max_iter=1000)
model.fit(events=hawkes_simu.timestamps)

print(adjacency.round(2))                           # ground truth
print(model.adjacency.round(2))                     # estimation
print(np.max(np.abs(adjacency - model.adjacency)))  # max estimation error
with output:
[[0.5 0.  0.  0.  0.5]
 [0.  0.5 0.5 0.  0. ]
 [0.  0.5 0.  0.  0. ]
 [0.5 0.  0.  0.  0. ]
 [0.  0.5 0.5 0.  0.5]]
[[ 0.54  0.23  0.   -0.    0.28]
 [ 0.    0.24  0.65  0.1   0.04]
 [ 0.    0.39  0.02  0.    0.06]
 [ 0.48 -0.    0.    0.    0.  ]
 [ 0.11  0.72  0.    0.    0.41]]
0.4987121740781347
The code was run with tick==0.6.0.0 and tensorflow==1.15.0.
Is there a problem with this toy example?
Thank you!
I don't see anything straightforward.
A few suggestions:
- Have you tried checking that it ends up converging with more data (changing max_jumps)?
- Or is it an optimization problem? Do you get better results with another step size or more iterations? You can debug by looking at model.get_history()['objective'] (see the sketch after this list).
- Otherwise, maybe the attributes are misspecified: integration_support or the penalization is not adapted to the dataset.
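To make the second suggestion concrete, here is a minimal debugging sketch, assuming the fitted model from the snippet above and that get_history()['objective'] returns the sequence of recorded objective values as mentioned:

import matplotlib.pyplot as plt

# Inspect the objective recorded during fitting (illustrative debugging snippet)
objective = model.get_history()['objective']
print("objective at start / end:", objective[0], objective[-1])

# If the curve is still clearly decreasing at the last iterations, try a larger
# max_iter or another step before questioning the estimator itself.
plt.plot(objective)
plt.xlabel('recorded iteration')
plt.ylabel('objective')
plt.show()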
Hope it helps!
Pinging @achab, the author of this algorithm
Thanks for pinging me, @Mbompr! Hello @trouleau, we met a few years ago in Zürich!
@Mbompr, your suggestions all make sense; I would go for the last one. The estimation is pretty sensitive to integration_support, and this support should include the information you'd like to estimate, but shouldn't be too high.
When integration_support is too high, you include a lot of noise in your estimation, which increases the variance of the estimator (this is encoded in the second part of condition 4) of Theorem 3, in this article). The integration_support is the integration support of the cumulants' densities: here you simulate Hawkes processes with an exponential decay of 1. Assuming that an exponential density is close to zero after 5 characteristic times (5/decay), the covariance density and the skewness density should be close to zero beyond roughly +/- 5 and +/- 10 characteristic times respectively. So I would rather go for integration_support = 20. I acknowledge that this analysis is impossible to do when you don't know the kernels' supports: in that case you should use an integration_support that roughly corresponds to the characteristic time scale of your system.
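For instance, under this reasoning, a minimal sketch of re-fitting the toy example above with a smaller integration support (the other hyper-parameters are simply carried over from the original snippet, not tuned):

# Re-fit the same simulated events with integration_support = 20,
# i.e. roughly 10-20 characteristic times of the kernel (decay = 1 here)
model_small_support = HawkesCumulantMatching(
    integration_support=20.0,
    C=100.0, penalty='elasticnet', elastic_net_ratio=0.95,
    solver='adam', step=0.01, tol=1e-08, max_iter=1000)
model_small_support.fit(events=hawkes_simu.timestamps)

print(model_small_support.adjacency.round(2))                      # estimation
print(np.max(np.abs(adjacency - model_small_support.adjacency)))   # max error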
Also, the theorem proves the consistency of the estimator, not uniform convergence, so the maximum estimation error is too demanding a metric for it.
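To illustrate the difference, a short sketch comparing the worst-case entry-wise error with an averaged one, reusing the ground-truth adjacency and fitted model from the snippet above:

# Maximum entry-wise error: a worst-case, uniform-convergence style metric
print(np.max(np.abs(adjacency - model.adjacency)))
# Mean absolute error over the entries: an averaged, less demanding metric
print(np.mean(np.abs(adjacency - model.adjacency)))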
Hope it helps!