wuyifan18/DeepLog

Two concerns about DeepLog

hayhan opened this issue · 0 comments

I have two questions/concerns about DeepLog.

Question 1:
The example of false-positive handling in Section 3.3 of the paper is as follows:
{k1, k2, k3 -> k1} and {k1, k2, k3 -> k2}. The former was trained on normal logs, while the latter shows up as a false positive in the detection phase. Online updating of the model with the latter then fixes the issue.

In the real world, there are cases where the sequence pattern itself (regardless of the target key) is abnormal, and such sequences of course never appear in the normal training dataset.
E.g. suppose h=3 and {k1, k3, k2} is a known abnormal sequence.
In this case the detection result will not be reliable; e.g. the probability of {k1, k3, k2 -> k*} at the output layer might still be high for some k*.

How does DeepLog handle this abnormal sequence pattern?
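To make the concern concrete, here is a minimal sketch of the DeepLog-style top-g check, written against a toy untrained model (the class, `g`, and key indices are my own assumptions, not the repo's code). It shows that the test only inspects whether the *target* key is among the top-g predictions; the input window itself is never validated, so an abnormal window like {k1, k3, k2} can pass silently:

```python
import torch
import torch.nn as nn

num_keys, h, g = 10, 3, 2  # vocabulary size, window size, top-g candidates (assumed values)

class NextKeyLSTM(nn.Module):
    """Toy next-key predictor in the spirit of DeepLog (hypothetical, untrained)."""
    def __init__(self, num_keys, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(num_keys, 16)
        self.lstm = nn.LSTM(16, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_keys)

    def forward(self, x):            # x: (batch, h) key indices
        out, _ = self.lstm(self.emb(x))
        return self.fc(out[:, -1])   # logits over the next key

model = NextKeyLSTM(num_keys)
model.eval()

def is_anomaly(window, target):
    """Flag an anomaly iff `target` is not among the top-g predicted keys.
    Note: the window itself is never checked for abnormality."""
    with torch.no_grad():
        logits = model(torch.tensor([window]))
    topg = torch.topk(logits, g, dim=-1).indices[0].tolist()
    return target not in topg

# The abnormal window {k1, k3, k2} still produces a full softmax
# distribution; if whatever key actually follows it happens to rank
# in the top-g, the check passes and the anomaly goes undetected.
abnormal_window = [1, 3, 2]
print(is_anomaly(abnormal_window, target=4))
```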

Question 2:
For the online update of the model, it looks like the DeepLog authors used a small amount of training data that includes the FP instances for the update. My concern is that if I update the model with a large training dataset, I will run into the catastrophic forgetting problem that incremental learning in general faces. That said, it is not a big issue, since an offline update (retraining from scratch on the accumulated dataset) is always available as a backup.

How does DeepLog handle the catastrophic forgetting issue in the incremental model update?
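For reference, a common mitigation (not something the DeepLog paper describes, just standard incremental-learning practice) is rehearsal/experience replay: mix a small buffer of previously seen normal windows into each incremental update so the old patterns keep receiving gradient signal. A minimal sketch, assuming a `model(window)` interface like the one above and hypothetical names/sizes:

```python
import random
import torch
import torch.nn as nn

def incremental_update(model, fp_samples, replay_buffer,
                       replay_ratio=4, epochs=5, lr=1e-3):
    """Fine-tune on user-confirmed FP instances while replaying old samples.
    fp_samples / replay_buffer: lists of (window_tensor, target_int).
    replay_ratio: replayed old samples per new FP sample (assumed value)."""
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        # Interleave the new FP instances with a random draw of old data
        # so the update does not overwrite previously learned patterns.
        replay = random.sample(replay_buffer,
                               min(len(replay_buffer),
                                   replay_ratio * len(fp_samples)))
        for window, target in fp_samples + replay:
            optimizer.zero_grad()
            logits = model(window.unsqueeze(0))          # (1, num_keys)
            loss = criterion(logits, torch.tensor([target]))
            loss.backward()
            optimizer.step()
```

Even with replay, the offline retrain from scratch on the accumulated dataset that you mention remains the most reliable fallback.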