tudelft-cda-lab/SAGE

Structure discarding short episode subsequences

jzelenjak opened this issue · 1 comment

Description

The break_into_subbehaviors function, which is responsible for cutting episode sequences into episode subsequences, discards short subsequences, for example:
image

This check occurs multiple times within the function, and it is also present in the generate_traces function. Ideally, it should be in only one place (in generate_traces).

Furthermore, when splitting, if a sequence goes like [low, low, medium, high, low], then [low, low, medium, high] is saved but the last [low] is just discarded. We probably shouldn't lose alerts like this. On the other hand, it is not clear what to do with a single event either. Maybe we keep them regardless?
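
To make the lost alert concrete, here is a minimal sketch of the behavior described above. It is not the actual SAGE code: the function name, the cut rule (cut whenever a low-severity episode directly follows a high-severity one) and the `min_len` threshold are all assumptions made for illustration.

```python
def cut_with_inline_discard(episodes, min_len=2):
    """Hypothetical splitter: cuts at every high -> low drop and
    discards pieces shorter than min_len inside the splitter itself."""
    pieces, current = [], []
    for severity in episodes:
        # Assumed cut rule: a "low" episode right after a "high" one starts a new piece.
        if current and current[-1] == "high" and severity == "low":
            if len(current) >= min_len:  # inline length check (first occurrence)
                pieces.append(current)
            current = []
        current.append(severity)
    if len(current) >= min_len:  # inline length check (second occurrence)
        pieces.append(current)
    return pieces


result = cut_with_inline_discard(["low", "low", "medium", "high", "low"])
print(result)  # [['low', 'low', 'medium', 'high']]
```

The trailing `["low"]` never reaches the caller, so the caller has no way to decide whether to keep it.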

Proposed solution

  • Leave only the check in the generate_traces function and update the break_into_subbehaviors function accordingly. The resulting attack graphs should be the same as before.
  • For now, these short episodes can be discarded, as they used to be, and it will be left to the user whether they want to keep them or not.
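
A rough sketch of how the two points above could fit together, under stated assumptions: the helper names, the high -> low cut rule, the `min_len` threshold and the `keep_short` flag are hypothetical and do not come from the SAGE code base. The splitter returns every piece, and the length check lives in a single place on the caller side (the role generate_traces would play).

```python
def cut_keep_all(episodes):
    """Hypothetical splitter: cuts at every high -> low drop and returns
    every piece, including single-episode ones."""
    pieces, current = [], []
    for severity in episodes:
        if current and current[-1] == "high" and severity == "low":
            pieces.append(current)
            current = []
        current.append(severity)
    if current:
        pieces.append(current)
    return pieces


def filter_short(pieces, min_len=2, keep_short=False):
    """The only place where short subsequences are dropped.
    keep_short=True leaves the decision to the user."""
    if keep_short:
        return list(pieces)
    return [piece for piece in pieces if len(piece) >= min_len]


pieces = cut_keep_all(["low", "low", "medium", "high", "low"])
default_result = filter_short(pieces)               # same as the old behavior
kept_result = filter_short(pieces, keep_short=True)  # trailing ['low'] survives
print(default_result)
print(kept_result)
```

With the default settings the output matches what the inline checks produced before, which is the property the first bullet asks for; only the opt-in flag changes the result.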

This is a bit of a tricky one, so it needs some discussion.

I think that this part can be safely removed:
image

For the break_into_subbehaviors method, it is not that easy. A potential change could look as follows.

This is the original:

image

This is the modified:
image

  1. Because there have to be at least two episodes for a cut, sequences with one episode have to be either discarded or stored immediately, before cutting.
  2. I don't know what to do with this part
    image
  3. The pieces part could be removed, so that it also addresses issue #28