tudelft-cda-lab/SAGE

Add a flag for CPTC dataset


Description

In the CPTC-2017 and CPTC-2018 datasets, attacker IPs are known; however, this might not be the case for other datasets (e.g. CCDC). As a result, some CPTC-specific parts of the code have to be commented out when using, for example, the CCDC dataset.

For example:

[image: screenshot of the CPTC-specific code snippet]

Furthermore, the code snippet above is executed after learning the S-PDFA, which is too late.

Proposed solution

Move this check from make_state_sequences into group_alerts_per_team (in sage.py):

  1. Check whether 10.0.254 appears in src_ip or in dst_ip. If it appears in neither, discard the alert.
  2. If it appears in src_ip, add (src_ip, dst_ip). If it appears in dst_ip, add (dst_ip, src_ip).
  3. Update the corresponding part of the make_state_sequences function accordingly.
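
The steps above could be sketched roughly as follows. This is only an illustration, not the actual code in sage.py: the `Dataset` enum, the `orient_alert` helper, and the constant name `CPTC_BAD_IP` are hypothetical, and the sketch uses a prefix match on `10.0.254.` (a slightly stricter variant of the substring check described above, so that an address such as 110.0.254.1 is not matched).

```python
from enum import Enum
from typing import Optional, Tuple


class Dataset(Enum):
    """Hypothetical dataset flag; the real option values may differ."""
    CPTC = "cptc"
    OTHER = "other"


# Attacker subnet prefix named in this issue (trailing dot added for a strict prefix match).
CPTC_BAD_IP = "10.0.254."


def orient_alert(src_ip: str, dst_ip: str, dataset: Dataset) -> Optional[Tuple[str, str]]:
    """Return (attacker_ip, victim_ip), or None if the alert should be discarded.

    The CPTC-specific check only runs when the dataset flag is CPTC;
    for any other dataset the pair is returned unchanged.
    """
    if dataset is not Dataset.CPTC:
        return (src_ip, dst_ip)
    if src_ip.startswith(CPTC_BAD_IP):
        return (src_ip, dst_ip)   # attacker is already the source
    if dst_ip.startswith(CPTC_BAD_IP):
        return (dst_ip, src_ip)   # flip so the attacker comes first
    return None                   # no attacker IP involved: discard
```

Calling such a helper from group_alerts_per_team would apply the filter before the S-PDFA is learned, which addresses the "too late" problem described above.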

For the future, we might want to address internal paths (leave this as a TODO).

bad_ip can be renamed to cptc_bad_ip to make clear that it is CPTC-specific.

Furthermore, add a dataset flag (an enum or a string) and include it in the if-check, so that the check is triggered only for the CPTC dataset. In PR #35, ArgumentParser will be used to parse this option or fall back to a default.

UPDATE: PR #35 has already added the --dataset option. This PR only has to use that option in the correct places.
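
For reference, a minimal sketch of how a --dataset option can be parsed with argparse and then used to gate the CPTC-specific check. The accepted values ("cptc", "other"), the default, and the helper names are assumptions for illustration; the actual definition from PR #35 may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a --dataset option (choices and default are assumed)."""
    parser = argparse.ArgumentParser(description="Sketch of dataset selection")
    parser.add_argument(
        "--dataset",
        choices=["cptc", "other"],   # hypothetical values
        default="other",             # hypothetical default
        help="dataset type; CPTC-specific filtering runs only for 'cptc'",
    )
    return parser


def is_cptc(args: argparse.Namespace) -> bool:
    """True when the CPTC-specific if-check should be triggered."""
    return args.dataset == "cptc"


if __name__ == "__main__":
    parsed = build_parser().parse_args()
    print("CPTC filtering enabled:", is_cptc(parsed))
```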