"done" is always false, so the environment will never reset
yyzpiero opened this issue · 1 comment
determine_done never changes the done flag of the simulation (it always returns False), so the environment will never reset.
def determine_done(self,
                   agent_obs: dict,
                   true_obs: dict,
                   action: Action) -> bool:
    """Determine if environment scenario goal has been reached.

    Parameters
    ----------
    agent_obs : dict
        the agents last observation
    true_obs : dict
        the current white state
    action : Action
        the agents last action performed

    Returns
    -------
    bool
        whether goal was reached or not
    """
    return False
When the goal is reached, e.g. the target host has been Impacted, the done flag should flip to True. (Admittedly, "Red" in this challenge will keep trying to "Impact" the target host, but a Gym-wrapped environment still needs to be able to stop at some point.)
The intention is for the environment to run for a fixed number of steps. This allows the network defender to eject the attacker even after the Operational Server has been impacted. The attacker will keep trying to retake the server until it succeeds and the defender will need to learn how to best manage this.
The evaluation file currently uses 30, 50 and 100 steps as benchmark time intervals.
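If a training loop still needs a done flag, one option is to impose the step limit outside the environment. The following is a minimal sketch, not CybORG code: `NeverDoneEnv` and `FixedHorizonWrapper` are hypothetical names, and the classic Gym 4-tuple `step` return `(obs, reward, done, info)` is assumed. Gym's own `TimeLimit` wrapper serves the same purpose for real Gym environments.

```python
from typing import Any, Dict, Tuple


class NeverDoneEnv:
    """Hypothetical stand-in for an environment whose done flag is always False."""

    def reset(self) -> int:
        return 0

    def step(self, action: Any) -> Tuple[int, float, bool, Dict]:
        # Mirrors the behaviour described in this issue: done never flips.
        return 0, 0.0, False, {}


class FixedHorizonWrapper:
    """Hypothetical wrapper that reports done=True once max_steps have elapsed."""

    def __init__(self, env: Any, max_steps: int) -> None:
        self.env = env
        self.max_steps = max_steps
        self._steps = 0

    def reset(self) -> Any:
        self._steps = 0
        return self.env.reset()

    def step(self, action: Any) -> Tuple[Any, float, bool, Dict]:
        obs, reward, done, info = self.env.step(action)
        self._steps += 1
        # End the episode on the step budget, whatever the inner env says.
        if self._steps >= self.max_steps:
            done = True
        return obs, reward, done, info


def run_episode(env: Any) -> int:
    """Step with a placeholder action until done; return the episode length."""
    env.reset()
    steps, done = 0, False
    while not done:
        _, _, done, _ = env.step(None)
        steps += 1
    return steps


# The three step budgets mentioned for the evaluation file.
episode_lengths = [
    run_episode(FixedHorizonWrapper(NeverDoneEnv(), n)) for n in (30, 50, 100)
]
```

This keeps determine_done untouched, so the defender can still be trained on the full fixed-length episode while the loop terminates cleanly.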