cage-challenge/cage-challenge-1

"done" is always false, so the environment will never reset

yyzpiero opened this issue · 1 comments

determine_done never changes the done flag of the simulation (it always returns False), so the environment never resets.

    def determine_done(self,
                       agent_obs: dict,
                       true_obs: dict,
                       action: Action) -> bool:
        """Determine if environment scenario goal has been reached.

        Parameters
        ----------
        agent_obs : dict
            the agents last observation
        true_obs : dict
            the current white state
        action : Action
            the agents last action performed

        Returns
        -------
        bool
            whether goal was reached or not
        """
        return False

When the goal is reached, e.g. the target host has been Impacted, the done flag should flip to True. (Admittedly, Red in this challenge will keep trying to Impact the target host, but it is important for a Gym-wrapped environment to be able to stop at some point.)

The intention is for the environment to run for a fixed number of steps. This allows the network defender to eject the attacker even after the Operational Server has been impacted. The attacker will keep trying to retake the server until it succeeds and the defender will need to learn how to best manage this.

The evaluation file currently uses 30, 50 and 100 steps as benchmark time intervals.
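For anyone who needs a done signal from a Gym-style training loop, one option is to cap the episode length outside the environment. The following is a minimal sketch, not the project's own code: FixedHorizonEnv, NeverDoneEnv and run_episode are hypothetical names, and the wrapped environment is assumed to follow the classic Gym convention where step() returns (obs, reward, done, info).

```python
class FixedHorizonEnv:
    """Hypothetical wrapper: ends an episode after max_steps steps."""

    def __init__(self, env, max_steps):
        if max_steps <= 0:
            raise ValueError("max_steps must be positive")
        self.env = env
        self.max_steps = max_steps
        self.steps_taken = 0

    def reset(self):
        self.steps_taken = 0
        return self.env.reset()

    def step(self, action):
        # Assumes the classic Gym 4-tuple return value.
        obs, reward, done, info = self.env.step(action)
        self.steps_taken += 1
        # Report done once the step budget is used up, even though the
        # underlying environment never reports done itself.
        done = done or self.steps_taken >= self.max_steps
        return obs, reward, done, info


class NeverDoneEnv:
    """Stand-in environment for illustration only: done is always False."""

    def reset(self):
        return 0

    def step(self, action):
        return 0, -1.0, False, {}


def run_episode(env, max_steps):
    """Run one capped episode and return how many steps it lasted."""
    wrapped = FixedHorizonEnv(env, max_steps)
    wrapped.reset()
    done, steps = False, 0
    while not done:
        _, _, done, _ = wrapped.step(None)
        steps += 1
    return steps


# Episode lengths for the three benchmark horizons mentioned above.
episode_lengths = [run_episode(NeverDoneEnv(), n) for n in (30, 50, 100)]
print(episode_lengths)
```

With a cap like this, the defender still faces an attacker that keeps trying to retake the server for the whole horizon, but the training loop gets a reset point at the end of each episode.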