openai/multiagent-particle-envs

Simple tag scenario: collision reward is added multiple times for adversary agents

GoingMyWay opened this issue · 0 comments

In environment.py:

    def step(self, action_n):
        ... 
        reward_n = []
        ...  
        for agent in self.agents:
            ... 
            reward_n.append(self._get_reward(agent))
            ... 
        # all agents get total reward in cooperative case
        reward = np.sum(reward_n)
        if self.shared_reward:  # simple tag is not shared reward
            reward_n = [reward] * self.n

        return  ...  reward_n,    ... 

For each agent, step() computes that agent's reward by calling _get_reward(agent). In simple_tag.py:

    def adversary_reward(self, agent, world):
        # Adversaries are rewarded for collisions with agents
        rew = 0
        shape = False
        agents = self.good_agents(world)
        adversaries = self.adversaries(world)
        if shape:  # reward can optionally be shaped (decreased reward for increased distance from agents)
            for adv in adversaries:
                rew -= 0.1 * min([np.sqrt(np.sum(np.square(a.state.p_pos - adv.state.p_pos))) for a in agents])
        if agent.collide:
            for ag in agents:
                for adv in adversaries:
                    if self.is_collision(ag, adv):
                        rew += 10
        return rew

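agent.collide is always True. When the agent is an adversary, the code iterates over all adversaries instead of checking only that agent, so a collision between any adversary and a good agent adds 10 to the reward of every adversary. This results in the same collision being rewarded multiple times, once for each adversary agent.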
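
To make the effect concrete, here is a minimal, self-contained sketch. It does not use the real environment: Entity, reward_as_written and reward_per_agent are hypothetical stand-ins, and the positions and sizes are made up. reward_as_written mirrors the nested loop from adversary_reward above, while reward_per_agent is one possible alternative that only counts collisions involving the agent being rewarded. Whether that alternative matches the intended design of the scenario is an open question for the maintainers.

```python
import numpy as np


class Entity:
    """Hypothetical stand-in for a particle-env agent (position and radius only)."""

    def __init__(self, pos, size=0.05):
        self.pos = np.array(pos, dtype=float)
        self.size = size
        self.collide = True  # assumed always True, as noted in the issue


def is_collision(a, b):
    # Two entities collide when their centers are closer than the sum of their radii.
    dist = np.sqrt(np.sum(np.square(a.pos - b.pos)))
    return dist < a.size + b.size


def reward_as_written(agent, good_agents, adversaries):
    # Mirrors the loop in adversary_reward: every adversary is checked,
    # no matter which adversary the reward is being computed for.
    rew = 0
    if agent.collide:
        for ag in good_agents:
            for adv in adversaries:
                if is_collision(ag, adv):
                    rew += 10
    return rew


def reward_per_agent(agent, good_agents):
    # Hypothetical alternative: only collisions involving `agent` itself count.
    rew = 0
    if agent.collide:
        for ag in good_agents:
            if is_collision(ag, agent):
                rew += 10
    return rew


# One good agent; only the first of three adversaries is touching it.
good = [Entity([0.0, 0.0])]
advs = [Entity([0.01, 0.0]), Entity([1.0, 1.0]), Entity([-1.0, 1.0])]

shared = [reward_as_written(a, good, advs) for a in advs]
individual = [reward_per_agent(a, good) for a in advs]

print(shared)      # [10, 10, 10]
print(individual)  # [10, 0, 0]
```

With a single collision, the loop as written hands out 10 to each of the three adversaries (30 in total), whereas the per-agent variant rewards only the adversary that actually collided.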