Simple tag scenario: collision reward is added multiple times for adversary agents
GoingMyWay opened this issue · 0 comments
def step(self, action_n):
    ...
    reward_n = []
    ...
    for agent in self.agents:
        ...
        reward_n.append(self._get_reward(agent))
        ...
    # all agents get total reward in cooperative case
    reward = np.sum(reward_n)
    if self.shared_reward:  # simple tag is not shared reward
        reward_n = [reward] * self.n
    return ... reward_n, ...
For each agent, step calls self._get_reward(agent) to compute that agent's reward. In simple_tag.py, the adversary reward is computed as follows:
def adversary_reward(self, agent, world):
    # Adversaries are rewarded for collisions with agents
    rew = 0
    shape = False
    agents = self.good_agents(world)
    adversaries = self.adversaries(world)
    if shape:  # reward can optionally be shaped (decreased reward for increased distance from agents)
        for adv in adversaries:
            rew -= 0.1 * min([np.sqrt(np.sum(np.square(a.state.p_pos - adv.state.p_pos))) for a in agents])
    if agent.collide:
        for ag in agents:
            for adv in adversaries:
                if self.is_collision(ag, adv):
                    rew += 10
    return rew
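agent.collide is always true. When the agent is an adversary, the code iterates over all adversary agents instead of checking only the agent itself, so a collision between any adversary and a good agent adds 10 to this agent's reward. Since step calls this function once per adversary, the same collision reward is added to every adversary agent.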
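The effect can be reproduced outside the environment with a small sketch. Everything here is a simplified stand-in rather than the library's code: the Agent class, the positions and sizes, and the function names adversary_reward_team and adversary_reward_individual are hypothetical, and the collision test assumes two agents collide when their centres are closer than the sum of their radii. The first function keeps the loop structure quoted above; the second is one possible alternative that counts only collisions involving the calling agent.

```python
import numpy as np


class Agent:
    """Minimal stand-in for an agent (hypothetical, for illustration only)."""

    def __init__(self, name, pos, adversary, size=0.05):
        self.name = name
        self.pos = np.array(pos, dtype=float)
        self.adversary = adversary
        self.size = size
        self.collide = True  # assumed always true, as noted above


def is_collision(a, b):
    # Assumption: agents collide when closer than the sum of their radii.
    dist = np.sqrt(np.sum(np.square(a.pos - b.pos)))
    return dist < a.size + b.size


def adversary_reward_team(agent, agents):
    # Same loop structure as the quoted adversary_reward: every adversary is
    # checked, so the result does not depend on which adversary is asking.
    good = [a for a in agents if not a.adversary]
    advs = [a for a in agents if a.adversary]
    rew = 0
    if agent.collide:
        for ag in good:
            for adv in advs:
                if is_collision(ag, adv):
                    rew += 10
    return rew


def adversary_reward_individual(agent, agents):
    # Hypothetical alternative: count only collisions involving this agent.
    good = [a for a in agents if not a.adversary]
    rew = 0
    if agent.collide:
        for ag in good:
            if is_collision(ag, agent):
                rew += 10
    return rew


# Three adversaries and one good agent; only adv_0 touches the good agent.
agents = [
    Agent("adv_0", [0.0, 0.0], adversary=True),
    Agent("adv_1", [1.0, 1.0], adversary=True),
    Agent("adv_2", [-1.0, 1.0], adversary=True),
    Agent("good_0", [0.05, 0.0], adversary=False),
]
advs = [a for a in agents if a.adversary]

team_rewards = [adversary_reward_team(a, agents) for a in advs]
individual_rewards = [adversary_reward_individual(a, agents) for a in advs]

print(team_rewards)        # [10, 10, 10]
print(individual_rewards)  # [10, 0, 0]
```

With a single collision, the quoted loop structure gives every adversary 10, so the adversaries receive 30 in total, while the per-agent variant rewards only the adversary that actually collided. Whether the team-wide behaviour is intended is the question this issue raises.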