Farama-Foundation/PettingZoo

Constantly getting a KeyError: '__all__', can't figure out the source

aron-alarik opened this issue · 6 comments

Hi, I am using the `ParallelEnv` base class to create a MARL environment. However, when I run the `parallel_api_test` utility I constantly get this error:

```
parallel_api_test(env, num_cycles=1_000_000)
  File "C:\Users\Ant PC\AppData\Roaming\Python\Python311\site-packages\pettingzoo\test\parallel_test.py", line 125, in parallel_api_test
    live_agents.remove(agent)
KeyError: '__all__'
```

I have checked my step function and my reset function, but I'm getting nowhere!

This is my step function:

```python
def step(self, actions):
    next_observations = {}
    rewards = {}
    dones = {}
    truncateds = {}  # new truncated flag
    info = {}

    for agent_id, action in actions.items():
        next_observations[agent_id], rewards[agent_id], dones[agent_id], info[agent_id] = \
            self._compute_outcomes(agent_id, action, actions)

        if self.current_step >= self.max_steps:
            truncateds[agent_id] = True
        else:
            truncateds[agent_id] = False

        for agent_id, obs in next_observations.items():
            next_observations[agent_id] = obs.astype(np.float32)
        # print(f"Agent: {agent_id}, Observation Shape: {next_observations[agent_id].shape}")

        # Store experiences in the shared buffer
        print(f'agent_id, action: {agent_id}, {action}')
        experience = (self.current_observations[agent_id], action, rewards[agent_id], next_observations[agent_id])
        self.shared_replay_buffer.append(experience)

    # Update the current observations
    self.current_observations = next_observations

    truncateds['__all__'] = np.any(list(truncateds.values()))
    dones['__all__'] = np.any(list(dones.values()))

    # Remove agents that are done from the environment
    for agent_id in list(self.agents):
        if dones[agent_id] or truncateds[agent_id]:
            self.agents.remove(agent_id)
            print(f'agent_id: {agent_id} removed')

    print(self.current_step, dones)
    print(self.current_step, truncateds)

    self.current_step += 1

    return next_observations, rewards, dones, truncateds, info
```

I don't have a chance to look through this fully right now, but my guess is it has to do with how you are removing agents. Take a look at https://pettingzoo.farama.org/content/environment_creation/#example-custom-parallel-environment for a simple example of a parallel env to base your code on. Another option: it may be easier to create an AEC env and wrap it to convert it to parallel, which is what many of our environments do because it's simpler to perform all the logic that way. Scroll up in the above link to the AEC code to see the comparison.
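To illustrate the suspected cause: the PettingZoo parallel API expects every returned dict (terminations/truncations/etc.) to be keyed only by live agent ids, whereas the `'__all__'` summary key is an RLlib convention. An extra key like that is exactly what makes the API test try to remove a nonexistent agent. Here is a minimal, self-contained sketch of that bookkeeping (plain Python, illustrative names, not PettingZoo's actual implementation):

```python
# Sketch of the per-agent dict handling the parallel API expects.
# The dicts are keyed ONLY by agent ids -- no '__all__' entry; a stray
# key would be treated as an agent name by the test harness.

def finish_step(agents, terminations, truncations):
    """Drop finished agents; return the surviving agent list."""
    # the dict keys must match `agents` exactly
    assert set(terminations) == set(agents) == set(truncations)
    return [a for a in agents if not (terminations[a] or truncations[a])]

agents = ["a_0", "a_1", "a_2"]
terminations = {"a_0": False, "a_1": True, "a_2": False}
truncations = {"a_0": False, "a_1": False, "a_2": True}
print(finish_step(agents, terminations, truncations))  # ['a_0']
```

If a caller needs an "any agent done?" summary, computing it locally (e.g. `any(terminations.values())`) avoids polluting the returned dicts.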

Let me know if this is of any help, could take a look at some point this weekend maybe if not.

Okay, thanks, big relief! I rewrote the functions for AEC and converted them to parallel:

```python
# api_test(env)

env_parallel = aec_to_parallel(env)

# test parallel env
parallel_api_test(env_parallel, num_cycles=10_000)
```

It passed both of the initial tests; let's see if they work with Stable Baselines PPO... cheers!!
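Conceptually, what the `aec_to_parallel` conversion above does is step the underlying agent-by-agent (AEC) env once per agent and then hand back the per-agent dicts as a single parallel `step()`. A toy, dependency-free sketch of that idea (the class names and the trivial reward rule here are invented for illustration, not PettingZoo's real wrapper):

```python
# Conceptual sketch of an AEC -> parallel conversion.

class ToyAEC:
    """Stand-in for an AEC env: agents act one at a time."""
    def __init__(self):
        self.agents = ["p_0", "p_1"]
        self.rewards = {}

    def step(self, agent, action):
        self.rewards[agent] = float(action)  # trivial reward rule

class ToyParallelWrapper:
    """One parallel step == one AEC step per live agent."""
    def __init__(self, aec_env):
        self.aec = aec_env
        self.agents = aec_env.agents

    def step(self, actions):
        for agent in self.agents:
            self.aec.step(agent, actions[agent])
        return dict(self.aec.rewards)

env = ToyParallelWrapper(ToyAEC())
print(env.step({"p_0": 1, "p_1": 2}))  # {'p_0': 1.0, 'p_1': 2.0}
```

This is why the maintainers suggest writing the AEC version first: the sequential form keeps the per-agent logic in one place, and the wrapper derives the parallel interface from it.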

Great to hear. For what it's worth, if you want multi-agent algorithms, AgileRL and RLlib are likely the best options currently; SB3 I effectively just hacked to work for multi-agent environments. We have a new AgileRL tutorial here https://pettingzoo.farama.org/main/tutorials/agilerl/, and an RLlib one here https://pettingzoo.farama.org/main/tutorials/rllib/, though for the latter you will need the nightly release of Ray: https://docs.ray.io/en/latest/ray-overview/installation.html#daily-releases-nightlies

Thanks!! Yes, I just learnt that about SB3... the hard way!! The real issue with my environment is that all agents have different and independent observation spaces, hence the struggle with SB3; it's more of a comparison between agents with different handicaps than cooperation.
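For anyone hitting the same wall: heterogeneous per-agent observation spaces are something PettingZoo itself handles fine via its per-agent `observation_space(agent)` method; it's single-agent-oriented trainers like SB3 that assume one shared space. A minimal sketch of the idea, modeled with a plain dict (agent names and shapes here are made up for illustration):

```python
import numpy as np

# Hypothetical per-agent observation shapes -- each agent sees something
# different, as in the environment described above.
obs_shapes = {"scout": (4,), "hauler": (7,)}

def observe(agent, rng):
    # each agent gets its own shape, cast to float32 like the step() above
    return rng.random(obs_shapes[agent]).astype(np.float32)

rng = np.random.default_rng(0)
obs = {a: observe(a, rng) for a in obs_shapes}
print({a: o.shape for a, o in obs.items()})  # {'scout': (4,), 'hauler': (7,)}
```

Trainers that accept a space per agent (as AgileRL's multi-agent algorithms do) can consume dicts like this directly, whereas a single-space trainer would require padding or flattening all agents to a common shape.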

AgileRL looks really promising; I'm giving it a try now. And thanks for the tip on Ray, it just saved me a few hours, as I was just starting to experiment with it.

The last goal would be to send all agent-generated messages to an LLM... I will try these out and keep you posted. Thanks again!!

I just joined the Discord server as well.

Ray actually just had a PR which, I believe, allows for different observation and action spaces per agent. It's not merged yet AFAIK, but they said it will be in the 2.8 release, which they expect in early October (two weeks in or so). You can install the Ray nightly as soon as that PR is merged, or build it locally from the PR: ray-project/ray#39459

I recommend using AgileRL for now, then trying that PR once it's merged if AgileRL doesn't work out. Post in our Discord if you have questions about AgileRL; a few devs there have helped other users with questions.

And about the LLM bit: our tutorials include an example using LangChain with PettingZoo, as well as ChatArena, which shows LLM agents interacting with PettingZoo envs. There aren't many examples of environments with text action spaces; Minigrid uses text for its observation space, as it contains a description of the task, and I have some environments for a not-yet-public project. If you have questions about it, be sure to ask in Discord.

Closing this as it seems the issue was fixed. It could potentially serve us to make the API test error a bit clearer, but I don't know exactly what caused the issue, so it's hard to move forward on that.