Confused about setting multiple spatial goals in the A2C demo

Question

ryanbadman opened this issue 7 months ago · 4 comments

Hello, I am trying to give the agent multiple goals in the arena. When I pass in multiple position centers, they show up in the plot, but rewards only trigger when there is exactly one goal. With multiple goals present, the reward is zero even when the agent reaches a goal. I've tried turning on nonsequential mode and setting reset_n_goals = 2, and neither works. There is no error; rewards and goal completions simply never trigger.

For the completion criterion, I've currently extended the timeout and have the episode quit anywhere from tens to hundreds of timesteps after a goal is reached. With more than one goal, however, completion still doesn't register, even when I see the agent travel over both goals.

Also, I am confused about goals = [SpatialGoal(env, pos=GOAL_POS, goal_radius=GOAL_RADIUS, reward=reward)]: if I make GOAL_POS a list of np arrays (one [X, Y] array per goal position), do I have to make GOAL_RADIUS and reward arrays of the same length? I've tried it both ways, keeping the latter two scalar and making them lists of the same length, and it doesn't work, but it also doesn't complain either way. It would be nice to have multiple goals of different sizes and reward magnitudes in one arena, if that is supported, but it doesn't seem to work.

Answer 1 · 2024-04-03T07:45:05.000Z

Would you mind pasting the code block that you are running?

Example of two goals

# create the deck of cards/goals which could be drawn from
two_goals = [
    SpatialGoal(env,
                pos=[0.4 * v, 0.4 * v],
                goal_radius=0.1,
                reward=reward_object)
    for v in range(2)
]

# assign the "card" deck to the environment's cache
env.goal_cache.reset_goals = two_goals  # pool to reset goals with

Running two agents produces the following reward curves for three episodes

Each spatial goal takes a single (x, y) position for the goal and a scalar for its radius. A single SpatialGoal object doesn't accept arrays.
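To make the one-object-per-goal idea concrete, here is a toy sketch that doesn't touch the library at all. ToyGoal and reached are hypothetical stand-ins invented for illustration, not RatInABox classes or functions, and the positions, radii, and rewards are made-up values. The only point is that different sizes and reward magnitudes are expressed as separate goal objects, each with its own scalar settings, rather than as arrays passed to a single goal.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class ToyGoal:
    """Hypothetical stand-in for one spatial goal (not the library class)."""
    pos: tuple      # a single (x, y) position
    radius: float   # a single scalar radius
    reward: float   # a single scalar reward


def reached(goal, agent_pos):
    """Return True if the agent lies within the goal's radius."""
    return math.dist(goal.pos, agent_pos) <= goal.radius


# One entry per goal: (position, radius, reward). Values are illustrative.
specs = [((0.0, 0.0), 0.10, 1.0),
         ((0.4, 0.4), 0.05, 2.0)]
toy_goals = [ToyGoal(pos, radius, reward) for pos, radius, reward in specs]

# An agent sitting near the second goal only collects that goal's reward.
agent_pos = (0.42, 0.41)
rewards_hit = [g.reward for g in toy_goals if reached(g, agent_pos)]
```

With the real library, the analogous step would presumably be a list of SpatialGoal objects built one per position, as in the two_goals example above.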

Think of those two goals like a deck of cards which will be drawn from.

The reset number (reset_n_goals) is the number of goals drawn from this deck for each episode. If your deck has only one card, then drawing twice from it (without replacement) yields at most one unique goal. So check that your cache has at least two unique goals if you set the reset number to 2.
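The deck analogy can be sketched in a few lines of plain Python. This is a toy model of the analogy only, not the library's actual reset logic: draw_goals is a hypothetical helper, the goals are represented by placeholder strings, and capping the draw at the deck size is an assumption made here so the one-card case is visible.

```python
import random


def draw_goals(deck, n, rng=random):
    """Draw up to n distinct goals from the deck, without replacement.

    Hypothetical helper for illustration; the cap at len(deck) is an
    assumption, not documented library behaviour.
    """
    return rng.sample(deck, min(n, len(deck)))


one_card_deck = ["goal_A"]
two_card_deck = ["goal_A", "goal_B"]

# Asking for two goals from a one-card deck still yields a single goal.
print(len(draw_goals(one_card_deck, 2)))     # 1

# A two-card deck can supply two distinct goals per episode.
print(sorted(draw_goals(two_card_deck, 2)))  # ['goal_A', 'goal_B']
```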

If you're still noticing zero reward delivered with this method, post a code snippet. Can't seem to reproduce on my end, but happy to debug.

Answer 2 · 2024-04-15T17:13:50.000Z

@ryanbadman did this solve your issue? If so, I might close the issue soon.

Answer 3 · 2024-04-15T17:17:17.000Z

Sorry for the delay, I got busy with other things last week. Yes, your solution fixed the original problem. Thanks!

Answer 4 · 2024-04-15T17:35:27.000Z

Great! Glad to hear. Thanks @SynapticSage for the fix. Closing the issue