google-deepmind/PGMax

Suggestions for modelling

I am working on a resource allocation problem using PGMax. I have m agents and n tasks, where n > m. Each agent must be assigned to a task, and ideally each agent is matched with the closest available task on a 2D grid, so the objective is to minimize the total distance. My constraints are that no two agents may be assigned the same task; an agent may also choose no task at all, which incurs a penalty.
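To make the encoding concrete (this mirrors the model below): each agent is a single discrete variable with n + 1 states, where state j < n means "take task j" and state n means "stay idle". The helper below is only for illustration and is not part of my PGMax model; it scores a candidate assignment with positive costs, whereas the model uses log potentials.

import numpy as np

def score_assignment(states, dists, idle_cost):
    # states: length-m sequence with states[i] in [0, n]; state n means idle
    # dists: (m, n) matrix of agent-task distances; idle_cost: cost of idling
    n = dists.shape[1]
    total = sum(idle_cost if s == n else dists[i, s] for i, s in enumerate(states))
    taken = [s for s in states if s < n]
    feasible = len(taken) == len(set(taken))  # no task assigned to two agents
    return total, feasible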

I have implemented a model using PGMax, but I am unsure whether it is correct. Specifically, the decoded variable values remain unchanged even when I modify the cost structure, such as the distances between agents and tasks. Additionally, there are instances where multiple agents end up assigned to the same task, even though the pairwise penalty should rule this out (see the sanity check at the end of the script).

Any guidance or advice would be greatly appreciated. I have attached my code for further review.

import jax
import jax.numpy as jnp
import matplotlib.pyplot as plt
import numpy as np

# Load PGMax (factor is needed for EnumFactor below)
from pgmax import factor, fgraph, fgroup, infer, vgroup

# Parameters
m = 3  # Number of agents
n = 10 # Number of tasks (n > m)
penalty = -1e6  # Large negative log potential when two agents pick the same task
idle_penalty = -1000  # Log potential for choosing no task

# Simulate 2D positions for agents and tasks
np.random.seed(42)
agent_positions = np.random.rand(m, 2) * 10  # m agents in a 10x10 space
task_positions = np.random.rand(n, 2) * 10  # n tasks in a 10x10 space

# Log potential matrix (m x (n + 1)): columns 0..n-1 score the tasks,
# column n is the "no task" (idle) option
cost_matrix = np.zeros((m, n + 1))
for i in range(m):
    for j in range(n + 1):
        if j != n:
            # Closer tasks get larger log potentials (inverse distance)
            cost_matrix[i, j] = 1 / np.linalg.norm(agent_positions[i] - task_positions[j])
        else:
            cost_matrix[i, j] = idle_penalty

# Define the variables: one (n + 1)-state variable per agent; state n means "no task"
variables = vgroup.NDVarArray(num_states=n + 1, shape=(m,))

# Create the Factor Graph
fg = fgraph.FactorGraph(variable_groups=[variables])

# Add unary factors: the log potential of each state comes from the cost matrix
for ii in range(m):
    unary_factor = factor.EnumFactor(
        variables=[variables[ii]],
        factor_configs=np.arange(n + 1)[:, None],
        log_potentials=cost_matrix[ii, :],
    )
    fg.add_factors(unary_factor)

# Collect all unordered agent pairs (i, j) for the pairwise factors
variables_for_factors = []
for i in range(m):
    for j in range(i + 1, m):
        variables_for_factors.append([variables[i], variables[j]])

# Log potential matrix shared by all agent pairs: a large negative value when
# both agents pick the same (non-idle) task, zero otherwise
factor_matrix = np.zeros((n + 1, n + 1))
for state_i in range(n + 1):
    for state_j in range(n + 1):
        if state_i == state_j and state_i != n:  # same task, neither agent idle
            factor_matrix[state_i, state_j] = penalty

# Create the pairwise factor group
factor_group = fgroup.PairwiseFactorGroup(
    variables_for_factors=variables_for_factors,
    log_potential_matrix=factor_matrix,
)

# Add the pairwise factors to the factor graph
fg.add_factors(factor_group)

# Random Gumbel evidence to break ties between equally good assignments
rng = jax.random.PRNGKey(2)
evidence_updates = {variables: jax.random.gumbel(rng, shape=(m, n + 1))}

# Build the loopy belief propagation inferer
bp = infer.build_inferer(fg.bp_state, backend="bp")

# Run max-product BP (temperature=0.0) for MAP inference
inferer_arrays = bp.init(evidence_updates=evidence_updates)
inferer_arrays, msgs_deltas = bp.run_with_diffs(inferer_arrays, num_iters=2000, temperature=0.0)

# Compute the beliefs
beliefs = bp.get_beliefs(inferer_arrays)
print(beliefs)

# Decode the MAP states (one state per agent) from the beliefs
map_states = infer.decode_map_states(beliefs)
print(map_states)
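
To reproduce the duplicate-assignment problem, I run the sanity check below on the decoded states (assuming, as in my version of PGMax, that infer.decode_map_states returns a dict keyed by the variable group):

# Sanity check: no two agents should share a non-idle task
decoded = np.asarray(map_states[variables])  # shape (m,)
assigned = decoded[decoded < n]              # drop idle agents (state n)
print("Decoded states per agent:", decoded)
print("Duplicate assignments:", len(assigned) != len(np.unique(assigned)))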