/UCB_MARL

Simulation code for a provably efficient multi-agent reinforcement learning algorithm with a near-optimal regret bound in industrial data collection.

Primary Language: Python
