Deep Reinforcement Ant Colony Optimization
Swarm learning algorithm
Until recently, the extremely large number of possible states in real environments limited the use of reinforcement learning algorithms, which were therefore applied only to simple tasks in tabular grid worlds. Deep learning has extended such algorithms to the high-dimensional, complex virtual environments of computer video games. Current work in deep learning and multi-agent reinforcement learning is actively porting classical machine learning algorithms to neural network architectures. Combining these trends, we propose a new deep reinforcement learning algorithm, based on the traditional ant colony optimization algorithm, for cooperative homogeneous swarm learning. The algorithm shapes the collective behavior of a decentralized system of independent homogeneous agents that interact locally with each other and with the environment. It also offers an alternative, stigmergic approach to implementing centralized learning with decentralized execution, the leading multi-agent paradigm. Local and often random interactions during centralized learning give rise to collective swarm behavior that no individual agent controls during real-world operation. We study the advantages of the algorithm in the experimental virtual environment of the StarCraft Multi-Agent Challenge.
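For readers unfamiliar with the traditional ant colony optimization algorithm mentioned above, the following minimal sketch illustrates its two textbook rules: pheromone-weighted action choice and pheromone evaporation with deposit. It shows only the classical tabular scheme, not the deep reinforcement learning variant proposed here; the function names and the parameters `alpha`, `beta`, and `rho` are illustrative assumptions, not taken from this work.

```python
def transition_probabilities(pheromone, heuristic, alpha=1.0, beta=2.0):
    """Classical ACO choice rule (illustrative sketch).

    The probability of option j is proportional to
    pheromone[j] ** alpha * heuristic[j] ** beta.
    """
    weights = [(t ** alpha) * (h ** beta) for t, h in zip(pheromone, heuristic)]
    total = sum(weights)
    return [w / total for w in weights]


def update_pheromone(pheromone, deposits, rho=0.5):
    """Classical ACO trail update (illustrative sketch).

    Each trail first evaporates by a factor (1 - rho), then receives the
    pheromone deposited by the ants that used it.
    """
    return [(1.0 - rho) * t + d for t, d in zip(pheromone, deposits)]


# Two options with equal pheromone; the second has twice the heuristic value.
probs = transition_probabilities([1.0, 1.0], [1.0, 2.0])

# Evaporate half of each trail, then deposit 0.5 on the first trail only.
trails = update_pheromone([1.0, 2.0], [0.5, 0.0], rho=0.5)
```

The shared pheromone trail is the stigmergic element: agents never communicate directly and coordinate only through the traces they leave in the environment.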