
POMO

This repository provides a reference implementation of POMO and saved trained models as described in the paper:

POMO: Policy Optimization with Multiple Optima for Reinforcement Learning
accepted at NeurIPS 2020
http://arxiv.org/abs/2010.16011

The code is written in PyTorch.

Basic Usage

To run the code, open the .ipynb files in a Jupyter-compatible application (e.g., Jupyter Notebook).
Train.ipynb contains the code for POMO training, which produces a model that you can save using torch.save().
Inference.ipynb contains the code for running inference with saved models.
Examples of trained models are provided in the folder named "result".
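The save/load round trip mentioned above can be sketched as follows. This is a minimal illustration using a stand-in nn.Linear module rather than the repository's actual policy network; the file name is hypothetical.

```python
import torch
import torch.nn as nn

# Stand-in for the POMO policy network (the real model is defined in the repo).
model = nn.Linear(4, 2)

# Save only the learned parameters (state_dict), as torch.save() supports.
torch.save(model.state_dict(), "pomo_model.pt")

# At inference time, rebuild the same architecture and restore the weights.
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load("pomo_model.pt"))
restored.eval()  # switch to evaluation mode before inference
```

Saving the state_dict rather than the whole module keeps the checkpoint portable across code revisions.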

You can edit HYPER_PARAMS.py to change the problem size or other hyperparameters before training.
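As a rough sketch, HYPER_PARAMS.py exposes settings of the following kind; the variable names and values here are illustrative guesses, not the repository's actual contents.

```python
# Hypothetical sketch of training settings (check HYPER_PARAMS.py for the
# real names and defaults used by this repository).
PROBLEM_SIZE = 20      # e.g., number of nodes in a TSP instance
BATCH_SIZE = 64        # instances per training batch
TOTAL_EPOCHS = 100     # training duration
LEARNING_RATE = 1e-4   # optimizer step size
EMBEDDING_DIM = 128    # width of the node embeddings
```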

Three example problems are solved:

  • Traveling Salesman Problem (TSP)
  • Capacitated Vehicle Routing Problem (CVRP)
  • 0-1 Knapsack Problem (KP)
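Common to all three problems is POMO's training signal: each instance is rolled out from multiple start points, and the mean reward over those rollouts serves as a shared baseline. A minimal sketch of that advantage computation (toy tensors, not the repository's actual training loop):

```python
import torch

# Toy rewards for 2 instances, each solved by N = 3 rollouts from
# different start points: shape (batch, N).
rewards = torch.tensor([[3.0, 5.0, 4.0],
                        [2.0, 2.0, 8.0]])

# Shared baseline: mean reward over the N rollouts of each instance.
baseline = rewards.mean(dim=1, keepdim=True)
advantage = rewards - baseline

# Stand-in for per-rollout log-likelihoods of the sampled solutions.
log_probs = torch.randn_like(rewards)

# REINFORCE-style loss with the shared baseline.
loss = -(advantage * log_probs).mean()
```

Because the baseline is the mean over rollouts, the advantages of each instance sum to zero, so only relative solution quality drives the gradient.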

Used Libraries

torch==1.2.0
numpy==1.16.4
ipython==7.1.1
matplotlib==3.1.0