Code is written as a solution for Statistical Techniques for Data Science Assignment #3, Innopolis University, S21.
- Sample initial
$x_0$ , set time step t = 0; - Set initial temperature T. To avoid problems with numerical overflow when calculating the exponent, set the temperature comparable with the initial value of the system’s energy;
- Generate
$x′$ from$g(x′|x_t)$ . For continuous problems, the common solution is to use the normal distribution. For salesman problem, to generate new x' from distribution of paths: 1) pick two cities in the path → 2) exchange their positions in the path → 3) return the new proposed path; - Calculate acceptance ratio
$α = \frac{p∗(x')}{p∗(x_t)}$ , where p*(x) =$e^{\frac{-distance(path)}{T}}$ and where$path$ is the ordered list of cities to visit,$distance$ is the function that computes path distance; - Generate
$u ∼ U(0,1)$ . If$u ≤ α$ , accept the new state$x_{t+1} = x′$ , otherwise propagate the old state; - Reduce temperature T. Temperature annealing does not have to occur on every iteration. The temperature can be decreased every N iteration as well. There is no strict rule about this;
- Increment t;
- Repeat 2-8 until cooled down. The solution can be sampled before the system is cooled down, but we don't know whether this was the optimal solution.
For algorithm testing, there were chosen 30 most populated Russian cities. Information about cities was extracted from Wikipedia - source.
In the attached notebook, you can find simple data preprocessing, helper functions, algorithm implementation, experiments, and code for animation.
The best path distance achieved through experimentation was 17-20k km.
Parameters,
Visualization of convergence for different annealing speeds: