Kuifje02/vrpy

reproducibility of results

piaulous opened this issue · 27 comments

Running my CVRP script multiple times with the same input data lead to results, which differ almost all the time.
As I need reproducible results the solution should always look the same.

If this is the initial solution:

1: ['Source', '11', '3', '5', '9', '6', 'Sink'], (III)
2: ['Source', '10', '18', '19', '8', '17', 'Sink'], (II)
3: ['Source', '14', '4', '15', '2', '1', 'Sink'], (II)
4: ['Source', '16', '20', '13', '7', '12', 'Sink'] (I) (III)

These differences occured already:

1: ['Source', '11', '3', '5', '9', '6', 'Sink'],
2: ['Source', '10', '18', '19', '8', '17', 'Sink'],
3: ['Source', '14', '4', '15', '2', '1', 'Sink'],
4: ['Source', '20', '13', '7', '12', '16', 'Sink'] <-- (I) same nodes but different order

1: ['Source', '10', '18', '19', '8', '17', 'Sink'], <-- (II) route numbers (1,2,3,4) differ from initial solution
2: ['Source', '14', '4', '15', '2', '1', 'Sink'], <-- (II) route numbers (1,2,3,4) differ from initial solution
3: ['Source', '11', '12', '6', '5', '9', 'Sink'], (IV) <-- (III) different nodes, different routing
4: ['Source', '16', '20', '13', '7', '3', 'Sink'] (IV) <-- (III) different nodes, different routing

1: ['Source', '10', '18', '19', '8', '17', 'Sink'],
2: ['Source', '14', '4', '15', '2', '1', 'Sink'],
3: ['Source', '3', '7', '13', '20', '16', 'Sink'], <-- (IV) reversed order
4: ['Source', '9', '5', '6', '12', '11', 'Sink <-- (IV) reversed order

I think we need to distinguish two cases :

1/ the solution value differs from one run to another : this is a real problem.

vs

2/ The solution value is the same each time, but the routes are different. This behavior can be expected with linear solvers.

If we are dealing with case 2/ I don't see how we can cope with this as it is related to the solver. The solver considers that each of these configurations are optimal (which is true).

Usually, when we say we need reproducible results, we refer to the solution value.

By solution value you mean the total costs?

I see your point. I was thinking about using the preassignment argument in order to shift the routing in a reproducible direction...

Yes indeed, solution value = total costs.

Using preassignments is a great idea, it is used to lock routes indeed.

Closing for now. Please re open if necessary.

a way to achieve reproducible routes is to return just the Clark & Wright solution prob.solve(heuristic_only=True), which is deterministic

I kindly ask for a reopening.

Just noticed, that my solution values with prob.solve() indeed differ. I have a network of 21 nodes and weighted edges (length). Solution value is the total length. All nodes have a demand of 1.

I tried out several value constellations of load_capacity and prob.num_vehicles (all vehicles are used). For every constellation I made more than 5 runs and ended up with different solution values.
Another thing I noticed was, that load_capacity = 5 and prob.num_vehicles = 5 lead to more stable solution values from run to run, than (4,6) or (6,4).

a way to achieve reproducible routes is to return just the Clark & Wright solution prob.solve(heuristic_only=True), which is deterministic

Yes, but the quality of the solution will not be as good.

I tried out several value constellations of load_capacity and prob.num_vehicles (all vehicles are used). For every constellation I made more than 5 runs and ended up with different solution values.
Another thing I noticed was, that load_capacity = 5 and prob.num_vehicles = 5 lead to more stable solution values from run to run, than (4,6) or (6,4).

Could you please add a minimal working example so that we can reproduce this ?

I've obtained solution values 21314, 21454, 21345, 21315, 21395 from this code:

import networkx as nx
from networkx import DiGraph, from_numpy_matrix, relabel_nodes, set_node_attributes
from numpy import array

# Distance matrix
DISTANCES = [
 [0, 2095.7400000000002, 1691.865516829233, 862.6229483760624, 1601.5608041588769, 1249.374, 1578.2344525273484, 1668.0624056689408, 1110.684, 1388.5990193830364, 841.3631824321101, 492.1374650840197, 1184.5397511638025, 1862.9229137819339, 1122.5363001867227, 1687.79201970507, 1429.297, 1352.9250000000002, 1265.5129469050848, 931.262209888035, 1446.9970000000003, 0], 
 [0, 0, 1569.7435168292332, 2372.6569483760627, 2346.338804158877, 2759.408000000001, 3088.268452527349, 3178.0964056689418, 2241.4875976812646, 2898.633019383037, 1285.57518243211, 2356.3494650840207, 2954.9737511638027, 3633.6599137819335, 2016.7367195183472, 2267.6773484827677, 3200.034, 2829.574597681264, 1046.940946905085, 1865.058807569299, 3253.6065976812642, 2095.7400000000007], 
 [0, 1569.7435168292332, 0, 1968.782465205296, 1561.8476358124115, 2209.7228316535343, 2612.5582841808828, 2774.2219224981754, 1837.6131145104976, 2348.9478510365707, 881.7006992613432, 1952.4749819132528, 2551.0992679930364, 3229.785430611167, 1232.245551171882, 697.9338316535344, 2796.159516829234, 2425.7001145104973, 1305.850463734318, 1461.1843243985325, 2849.732114510497, 1691.8655168292332], 
 [0, 2372.656948376062, 1968.7824652052955, 0, 1367.8317999868807, 547.8209958280038, 737.6974483553521, 805.4394572928785, 1516.7959483760626, 687.0460152110402, 1118.2801308081723, 817.3944134600821, 610.7397469918063, 1293.480909609937, 1027.0752485627852, 1592.3309680811326, 861.4079958280038, 1360.0159483760626, 1542.4298952811473, 1407.6951582640975, 1215.9829958280038, 862.6229483760626], 
 [0, 2346.3388041588764, 1561.8476358124115, 1367.8317999868807, 0, 1165.657804158877, 1568.4932566862253, 1755.6072656237518, 2480.716804158877, 1304.8828235419135, 1634.306986590987, 1829.4652692428965, 1813.235206949606, 2545.8189565363064, 550.4835236772244, 894.8578041588769, 2132.8538041588768, 2373.2668041588777, 2058.456751063962, 2213.790611728176, 2457.842804158877, 1601.5608041588769], 
 [0, 2759.4080000000004, 2209.7228316535347, 547.8209958280038, 1165.657804158877, 0, 626.8304525273484, 813.9444614648747, 2064.6169442040664, 139.22501938303637, 1505.0311824321104, 1365.2154092880858, 871.5724027907286, 1604.1561523774294, 1054.1913001867229, 1542.7330000000002, 1312.8429999999998, 1907.8369442040662, 1929.1809469050854, 1949.108209888035, 1653.9621539545312, 1249.374], 
 [0, 3088.2684525273485, 2612.558284180883, 737.6974483553521, 1568.4932566862253, 626.8304525273484, 0, 613.4389139922231, 2185.2496064818797, 766.0554719103848, 1833.8916349594585, 1488.756917611368, 671.0668553180769, 1403.650604904778, 1457.026752714071, 1945.5684525273482, 1163.3506064818794, 2028.4696064818795, 2258.0413994324335, 2079.057662415383, 1453.4566064818796, 1578.2344525273481], 
 [0, 3178.0964056689413, 2774.2219224981745, 805.4394572928785, 1755.6072656237518, 813.9444614648747, 613.4389139922231, 0, 2032.8699310851523, 953.1694808479111, 1923.7195881010512, 1518.3283961691718, 518.6871799213496, 790.2116909125546, 1644.1407616515976, 2132.682461464875, 1010.9709310851521, 1876.0899310851523, 2347.8693525740264, 1940.252140973187, 1023.831777130621, 1668.062405668941], 
 [0, 2241.487597681264, 1837.6131145104973, 1516.7959483760624, 2480.716804158878, 2064.616944204066, 2185.2496064818797, 2032.869931085152, 0, 2203.8419635871023, 955.9124152491543, 1103.7514650840199, 1514.1827511638023, 1836.8009137819336, 2001.6923001867235, 2535.5469461640314, 1386.884, 608.554, 1411.2605445863492, 656.516209888035, 1202.097, 1110.6840000000002], 
 [0, 2898.6330193830363, 2348.9478510365707, 687.0460152110401, 1304.8828235419132, 139.22501938303637, 766.0554719103848, 953.1694808479111, 2203.841963587103, 0, 1644.2562018151468, 1504.4404286711222, 1010.797422173765, 1743.3811717604658, 1193.4163195697593, 1681.9580193830363, 1452.0680193830362, 2047.0619635871026, 2068.405966288122, 2088.333229271071, 1793.1871733375674, 1388.5990193830364], 
 [0, 1285.57518243211, 881.7006992613432, 1118.2801308081725, 1634.3069865909868, 1505.03118243211, 1833.8916349594583, 1923.719588101051, 955.9124152491543, 1644.2562018151464, 0, 1101.9726475161299, 1700.5969335959126, 2379.2830962140433, 1304.7049019504573, 1579.6345309148776, 1945.65718243211, 1543.9994152491545, 455.3481293371951, 579.4836251371892, 1968.0314152491546, 841.3631824321101], 
 [0, 2356.349465084019, 1952.4749819132526, 817.3944134600821, 1829.4652692428965, 1365.2154092880858, 1488.756917611368, 1518.3283961691718, 1103.7514650840196, 1504.4404286711222, 1101.9726475161294, 0, 999.6412162478221, 1678.327378865953, 1350.4407652707423, 1915.6964847890897, 1244.7014650840197, 1268.4004650840197, 1526.1224119891044, 924.3296749720547, 1362.4724650840199, 492.13746508401965], 
 [0, 2954.9737511638023, 2551.0992679930364, 610.7397469918062, 1813.2352069496058, 871.5724027907286, 671.0668553180769, 518.6871799213496, 1514.1827511638025, 1010.797422173765, 1700.5969335959126, 999.6412162478222, 0, 855.558664945736, 1604.917051350525, 2170.1727708688722, 492.28375116380255, 1357.4027511638026, 2124.746698068888, 1421.5649610518374, 782.3897511638024, 1184.5397511638023], 
 [0, 3633.6599137819335, 3229.7854306111667, 1293.4809096099373, 2545.818956536307, 1604.1561523774294, 1403.6506049047775, 790.2116909125546, 1836.8009137819333, 1743.3811717604658, 2379.2830962140433, 1678.3273788659533, 855.5586649457362, 0, 2287.6582139686557, 2852.9139334870033, 1157.7129137819334, 1680.0209137819334, 2803.4328606870185, 2089.9161236699683, 634.7039137819336, 1862.9229137819334], 
 [0, 2016.7367195183474, 1232.245551171882, 1027.0752485627852, 550.4835236772244, 1054.1913001867226, 1457.0267527140713, 1644.1407616515976, 2001.692300186723, 1193.416319569759, 1304.7049019504573, 1350.4407652707423, 1604.917051350525, 2287.658213968656, 0, 565.2557195183474, 1855.5853001867226, 1894.2423001867228, 1728.8546664234323, 1822.2705100747578, 1978.8183001867228, 1122.5363001867229], 
 [0, 2267.6773484827677, 697.9338316535345, 1592.3309680811326, 894.857804158877, 1542.733, 1945.5684525273482, 2132.682461464875, 2535.5469461640328, 1681.9580193830363, 1579.6345309148778, 1915.6964847890897, 2170.1727708688727, 2852.9139334870033, 565.2557195183474, 0, 2420.84101970507, 2459.498019705071, 2003.784295387853, 2159.1181560520668, 2544.07401970507, 1687.79201970507], 
 [0, 3200.0340000000006, 2796.159516829234, 861.4079958280038, 2132.853804158877, 1312.843, 1163.3506064818794, 1010.9709310851523, 1386.8840000000002, 1452.0680193830365, 1945.6571824321104, 1244.7014650840194, 492.28375116380255, 1157.7129137819338, 1855.5853001867226, 2420.84101970507, 0, 1230.1040000000003, 2369.8069469050856, 1656.2902098880352, 718.8679999999999, 1429.297], 
 [0, 2829.5745976812636, 2425.7001145104978, 1360.0159483760622, 2373.2668041588777, 1907.836944204066, 2028.4696064818795, 1876.0899310851519, 608.554, 2047.0619635871024, 1543.999415249154, 1268.4004650840195, 1357.4027511638023, 1680.0209137819338, 1894.2423001867232, 2459.49801970507, 1230.104, 0, 1999.3475445863492, 1244.6032098880348, 1045.317, 1352.9249999999997], 
 [0, 1046.940946905085, 1305.850463734318, 1542.4298952811473, 2058.456751063962, 1929.1809469050852, 2258.0413994324335, 2347.869352574026, 1411.2605445863494, 2068.4059662881214, 455.348129337195, 1526.1224119891042, 2124.746698068888, 2803.4328606870185, 1728.8546664234325, 2003.7842953878524, 2369.806946905085, 1999.3475445863496, 0, 1034.8317544743843, 2423.379544586349, 1265.5129469050846], 
 [0, 1865.0588075692995, 1461.1843243985327, 1407.6951582640977, 2213.7906117281764, 1949.1082098880354, 2079.057662415383, 1940.2521409731876, 656.516209888035, 2088.3332292710716, 579.4836251371893, 924.3296749720548, 1421.5649610518378, 2089.9161236699692, 1822.2705100747585, 2159.118156052067, 1656.2902098880354, 1244.603209888035, 1034.8317544743845, 0, 1673.9902098880352, 931.2622098880352], 
 [0, 3253.606597681264, 2849.7321145104975, 1215.9829958280036, 2457.8428041588772, 1653.9621539545315, 1453.4566064818796, 1023.8317771306209, 1202.097, 1793.1871733375679, 1968.031415249154, 1362.4724650840196, 782.3897511638025, 634.7039137819335, 1978.8183001867228, 2544.07401970507, 718.8679999999999, 1045.317, 2423.379544586349, 1673.990209888035, 0, 1446.9969999999998], 
 [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
] 

# Demands (key: node, value: amount)
DEMAND = {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1, 10: 1, 11: 1, 12: 1, 13: 1, 14: 1, 15: 1, 16: 1, 17: 1, 18: 1, 19: 1, 20: 1}

# The matrix is transformed into a DiGraph
A = array(DISTANCES, dtype=[("cost", int)])
G = from_numpy_matrix(A, create_using=nx.DiGraph())

# The demands are stored as node attributes
set_node_attributes(G, values=DEMAND, name="demand")

# The depot is relabeled as Source and Sink
G = relabel_nodes(G, {0: "Source", 21: "Sink"})

from vrpy import VehicleRoutingProblem
prob = VehicleRoutingProblem(G, load_capacity = 5)
prob.num_vehicles = 5
prob.use_all_vehicles = True
prob.solve()
print(prob.best_value)

Hey @piaulous!
Just a had a quick look at this. I think it's when using use_all_vehicles (related to #93)
I think the dual for the upper_bound_vehicles constraint in the master should be +ve (or unconstrained for =).

With this change the solution is consistent and best value is 22749 (both using the LP and cspy).

@torressa I am not sure I understand. It seems strange to me that we have to manually change the sign. Maybe I am missing something !?

@Kuifje02 Maybe I'm confused too. Here's my reasoning anyway.

I did it because of the sign of the constraints, see e.g. here.
The dual from the set covering constraints are >= (or =) is also >= (unconstrained), so for the reduced cost, we have (from the vrpy docs)

where denotes the dual variable for the set covering constraint for node .

In the first case, the vehicle bound constraints are <= so the sign of the dual variables are also <=, so the reduced cost becomes

where is the dual variable for the vehicle bound constraint for vehicle type .

In the case when the vehicle bound constraints are = (if use_all_vehicles) the dual variables are also unconstrained, similarly to the = case for the set covering constraint, the sign in the reduced cost changes also

@torressa I am not 100% sure, but here is how I understand things:

There are two signs:

  • the sign in front of the dual variable, which comes from the sign of the coefficient in the corresponding constraint,
  • the sign of the dual variable (the range of the dual variable), which comes from the nature of the constraint (<=,==,>=).

I don't think either of those signs should ever be tweaked. They are a consequence of the initial equations.

Changing the sign in front of the dual variable means that the sign of the coefficient in the constraint has changed (and not the sign of the constraint), which is not true.

I will re think it through though ! definitely needs some double checking :)

Cool, yeh I think you are right.
There's definitely something a weird going on (with the version on dev) try running the different tests using the use_all_vehicles option (the unit test and the test for this issue) with different solvers.
For me, gurobi and the default solver, consistently give different best values and routes (or gurobi fails with infeasible subproblem).

The subproblem with exact=True is 100% deterministic, right ?

Should be, but just to be sure, do pricing_heuristic="Exact" as well.

With the current dev version, I get consistent results :

  • last LP : 21500, final MIP : 22045 with CPLEX
  • last LP : 21500, 22042 with CBC

(I do not have access to GUROBI at the moment).

I believe it is ok to have different results with different solvers. Indeed, they may be computing slightly different reduced costs , and so a different pool of routes, which in the end yields a difference in the final MIP. On the other hand, I think it is reassuring that both solvers return the same value for the last LP (21500).

I had some problems with CPLEX, as the way we forced the barrier algorithm without crossover was deprecated, but it is fixed now. PULP was returning an infeasible status, while it was just an issue with the options. I checked for GUROBI and the options still look valid (https://www.gurobi.com/documentation/9.1/refman/method.html, https://www.gurobi.com/documentation/9.1/refman/crossover.html). But a sanity check would be to comment out the gurobi options when you encounter an infeasible problem, to check if the default options (simplex and not barrier) work.

I've had some weirder results (working with version in dev as well 305c050). Will double check tomorrow.
LP is always 21500, but the final MIP is varies quite alot, both between repetitions (solving it 10 times) and with Gurobi/Cbc.

Results using cspy=True solving 10 times

  • Last LP : 21500, final MIP : 22112, 22550, 22119 with Gurobi
  • Last LP : 21500, final MIP : 22112, 22433, 22119, 22289, 22550, 22042 with CBC

Results using cspy=False solving 10 times

  • Last LP : 21500, final MIP : 22112, 22258, 22550, 22119 with Gurobi
  • Last LP : 21500, final MIP : 22112, 22042, 22550, 22119 with CBC

Could we do a quick test ? Using cspy=True, here are the two .lp files that I get for the final MIP:

  • master_cbc.lp is the final MIP when I use CBC (22042)
  • master_cplex is the final MIP when I use CPLEX (22045)

If I load master_cbc.lp and solve it with CPLEX, it does return 22042. Could you check that GUROBI returns the same solutions as well ?

If you want you can also post your lp files (activate line 49):

def solve(self, relax, time_limit):
self._solve(relax, time_limit)
logger.debug("master problem relax %s" % relax)
logger.debug("Status: %s" % pulp.LpStatus[self.prob.status])
logger.debug("Objective: %s" % pulp.value(self.prob.objective))
# self.prob.writeLP("master.lp")

and I can check that results are consistent.

If results are consistent, then it probably means that the differences come from the column generation.

I can confirm, both solutions (22042 and 22045) match using Gurobi.

Here are the lp_files.zip.
The filenames have the solution values found by the solvers. I have also verified them all using the gurobi cmd line.

I've installed CPLEX as well, and it seems a lot more stable around 22119 (still some fluctuations but not as much) than Cbc/Gurobi.

@torressa you are a powerful man, having access to all those solvers :D

So at least this eliminates the hypothesis that the differences come from the way the last MIP is solved. More likely, something in the column generation is not 100% deterministic. My guess is that the barrier algorithm without crossover is not 100% deterministic, and returns different columns, which have the same reduced cost, but which are different, yielding different results in the last MIP. I will investigate and see if I can confirm this. If this is the case, we could give the user perhaps more control over the solver parameters, with some disclaimers saying that some option may be faster/slower and possibly not 100% deterministic.

Haha academia FTW

Just going to push a test that solves using all solvers including OR-Tools (copied off your other repo).

Interestingly, the 22550 solution using VRPy gives the following routes

        1: ['Source', 10, 1, 18, 'Sink'],
        2: ['Source', 19, 8, 17, 'Sink'],
        3: ['Source', 2, 15, 4, 14, 'Sink'],
        4: ['Source', 11, 3, 5, 9, 6, 'Sink'],
        5: ['Source', 12, 7, 13, 20, 16, 'Sink']

Routes using OR-Tools (value 14379)

        '0','10','18','1', '0'
        '0','19','8','17', '0'
        '0','14','4','15','2', '0'
        '0','11','3','5','9','6', '0'
        '0','12','16','20','13','7', '0'

So at least this eliminates the hypothesis that the differences come from the way the last MIP is solved. More likely, something in the column generation is not 100% deterministic. My guess is that the barrier algorithm without crossover is not 100% deterministic, and returns different columns, which have the same reduced cost, but which are different, yielding different results in the last MIP. I will investigate and see if I can confirm this. If this is the case, we could give the user perhaps more control over the solver parameters, with some disclaimers saying that some option may be faster/slower and possibly not 100% deterministic.

That was what I thought, but disabling barrier everywhere isn't any better.
It's also really weird that solutions vary so much between machines

I am quite confused ! +1 for using or-tools for another reference. VRPy seems pretty far off.

Seems that use_all_vehicles is still the problem. Or did you guys achieve something in here? (:

I just realised the or-tools code was wrong. Actually, I'm finding it hard to enforce the solver to use all vehicles. This is being discussed in this thread google/or-tools#2067 (comment)
With the code I've just pushed 0785508, even after trying to enforce all vehicle being used, I'm getting

tests/test_issue97.py::TestIssue97::test_ortools Objective: 21248
Route for vehicle 0:
'0', 21 Load(0)
Distance of the route: 0m
Load of the route: 0

Route for vehicle 1:
'0','16','20','13','7','12', 21 Load(5)
Distance of the route: 5276.30753577964m
Load of the route: 5

Route for vehicle 2:
'0','14','4','15','2','10', 21 Load(5)
Distance of the route: 4988.875341369811m
Load of the route: 5

Route for vehicle 3:
'0','1','18','19','8','17', 21 Load(5)
Distance of the route: 6795.507911267503m
Load of the route: 5

Route for vehicle 4:
'0','9','5','6','3','11', 21 Load(5)
Distance of the route: 4201.883818192875m
Load of the route: 5

Total distance of all routes: 21262.57460660983m
Total load of all routes: 20
21262.57460660983

I think, we're actually OK, the behaviour varies with the solver as mentioned before (#97 (comment)), and as already mentioned in the disclaimer in the docs: we do not guarantee the optimal solution.
I think we can close this now.

@torressa thanks for looking into this one. I also think we can close this for now. What should we do with the amazingly named branch ?

Haha I think we can delete it, there's only that test which is pretty slow and requires ortools, so wouldn't want it in the standard build