Inconsistent results
JorenCoulier opened this issue · 1 comment
The resulting near-optimal value function differs between executions of the same problem, and the run-to-run differences are much larger than the small convergence threshold.
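For context, a generic textbook value iteration is fully deterministic once the transition and reward model are fixed, so differences of this size suggest randomness somewhere upstream (e.g. in how the problem or its transitions are generated). A minimal self-contained sketch illustrating this, not the repository's code:

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, threshold=1e-7):
    """Plain textbook value iteration on a small MDP.

    P[a, s, s'] are transition probabilities, R[a, s] are rewards.
    Generic illustration only, not the repository's implementation.
    """
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    while True:
        # Q[a, s] = R[a, s] + gamma * sum_s' P[a, s, s'] * V[s']
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < threshold:
            return V_new
        V = V_new

# A fixed 2-state, 2-action MDP.
P = np.array([[[0.8, 0.2], [0.1, 0.9]],
              [[0.5, 0.5], [0.3, 0.7]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

# With fixed inputs, two runs agree to within the threshold.
v1 = value_iteration(P, R)
v2 = value_iteration(P, R)
print(np.max(np.abs(v1 - v2)))  # effectively 0.0
```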
When running some of the available tests, 'test_transition' in 'test_mdp.py' failed with one of the following assertion errors:
```
AssertionError: 0.0 != 0.05 within 7 places (0.05 difference)
AssertionError: 0.9999999999999998 != 0.05 within 7 places (0.9499999999999997 difference)
AssertionError: 0.0 != 1.0 within 7 places (1.0 difference)
```
Which of these assertion errors is thrown appears to be completely random, varying from one run of the test to the next.
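One way to check whether unseeded randomness is the cause (a guess on my part, not something confirmed in the repository): seed the global RNGs before every test and see whether the failure becomes reproducible. A hypothetical `conftest.py`, assuming the tests use Python's `random` module or NumPy's global RNG:

```python
# conftest.py (hypothetical): seed the global RNGs before every test so
# failures become reproducible if they stem from unseeded randomness.
import random

import numpy as np
import pytest

@pytest.fixture(autouse=True)
def fixed_seed():
    random.seed(0)
    np.random.seed(0)
```

If the same assertion error then appears on every run, the flakiness comes from unseeded RNG calls rather than from the algorithm itself.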
The test in 'test_value_iteration.py' also fails, and the policy that causes its assertion error likewise differs between executions of the test.