run_test.sh problem
77281900000 opened this issue · 11 comments
Hi,
I ran demo.py twice. The first time it worked well and its accuracy was 1.0. But when I deleted the model and ran it again, it still worked, but the result was only about 0.8. I'm sure I didn't change the program. I even deleted the whole project and cloned it again with git, and the result is still about 0.8. Then I ran run_test.sh and got an error:
```
======================================================================
FAIL: test_four_clusters (__main__.TestIntegration)
Four clusters on vertices of a square.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/integration_test.py", line 99, in test_four_clusters
    self.assertEqual(1.0, accuracy)
AssertionError: 1.0 != 0.9

----------------------------------------------------------------------
Ran 1 test in 17.543s

FAILED (failures=1)
```
Something strange must be happening. Could anyone tell me what could cause this?
Thanks.
Thanks for reporting. We will look into it.
Hi @gaodihe , could you let us know your versions of python, numpy, and pytorch? That could help us identify the problem.
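The versions can be collected with a short snippet like this (a minimal sketch; it simply skips any package that is not installed):

```python
import importlib
import importlib.util
import sys

# Report the interpreter version first.
print("Python:", sys.version.split()[0])

# Then each relevant package, tolerating missing installs.
for pkg in ("numpy", "torch"):
    if importlib.util.find_spec(pkg) is None:
        print(pkg + ": not installed")
    else:
        mod = importlib.import_module(pkg)
        print(pkg + ":", mod.__version__)
```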
Tried a few times. Cannot replicate the problem yet :(
@gaodihe Thanks for your information!
I just created a new issue: #16
Once this is done, I may need your help to re-run the tests with a high verbosity value and share the logs with us.
Currently we don't have sufficient information to debug this.
@gaodihe Actually, even before I resolve that bug, could you share with me the full STDOUT information of your failing test?
```
....
----------------------------------------------------------------------
Ran 4 tests in 0.001s

OK
F
======================================================================
FAIL: test_four_clusters (__main__.TestIntegration)
Four clusters on vertices of a square.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/integration_test.py", line 99, in test_four_clusters
    self.assertEqual(1.0, accuracy)
AssertionError: 1.0 != 0.9

----------------------------------------------------------------------
Ran 1 test in 15.784s

FAILED (failures=1)
```
```
gaodihe@gaodihe-All-Series:~/PycharmProjects/uis-rnn$ ./run_tests.sh >1.txt
....
----------------------------------------------------------------------
Ran 4 tests in 0.001s

OK
^CTraceback (most recent call last):
  File "tests/integration_test.py", line 115, in <module>
    unittest.main()
  File "/usr/lib/python3.5/unittest/main.py", line 94, in __init__
    self.runTests()
  File "/usr/lib/python3.5/unittest/main.py", line 255, in runTests
    self.result = testRunner.run(self.test)
  File "/usr/lib/python3.5/unittest/runner.py", line 176, in run
    test(result)
  File "/usr/lib/python3.5/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python3.5/unittest/suite.py", line 122, in run
    test(result)
  File "/usr/lib/python3.5/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python3.5/unittest/suite.py", line 122, in run
    test(result)
  File "/usr/lib/python3.5/unittest/case.py", line 648, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib/python3.5/unittest/case.py", line 600, in run
    testMethod()
  File "tests/integration_test.py", line 89, in test_four_clusters
    model.fit(train_sequence, train_cluster_id, training_args)
  File "/home/gaodihe/PycharmProjects/uis-rnn/model/uisrnn.py", line 250, in fit
    mean, _ = self.rnn_model(packed_train_sequence, hidden)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/gaodihe/PycharmProjects/uis-rnn/model/uisrnn.py", line 45, in forward
    output_seq, hidden = self.gru(input_seq, hidden)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/rnn.py", line 192, in forward
    output, hidden = func(input, self.all_weights, hx, batch_sizes)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py", line 323, in forward
    return func(input, *fargs, **fkwargs)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/_functions/rnn.py", line 287, in forward
    dropout_ts)
KeyboardInterrupt
```
```
gaodihe@gaodihe-All-Series:~/PycharmProjects/uis-rnn$ ./run_tests.sh
~/PycharmProjects/uis-rnn ~/PycharmProjects/uis-rnn
Running tests in tests/utils_test.py
....
----------------------------------------------------------------------
Ran 4 tests in 0.001s

OK
Running tests in tests/integration_test.py
Iter: 0    Training Loss: 4.1982
    Negative Log Likelihood: 6.4964    Sigma2 Prior: -2.2984    Regularization: 0.0002
Iter: 10    Training Loss: -0.1935
    Negative Log Likelihood: 1.5341    Sigma2 Prior: -1.7278    Regularization: 0.0002
Iter: 20    Training Loss: -0.8121
    Negative Log Likelihood: 0.7201    Sigma2 Prior: -1.5324    Regularization: 0.0002
Iter: 30    Training Loss: -1.0152
    Negative Log Likelihood: 0.4823    Sigma2 Prior: -1.4977    Regularization: 0.0002
Iter: 40    Training Loss: -1.0503
    Negative Log Likelihood: 0.4961    Sigma2 Prior: -1.5466    Regularization: 0.0002
Changing learning rate to: 0.005
Iter: 50    Training Loss: -1.3244
    Negative Log Likelihood: 0.3421    Sigma2 Prior: -1.6667    Regularization: 0.0002
Iter: 60    Training Loss: -1.5849
    Negative Log Likelihood: 0.3386    Sigma2 Prior: -1.9238    Regularization: 0.0002
Iter: 70    Training Loss: -2.2978
    Negative Log Likelihood: 1.0629    Sigma2 Prior: -3.3610    Regularization: 0.0002
Iter: 80    Training Loss: -2.6359
    Negative Log Likelihood: 0.4483    Sigma2 Prior: -3.0844    Regularization: 0.0002
Iter: 90    Training Loss: -2.3275
    Negative Log Likelihood: 0.1928    Sigma2 Prior: -2.5205    Regularization: 0.0002
Changing learning rate to: 0.0025
Iter: 100    Training Loss: -2.8464
    Negative Log Likelihood: 0.1297    Sigma2 Prior: -2.9763    Regularization: 0.0003
Iter: 110    Training Loss: -2.5952
    Negative Log Likelihood: 0.0849    Sigma2 Prior: -2.6804    Regularization: 0.0003
Iter: 120    Training Loss: -2.6835
    Negative Log Likelihood: 0.0827    Sigma2 Prior: -2.7664    Regularization: 0.0003
Iter: 130    Training Loss: -3.3645
    Negative Log Likelihood: 0.9887    Sigma2 Prior: -4.3535    Regularization: 0.0003
Iter: 140    Training Loss: -3.6595
    Negative Log Likelihood: 0.1904    Sigma2 Prior: -3.8502    Regularization: 0.0003
Changing learning rate to: 0.00125
Iter: 150    Training Loss: -3.9500
    Negative Log Likelihood: 0.1933    Sigma2 Prior: -4.1435    Regularization: 0.0003
Iter: 160    Training Loss: -3.5048
    Negative Log Likelihood: 0.1096    Sigma2 Prior: -3.6147    Regularization: 0.0003
Iter: 170    Training Loss: -3.2753
    Negative Log Likelihood: 0.4618    Sigma2 Prior: -3.7374    Regularization: 0.0003
Iter: 180    Training Loss: -3.5798
    Negative Log Likelihood: 0.4441    Sigma2 Prior: -4.0242    Regularization: 0.0003
Iter: 190    Training Loss: -3.4913
    Negative Log Likelihood: 0.4049    Sigma2 Prior: -3.8965    Regularization: 0.0003
Done training with 200 iterations
F
======================================================================
FAIL: test_four_clusters (__main__.TestIntegration)
Four clusters on vertices of a square.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests/integration_test.py", line 99, in test_four_clusters
    self.assertEqual(1.0, accuracy)
AssertionError: 1.0 != 0.9

----------------------------------------------------------------------
Ran 1 test in 16.022s
```
My initial guess is that the network simply didn't converge to a good point at the end of training.
0.9 is still a high accuracy, though we were expecting 1.0.
A few things to try to validate this:
- In integration_test.py, change `training_args.train_iteration = 200` to a larger value like 300, to see if you get accuracy = 1.0.
- In the setUp() function of integration_test.py, change the values of the random seeds, to see if the accuracy becomes 1.0.
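The seed-changing idea can be sketched like this (the helper name `set_seeds` and the seed value are illustrative, not the test's actual code; with PyTorch available, `torch.manual_seed(seed)` would typically be set here as well):

```python
import random

import numpy as np

def set_seeds(seed):
    """Fix the random seeds so a training/test run is reproducible.

    Illustrative helper: a real setUp() would also seed PyTorch via
    torch.manual_seed(seed).
    """
    random.seed(seed)
    np.random.seed(seed)

set_seeds(123)  # if training gets stuck, try a different value here
print(np.random.rand(3))
```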
In general this issue could be avoided by training multiple networks in parallel and picking the best one.
But there is also space for us to improve the training process and the default arguments to make the training more robust and efficient.
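The "train several, keep the best" idea can be sketched as follows (`train_and_eval` is a hypothetical stand-in for a real fit-then-evaluate call, not the uis-rnn API; the accuracies it returns are simulated):

```python
import random

def train_and_eval(seed):
    """Stand-in for one training run: seed the RNG, train, return accuracy.

    Here the accuracy is simulated as a seed-dependent value in [0.8, 1.0)
    to mimic run-to-run variation in training outcomes.
    """
    rng = random.Random(seed)
    return 0.8 + 0.2 * rng.random()

def best_of_n(seeds):
    """Train one model per seed and keep the most accurate one."""
    results = {seed: train_and_eval(seed) for seed in seeds}
    best_seed = max(results, key=results.get)
    return best_seed, results[best_seed]

seed, acc = best_of_n([0, 1, 2, 3])
print("best seed:", seed, "accuracy:", round(acc, 3))
```

In the real setting each run would be an independent `model.fit(...)` with a different seed, possibly launched in parallel, with the winner chosen on a validation metric.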
@AnzCol Please take a look at this to see if you have any thoughts.
Yes, I have tried your suggestions. Both of them solved the problem. Thanks for your help.
@gaodihe Thanks for trying it. It's very helpful!
It basically validated that the failure is due to unsuccessful training.
In practice we usually run many more training steps than the unit/integration tests do. The purpose of the tests is to validate code correctness, and we often run them after a small code change, so we prefer fewer steps to keep them fast rather than stable.