jiegzhan/multi-class-text-classification-cnn-rnn

train.py fails with best_model.ckpt not found

hilmij opened this issue · 5 comments

Hi,
I am trying the example with Python 3.6.1 and TensorFlow 1.2.1 on Windows 10.

I am getting the following error when I run "python train.py ./data/train.csv.zip ./training_config.json".

CRITICAL:root:Saved model ./checkpoints_1501717661/model-2700 at step 2700
CRITICAL:root:Best accuracy 0.997291996203733 at step 2700
CRITICAL:root:Training is complete, testing the best model on x_test and y_test
INFO:tensorflow:Restoring parameters from ./checkpoints_1501717661/model-2700
INFO:tensorflow:Restoring parameters from ./checkpoints_1501717661/model-2700
CRITICAL:root:Accuracy on test set: 0.9972894482090997
Traceback (most recent call last):
  File "train.py", line 161, in <module>
    train_cnn_rnn()
  File "train.py", line 151, in train_cnn_rnn
    os.rename(path, trained_dir + 'best_model.ckpt')
FileNotFoundError: [WinError 2] The system cannot find the file specified: './checkpoints_1501717661/model-2700' -> './trained_results_1501717661/best_model.ckpt'

I have run train.py a couple of times now and get the same error each time. Please help me solve this issue.

Thanks,
Hilmi.

Thanks for the reply.
I did not change any files. I am just trying to run it as is.

I tested the same on Ubuntu (Python 3.5.2, TensorFlow 1.2.1) and got the same result.

Please let me know if you need more info to figure out this issue.

INFO:tensorflow:Restoring parameters from ./checkpoints_1501820994/model-2500
INFO:tensorflow:Restoring parameters from ./checkpoints_1501820994/model-2500
CRITICAL:root:Accuracy on test set: 0.9972780593360288
Traceback (most recent call last):
  File "train.py", line 161, in <module>
    train_cnn_rnn()
  File "train.py", line 151, in train_cnn_rnn
    os.rename(path, trained_dir + 'best_model.ckpt')
FileNotFoundError: [Errno 2] No such file or directory: './checkpoints_1501820994/model-2500' -> './trained_results_1501820994/best_model.ckpt'

https://github.com/jiegzhan/multi-class-text-classification-cnn-rnn/blob/master/train.py#L141

Comment out all the code after line 141, then take a look at the checkpoint directory.

Then change the code after line 141 accordingly. Essentially, you want to save the best model to a file so it can be used for prediction later.
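
For the "take a look at the checkpoint directory" step, a small sketch like the one below might help. It only lists the files on disk whose names start with the checkpoint path from the log. `list_checkpoint_files` is a hypothetical helper, not part of the repo, and the idea that the saver wrote more than one file under that prefix is an assumption that this listing is meant to confirm or rule out.

```python
import glob


def list_checkpoint_files(prefix):
    """Return the sorted paths of all files whose path starts with `prefix`.

    `prefix` is the path printed in the log, for example
    './checkpoints_1501717661/model-2700' (hypothetical usage).
    """
    # glob.escape guards against special characters such as '[' in the path.
    return sorted(glob.glob(glob.escape(prefix) + '*'))


# Example call (uncomment and adjust the path to your own run):
# print(list_checkpoint_files('./checkpoints_1501717661/model-2700'))
```

If the list is empty, nothing was saved under that prefix. If it shows names with extra suffixes, those suffixes are what the rename on line 151 has to account for.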

Fix line 151:
from: os.rename(path, trained_dir + 'best_model.ckpt')
to: os.rename(path + '.ckpt', trained_dir + 'best_model.ckpt')
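
If the checkpoint directory turns out to hold several files sharing the same prefix (TensorFlow 1.x savers usually write `.index`, `.meta` and `.data-*` files rather than one file with the exact checkpoint name), a single `os.rename` will not cover them. Below is a minimal sketch of moving all of them at once. `move_checkpoint` is a hypothetical helper, not code from the repo, and it assumes every file belonging to the checkpoint is named either exactly as the prefix or as the prefix followed by a dot and a suffix.

```python
import glob
import os
import shutil


def move_checkpoint(src_prefix, dst_prefix):
    """Move the file `src_prefix` and every `src_prefix.<suffix>` file so that
    they are named `dst_prefix` and `dst_prefix.<suffix>` instead.

    Returns the sorted list of new paths; raises FileNotFoundError if no
    matching file exists.
    """
    candidates = [src_prefix] + glob.glob(glob.escape(src_prefix) + '.*')
    base_len = len(os.path.basename(src_prefix))
    moved = []
    for src in candidates:
        if not os.path.isfile(src):
            continue
        # Keep whatever follows the prefix, e.g. '.index' or '.meta'.
        suffix = os.path.basename(src)[base_len:]
        dst = dst_prefix + suffix
        os.makedirs(os.path.dirname(dst) or '.', exist_ok=True)
        shutil.move(src, dst)
        moved.append(dst)
    if not moved:
        raise FileNotFoundError('no checkpoint files found for ' + src_prefix)
    return sorted(moved)


# Hypothetical replacement for line 151:
# move_checkpoint(path, trained_dir + 'best_model.ckpt')
```

This only moves the files. It does not update the `checkpoint` bookkeeping file that TensorFlow keeps in the original directory, so restore with the explicit new prefix.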