[Problem] training step
Jaehyun0818 opened this issue · 3 comments
Hi,
I followed the instructions and ran the training with python train.py, using the default setting max_epoch=500.
At the beginning of the training step, the following error appears:
2019-02-02 15:50:29.682610: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2019-02-02 15:50:29.907778: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.645
pciBusID: 0000:03:00.0
totalMemory: 10.91GiB freeMemory: 10.37GiB
2019-02-02 15:50:29.907851: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1)
--- Get model and loss
Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "train.py", line 164, in fill_queues
    stack_train.put(p.get())
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
ValueError: probabilities do not sum to 1
--- Get training operator
2019-02-02 15:50:37.119749: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1195] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:03:00.0, compute capability: 6.1)
('in epoch', 0)
('max_epoch', 500)
**** EPOCH 000 ****
2019-02-02 15:50:41.417430
Progress: [----------] 0.0%
Also, the progress percentage has not increased for a whole day; it stays at 0.0%.
Could anyone please advise what the error above means and what I should do?
Thanks.
Jaehyun
Hi, Jaehyun. Did you solve the problem?
Hey, did you figure it out? I'm facing the same problem too.
Yes. In my case, I solved it by switching the Python version to Python 3.5.
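For anyone landing here later: "probabilities do not sum to 1" is the message numpy.random.choice raises when its p argument does not add up to 1. One plausible reason the Python 3.5 switch helps (an assumption, since the thread does not show the relevant lines of train.py) is that under Python 2.7 the / operator on integers performs floor division, so weights computed as count / total can collapse to zeros. The traceback also shows the fill_queues worker process dying, which would likely explain why the progress bar stays at 0.0%. Below is a minimal sketch of the failure and a version-independent fix; the counts array and the normalize_weights helper are hypothetical names for illustration, not taken from the repository.

```python
import numpy as np


def normalize_weights(weights):
    """Hypothetical helper: return weights as float64 probabilities summing to 1."""
    w = np.asarray(weights, dtype=np.float64)
    total = w.sum()
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return w / total


# Hypothetical per-class counts; not taken from train.py.
counts = np.array([3, 1, 4])

# Floor division reproduces what `/` does to integers under Python 2.
bad_p = counts // counts.sum()

try:
    np.random.choice(len(counts), p=bad_p)
    raised = False
except ValueError as err:
    raised = True
    print(err)

# Casting to float before dividing behaves the same on Python 2 and 3.
good_p = normalize_weights(counts)
sample = np.random.choice(len(counts), p=good_p)
```

If staying on Python 2.7 is required, adding from __future__ import division at the top of the script, or casting to float before dividing as above, may have the same effect, assuming integer division really is the cause in your setup.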