Error running with TF 0.12.1, Python 3.4.3, Ubuntu 14.04
dab3-2014 opened this issue · 5 comments
Hi. I installed all the required dependencies with pip3, made sure I can import them in Python 3 without issues, and downloaded the dataset as described. Now running: $ python3 udc_train.py
produces the following error:
InvalidArgumentError (see above for traceback): Incompatible shapes: [80,1] vs. [160,1]
[[Node: prediction/logistic_loss/mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/gpu:0"](prediction/Squeeze, prediction/ToFloat)]]
[[Node: recall_at_2/ToInt64/_91 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_217_recall_at_2/ToInt64", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Any idea why this is happening? Any idea how to fix it?
Thanks.
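For anyone puzzling over the message itself: the failing node is an elementwise multiply inside the logistic loss, and elementwise ops require both operands to have identical shapes along each dimension. A minimal NumPy sketch (the shapes are taken from the traceback; the actual tensors in the model are TF tensors, not NumPy arrays) reproduces the same kind of complaint:

```python
import numpy as np

# Elementwise multiply of a [80, 1] tensor with a [160, 1] tensor,
# mirroring prediction/logistic_loss/mul from the traceback.
logits = np.zeros((80, 1), dtype=np.float32)   # what the model produced
labels = np.zeros((160, 1), dtype=np.float32)  # what the input pipeline fed in

try:
    logits * labels
except ValueError as e:
    # Neither leading dimension is 1, so broadcasting cannot reconcile them.
    print("shape mismatch:", e)
```

So the root cause is that the model's logits and the labels coming out of the input pipeline disagree on batch dimension, not anything GPU-specific.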
I have the same error running with TF 0.12.1, Python 2.7, Ubuntu 16.04, GTX 1070, CUDA 8.0, cuDNN 5.1.5. I also ran it on AWS (g2.2xlarge) and hit the same problem, so it looks like an error inside TensorFlow itself. But this issue was already closed. #15
Commenting out the monitors seems to fix the issue. Can anyone confirm?
@pavelromashkin Yes, this helped me. Probably because I'm using TF 1.0, where the monitors are deprecated.
Any advice on how to reimplement the streaming evaluation in TF 1.0 without monitors?
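One option is to accumulate the metric yourself across evaluation batches instead of relying on the deprecated monitors. Below is a hypothetical pure-Python sketch of streaming recall@k (the `StreamingRecallAtK` class and the assumption that index 0 holds the ground-truth candidate are mine, not from the repo):

```python
class StreamingRecallAtK:
    """Accumulate recall@k over evaluation batches.

    Assumes each example is a list of candidate scores where index 0
    is the ground-truth response and the rest are distractors.
    """

    def __init__(self, k):
        self.k = k
        self.hits = 0
        self.total = 0

    def update(self, batch_scores):
        for scores in batch_scores:
            # Rank candidate indices by score, highest first.
            ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
            if 0 in ranked[: self.k]:
                self.hits += 1
            self.total += 1

    def result(self):
        return self.hits / self.total


metric = StreamingRecallAtK(k=2)
metric.update([[0.9, 0.1, 0.3], [0.2, 0.8, 0.7]])
print(metric.result())  # first example ranks its truth in the top 2, second does not -> 0.5
```

The same pattern (running counters updated per batch, a final ratio) is what the `tf.metrics.*` streaming ops implement internally, so this can also be wired into a TF 1.0 eval loop directly.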
I am using tensorflow-1.12. In udc_hparams.py, I changed the following code:
tf.flags.DEFINE_integer("batch_size", 128, "Batch size during training")
tf.flags.DEFINE_integer("eval_batch_size", 16, "Batch size during evaluation")
to the following, and it worked:
tf.flags.DEFINE_integer("batch_size", 64, "Batch size during training")
tf.flags.DEFINE_integer("eval_batch_size", 8, "Batch size during evaluation")
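Worth noting: halving both flags keeps the train/eval batch ratio unchanged, so the workaround seems to be about absolute batch sizes lining up with the input pipeline on some setups rather than about the ratio. A quick arithmetic check (values taken from the comment above):

```python
# Original flag values vs. the suggested workaround.
original = {"batch_size": 128, "eval_batch_size": 16}
suggested = {"batch_size": 64, "eval_batch_size": 8}

for name, cfg in [("original", original), ("suggested", suggested)]:
    ratio = cfg["batch_size"] // cfg["eval_batch_size"]
    print(name, cfg, "train/eval ratio:", ratio)  # ratio is 8 in both cases
```

Why the smaller sizes avoid the mismatch on TF 1.12 is not explained in the thread; this only shows that the proportion between the two flags is preserved.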
I am using tensorflow=1.12.0 on Windows 7 64-bit, and I am running into this error during training. Thanks.
InvalidArgumentError (see above for traceback): Incompatible shapes: [20,1] vs. [80,1]
[[node prediction/logistic_loss/mul (defined at F:\09.Practice\chatbot-deeplearning-retrieval\models\dual_encoder.py:87) = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](prediction/Squeeze, prediction/ToFloat)]]