thunlp/TensorFlow-Summarization

tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,3] = 37849 is not in [0, 30000)

Closed this issue · 6 comments

I don't understand why I am getting this error.

Here is the complete traceback:

Oct 04 14:15 saver.py[line:1455] INFO Restoring parameters from model/model.ckpt-300000
Traceback (most recent call last):
File "C:\Python3\lib\site-packages\tensorflow\python\client\session.py", line 1039, in _do_call
return fn(*args)
File "C:\Python3\lib\site-packages\tensorflow\python\client\session.py", line 1021, in _run_fn
status, run_metadata)
File "C:\Python3\lib\contextlib.py", line 66, in __exit__
next(self.gen)
File "C:\Python3\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,23] = 56047 is not in [0, 30000)
[[Node: seq2seq/encoder/embedding_lookup = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@seq2seq/encoder/embedding"], validate_indices=true, _device="/job:localhost/replica:0/task:0/cpu:0"](seq2seq/encoder/embedding/read, _recv_Placeholder_0)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "src/summarization.py", line 241, in <module>
tf.app.run()
File "C:\Python3\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "src/summarization.py", line 229, in main
decode()
File "src/summarization.py", line 214, in decode
sess, encoder_inputs, encoder_len, geneos=FLAGS.geneos)
File "C:\Users\vinayp\Desktop\BEP\textsum\src\bigru_model.py", line 227, in step_beam
outputs = session.run(output_feed, input_feed)
File "C:\Python3\lib\site-packages\tensorflow\python\client\session.py", line 778, in run
run_metadata_ptr)
File "C:\Python3\lib\site-packages\tensorflow\python\client\session.py", line 982, in _run
feed_dict_string, options, run_metadata)
File "C:\Python3\lib\site-packages\tensorflow\python\client\session.py", line 1032, in _do_run
target_list, options, run_metadata)
File "C:\Python3\lib\site-packages\tensorflow\python\client\session.py", line 1052, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,23] = 56047 is not in [0, 30000)
[[Node: seq2seq/encoder/embedding_lookup = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@seq2seq/encoder/embedding"], validate_indices=true, _device="/job:localhost/replica:0/task:0/cpu:0"](seq2seq/encoder/embedding/read, _recv_Placeholder_0)]]

Caused by op 'seq2seq/encoder/embedding_lookup', defined at:
File "src/summarization.py", line 241, in <module>
tf.app.run()
File "C:\Python3\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "src/summarization.py", line 229, in main
decode()
File "src/summarization.py", line 196, in decode
model = create_model(sess, True)
File "src/summarization.py", line 75, in create_model
dtype=dtype)
File "C:\Users\vinayp\Desktop\BEP\textsum\src\bigru_model.py", line 67, in __init__
encoder_emb, self.encoder_inputs)
File "C:\Python3\lib\site-packages\tensorflow\python\ops\embedding_ops.py", line 119, in embedding_lookup
params[0], ids, validate_indices=validate_indices, name=name))
File "C:\Python3\lib\site-packages\tensorflow\python\ops\embedding_ops.py", line 41, in _do_gather
params, ids, name=name, validate_indices=validate_indices)
File "C:\Python3\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 1207, in gather
validate_indices=validate_indices, name=name)
File "C:\Python3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 768, in apply_op
op_def=op_def)
File "C:\Python3\lib\site-packages\tensorflow\python\framework\ops.py", line 2336, in create_op
original_op=self._default_original_op, op_def=op_def)
File "C:\Python3\lib\site-packages\tensorflow\python\framework\ops.py", line 1228, in __init__
self._traceback = _extract_stack()

Also, I wanted to ask how long it took to train on a Titan X GPU.

tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,23] = 56047 is not in [0, 30000)

I think the input contains a word ID that exceeds the vocabulary size. Please make sure that 'doc_vocab_size' and 'sum_vocab_size' are both 30000 if you use the pre-trained models. The dict file contains more words, but only the first 30000 are used in the model.
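To illustrate the point about the vocabulary limit, here is a minimal sketch of remapping out-of-range IDs before they reach the embedding lookup. It is not code from this repository: the function name `remap_out_of_vocab` and the value `unk_id=3` are assumptions for illustration, so check which ID your dict file actually assigns to the unknown-word token.

```python
def remap_out_of_vocab(ids, vocab_size, unk_id):
    """Replace every ID outside [0, vocab_size) with unk_id."""
    if not 0 <= unk_id < vocab_size:
        raise ValueError("unk_id must itself be inside the vocabulary")
    return [i if 0 <= i < vocab_size else unk_id for i in ids]


# 30000 matches the limit in the error message; unk_id=3 is an assumption.
encoder_ids = [12, 845, 37849, 7, 56047]
print(remap_out_of_vocab(encoder_ids, vocab_size=30000, unk_id=3))
# prints [12, 845, 3, 7, 3]
```

If the IDs are already remapped when the vocabulary sizes are set correctly, a step like this should not be needed; it is mainly useful as a check on inputs that were encoded with a larger dictionary.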

Training took about 24 hours for 300,000 mini-batches (I don't remember exactly).

Yes, they are, but I am still getting this error.

Okay, I checked on GitHub, and it seems you have to set the parameters manually. Can you tell me how I can test it with your pre-trained models?

@vinayhpandya have you resolved this issue? I get the same error.

I am facing this issue during prediction. How can we ensure that the input words are in the vocabulary if we use the pre-trained model?

I am facing this issue during training when I run the following:

model.fit_generator(
    generator=generate_batch(X_train, y_train, batch_size=batch_size),
    steps_per_epoch=train_samples // batch_size,
    epochs=epochs,
    validation_data=generate_batch(X_test, y_test, batch_size=batch_size),
    validation_steps=val_samples // batch_size,
)
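The error message says that some ID fed to an embedding lookup is greater than or equal to the size of the embedding table. A minimal sketch for locating such IDs before training is shown below. It assumes each batch is a list of rows of integer IDs; the function name `find_out_of_range` and the sample data are hypothetical and not part of Keras or this repository.

```python
def find_out_of_range(batches, vocab_size):
    """Return (batch_no, row, col, token_id) for each ID outside [0, vocab_size)."""
    bad = []
    for batch_no, batch in enumerate(batches):
        for row, sequence in enumerate(batch):
            for col, token_id in enumerate(sequence):
                if not 0 <= token_id < vocab_size:
                    bad.append((batch_no, row, col, token_id))
    return bad


# Hypothetical data: two batches of already-encoded sequences.
batches = [
    [[1, 5, 9], [2, 4, 6]],
    [[3, 40000, 8], [7, 7, 7]],
]
for hit in find_out_of_range(batches, vocab_size=30000):
    print(hit)
# prints (1, 0, 1, 40000)
```

The row and column in each result correspond to the `indices[row,col]` pair reported in the error message. If this finds any ID, either the embedding layer's vocabulary size is too small for the tokenizer, or the data was encoded with a different dictionary than the one the model expects.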