Problem with the Beam_width config
DRMALEK opened this issue · 8 comments
Hi,
After I changed the beam width to 3 because accuracy was low on the Funcom dataset, I started to get the following error:
```
File "code2seq.py", line 29, in <module>
    model.train()
  File "/data/malekbaba_data/codes/code2seq_modified/modelrunner.py", line 167, in train
    results, precision, recall, f1, rouge = self.evaluate(release_test=False)
  File "/data/malekbaba_data/codes/code2seq_modified/modelrunner.py", line 238, in evaluate
    outputs, final_states = self.model.run_decoder(batched_contexts, input_tensors, is_training=False)
  File "/data/malekbaba_data/codes/code2seq_modified/model.py", line 134, in run_decoder
    is_training=is_training)
  File "/data/malekbaba_data/codes/code2seq_modified/model.py", line 206, in decode_outputs
    contexts_sum = tf.reduce_sum(batched_contexts * tf.expand_dims(valid_mask, -1),
  File "/data/malekbaba_data/codes/code2seq_modified/venv/lib64/python3.6/site-packages/tensorflow_core/python/ops/math_ops.py", line 902, in binary_op_wrapp$
    return func(x, y, name=name)
  File "/data/malekbaba_data/codes/code2seq_modified/venv/lib64/python3.6/site-packages/tensorflow_core/python/ops/math_ops.py", line 1201, in _mul_dispatch
    return gen_math_ops.mul(x, y, name=name)
  File "/data/malekbaba_data/codes/code2seq_modified/venv/lib64/python3.6/site-packages/tensorflow_core/python/ops/gen_math_ops.py", line 6122, in mul
    _ops.raise_from_not_ok_status(e, name)
  File "/data/malekbaba_data/codes/code2seq_modified/venv/lib64/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 6606, in raise_from_no$
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [384,100,512] vs. [128,100,1] [Op:Mul] name: mul/
```
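For what it's worth, the incompatible shapes look like a beam-search tiling mismatch: with a batch size of 128 and a beam width of 3, the decoder memory is tiled to 128 × 3 = 384 rows, while `valid_mask` keeps the original batch dimension of 128. A minimal NumPy sketch of the failure and of the usual fix of tiling the mask the same way (the variable names here are illustrative, not the repo's actual code):

```python
import numpy as np

batch_size, max_contexts, hidden, beam_width = 4, 5, 8, 3

# Encoder memory after beam-search tiling: (batch * beam, contexts, hidden)
batched_contexts = np.random.rand(batch_size * beam_width, max_contexts, hidden)
# Mask built from the original, untiled batch: (batch, contexts, 1)
valid_mask = np.ones((batch_size, max_contexts, 1))

try:
    _ = batched_contexts * valid_mask  # same mismatch as in the traceback
except ValueError as e:
    print("broadcast error:", e)

# Fix: repeat each batch row beam_width times so the mask matches the memory
tiled_mask = np.repeat(valid_mask, beam_width, axis=0)  # (batch * beam, contexts, 1)
masked = batched_contexts * tiled_mask
print(masked.shape)  # (12, 5, 8)
```

In TF1-style beam search this repeat-per-row tiling is what `tf.contrib.seq2seq.tile_batch` does to the encoder outputs, so any mask multiplied against the tiled memory has to be tiled with it.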
My config file is as follows:
config.NUM_EPOCHS = 3000
config.SAVE_EVERY_EPOCHS = 1
config.PATIENCE = 10
config.BATCH_SIZE = 128
config.READER_NUM_PARALLEL_BATCHES = 16
config.SHUFFLE_BUFFER_SIZE = 10000
config.CSV_BUFFER_SIZE = 100 * 1024 * 1024 # 100 MB
config.MAX_CONTEXTS = 100
config.SUBTOKENS_VOCAB_MAX_SIZE = 190000
config.TARGET_VOCAB_MAX_SIZE = 27000
config.EMBEDDINGS_SIZE = 128 * 4
config.RNN_SIZE = 128 * 4 # Two LSTMs to embed paths, each of size 256
config.DECODER_SIZE = 512
config.NUM_DECODER_LAYERS = 2
config.MAX_PATH_LENGTH = 8 + 1
config.MAX_NAME_PARTS = 5 # Maximum number of subtokens in a token
config.MAX_TARGET_PARTS = 30 # Maximum number of words in a comment (30 in the DeepCom paper)
config.EMBEDDINGS_DROPOUT_KEEP_PROB = 0.75
config.RNN_DROPOUT_KEEP_PROB = 0.5
config.BIRNN = True
config.RANDOM_CONTEXTS = True
config.BEAM_WIDTH = 3
config.USE_MOMENTUM = False
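As a sanity check, the shapes in the traceback line up with this config: 384 = BATCH_SIZE (128) × BEAM_WIDTH (3), 100 = MAX_CONTEXTS, and 512 = DECODER_SIZE (and EMBEDDINGS_SIZE). A quick cross-check:

```python
# Cross-check the traceback shapes against the config values above
BATCH_SIZE, BEAM_WIDTH = 128, 3
MAX_CONTEXTS = 100
DECODER_SIZE = 512

# Traceback: Incompatible shapes: [384, 100, 512] vs. [128, 100, 1]
assert BATCH_SIZE * BEAM_WIDTH == 384  # rows of the beam-tiled memory
assert MAX_CONTEXTS == 100             # context dimension matches
assert DECODER_SIZE == 512             # hidden dimension matches
print("traceback shapes are consistent with BEAM_WIDTH tiling")
```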
Hi @DRMALEK ,
Does this config work when you use our datasets and models, without modifications?
Uri
Actually, I trained your model on the DeepCom dataset and everything went fine (but without the beam_width change), so you are right: I should first retrain it with the beam_width change there, and then try to train it on the Funcom dataset.
Sorry for my beginner mistakes!
As far as I understand, the beam width config is used in the training phase, though not in the original implementation (https://github.com/tech-srl/code2seq/) but in the https://github.com/Kolkir/code2seq implementation. Unfortunately, the issues section is not enabled there 😞
https://github.com/Kolkir/code2seq this implementation
No problem. I will try to contact the repo author and ask them if they can enable the issues section in their repo; in the meantime, I will use your implementation.
Thanks for your understanding.