tech-srl/code2seq

Problem with the Beam_width config

DRMALEK opened this issue · 8 comments

Hi,

After I changed the beam width to 3 because the accuracy was low on the Funcom dataset, I started to get the following error:

    File "code2seq.py", line 29, in <module>
        model.train()
    File "/data/malekbaba_data/codes/code2seq_modified/modelrunner.py", line 167, in train
        results, precision, recall, f1, rouge = self.evaluate(release_test=False)
    File "/data/malekbaba_data/codes/code2seq_modified/modelrunner.py", line 238, in evaluate
        outputs, final_states = self.model.run_decoder(batched_contexts, input_tensors, is_training=False)
    File "/data/malekbaba_data/codes/code2seq_modified/model.py", line 134, in run_decoder
        is_training=is_training)
    File "/data/malekbaba_data/codes/code2seq_modified/model.py", line 206, in decode_outputs
        contexts_sum = tf.reduce_sum(batched_contexts * tf.expand_dims(valid_mask, -1),
    File "/data/malekbaba_data/codes/code2seq_modified/venv/lib64/python3.6/site-packages/tensorflow_core/python/ops/math_ops.py", line 902, in binary_op_wrapp$
        return func(x, y, name=name)
    File "/data/malekbaba_data/codes/code2seq_modified/venv/lib64/python3.6/site-packages/tensorflow_core/python/ops/math_ops.py", line 1201, in _mul_dispatch
        return gen_math_ops.mul(x, y, name=name)
    File "/data/malekbaba_data/codes/code2seq_modified/venv/lib64/python3.6/site-packages/tensorflow_core/python/ops/gen_math_ops.py", line 6122, in mul
        _ops.raise_from_not_ok_status(e, name)
    File "/data/malekbaba_data/codes/code2seq_modified/venv/lib64/python3.6/site-packages/tensorflow_core/python/framework/ops.py", line 6606, in raise_from_no$
        six.raise_from(core._status_to_exception(e.code, message), None)
    File "<string>", line 3, in raise_from
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [384,100,512] vs. [128,100,1] [Op:Mul] name: mul/
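Note that 384 = 128 × 3, i.e. BATCH_SIZE × BEAM_WIDTH, which suggests the decoder inputs have been tiled for beam search while `valid_mask` has not. Below is a minimal NumPy sketch of that mismatch (the shapes are taken from the traceback; whether this modified code uses something like TF1's `tf.contrib.seq2seq.tile_batch` for the tiling is an assumption on my part):

```python
import numpy as np

BATCH_SIZE, MAX_CONTEXTS, DECODER_SIZE, BEAM_WIDTH = 128, 100, 512, 3

# During beam search the decoder operates on a batch tiled by BEAM_WIDTH:
# 128 * 3 = 384 rows, matching the [384, 100, 512] shape in the traceback.
batched_contexts = np.zeros((BATCH_SIZE * BEAM_WIDTH, MAX_CONTEXTS, DECODER_SIZE))

# ...but the validity mask is still built for the untiled batch: [128, 100, 1].
valid_mask = np.ones((BATCH_SIZE, MAX_CONTEXTS, 1))

try:
    batched_contexts * valid_mask  # broadcasting fails: 384 vs. 128 on axis 0
except ValueError as e:
    print(e)

# Repeating each batch entry BEAM_WIDTH times (the effect of tile_batch in
# TF1's seq2seq beam search) makes the shapes line up again:
tiled_mask = np.repeat(valid_mask, BEAM_WIDTH, axis=0)  # shape (384, 100, 1)
result = batched_contexts * tiled_mask                  # broadcasts cleanly
print(result.shape)
```

So any tensor multiplied against the beam-tiled contexts would need to be tiled the same way; the sketch above only illustrates the arithmetic, not where the fix belongs in the modified code.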


My config file is as follows:

    config.NUM_EPOCHS = 3000 
    config.SAVE_EVERY_EPOCHS = 1 
    config.PATIENCE = 10
    config.BATCH_SIZE = 128
    config.READER_NUM_PARALLEL_BATCHES = 16
    config.SHUFFLE_BUFFER_SIZE = 10000
    config.CSV_BUFFER_SIZE = 100 * 1024 * 1024  # 100 MB
    config.MAX_CONTEXTS = 100
    config.SUBTOKENS_VOCAB_MAX_SIZE = 190000
    config.TARGET_VOCAB_MAX_SIZE = 27000
    config.EMBEDDINGS_SIZE = 128 * 4
    config.RNN_SIZE = 128 * 4  # Two LSTMs to embed paths, each of size 128
    config.DECODER_SIZE = 512
    config.NUM_DECODER_LAYERS = 2
    config.MAX_PATH_LENGTH = 8 + 1
    config.MAX_NAME_PARTS = 5             # Maximum number of subtokens in a token
    config.MAX_TARGET_PARTS = 30        # Maximum num of words in a comment (30 in deepcom paper)
    config.EMBEDDINGS_DROPOUT_KEEP_PROB = 0.75
    config.RNN_DROPOUT_KEEP_PROB = 0.5
    config.BIRNN = True
    config.RANDOM_CONTEXTS = True
    config.BEAM_WIDTH = 3
    config.USE_MOMENTUM = False 

Hi @DRMALEK ,
Does this config work when you use our datasets and models, without modifications?

Uri

Actually, I trained your model on the Deepcom dataset and everything went fine (but without the beam_width change), so you are right: I should first retrain it there with the beam_width change, and then try to train it on the Funcom dataset.
Sorry for my beginner kind of mistakes!

As far as I understand, the beam width config is used in the training phase, though not in the original implementation (https://github.com/tech-srl/code2seq/) but in the https://github.com/Kolkir/code2seq implementation. Unfortunately, the issues section is not enabled there 😞

No problem. I will try to contact the repo author and ask whether they can enable the issues section in their repo; in the meantime, I will use your implementation.

Thanks for your understanding.