una-dinosauria/human-motion-prediction

Shared Variables between decoder and encoder in the basic architecture

Opened this issue · 4 comments

@una-dinosauria

Hi,
Congratulations on this paper being cited so many times recently.
I am confused about the basic architecture as written in the code. The paper mentions that in the basic form the variables of the encoder and decoder are not shared, i.e., the encoder and decoder each have their own variables. However, when I checked the basic architecture, the same GRUCell is used for both the encoder and the decoder, and when I counted the variables, the number was the same as in the "tied" architecture.
So I guess that in the basic architecture the variables of the encoder and decoder are not actually separated?
I would appreciate it if you could help me understand what I am missing.
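To make the question concrete, here is a minimal, self-contained sketch of the pattern I am referring to (assuming TF 1.x; the sizes are toy values, and this is not the repo's exact code):

    import tensorflow as tf

    input_size, hidden_size, seq_len = 10, 16, 5

    # Toy encoder/decoder inputs: lists of [batch, input_size] tensors,
    # the format expected by static_rnn and rnn_decoder.
    enc_in = [tf.placeholder(tf.float32, [None, input_size]) for _ in range(seq_len)]
    dec_in = [tf.placeholder(tf.float32, [None, input_size]) for _ in range(seq_len)]

    # A single GRUCell object...
    cell = tf.contrib.rnn.GRUCell(hidden_size)

    with tf.variable_scope("basic_rnn_seq2seq"):
      # ...is passed to both the encoder and the decoder, so both calls
      # appear to end up using the same GRU kernels and biases.
      _, enc_state = tf.contrib.rnn.static_rnn(cell, enc_in, dtype=tf.float32)          # Encoder
      outputs, states = tf.contrib.legacy_seq2seq.rnn_decoder(dec_in, enc_state, cell)  # Decoder

    # Only one set of gates/candidate kernels shows up here, which matches
    # the variable count I see for the "tied" architecture.
    for v in tf.global_variables():
      print(v.name, v.shape)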

Thanks,

Hi @smakaviani,

Thanks for pointing that out.

Could you please provide an MWE (minimal working example) that shows the problem you mention?

Cheers,

Thank you so much for your quick response.
@una-dinosauria When I run the basic architecture, I get the following variables in the basic_rnn_seq2seq scope:

    <tf.Variable 'Variable:0' shape=() dtype=float32_ref>
    <tf.Variable 'Variable_1:0' shape=() dtype=int32_ref>
    <tf.Variable 'proj_w_out:0' shape=(1024, 69) dtype=float32_ref>
    <tf.Variable 'proj_b_out:0' shape=(69,) dtype=float32_ref>
    <tf.Variable 'basic_rnn_seq2seq/rnn/gru_cell/gates/kernel:0' shape=(1093, 2048) dtype=float32_ref>
    <tf.Variable 'basic_rnn_seq2seq/rnn/gru_cell/gates/bias:0' shape=(2048,) dtype=float32_ref>
    <tf.Variable 'basic_rnn_seq2seq/rnn/gru_cell/candidate/kernel:0' shape=(1093, 1024) dtype=float32_ref>
    <tf.Variable 'basic_rnn_seq2seq/rnn/gru_cell/candidate/bias:0' shape=(1024,) dtype=float32_ref>

and when I run the tied architecture, I get the following variables:
    <tf.Variable 'Variable:0' shape=() dtype=float32_ref>
    <tf.Variable 'Variable_1:0' shape=() dtype=int32_ref>
    <tf.Variable 'proj_w_out:0' shape=(1024, 69) dtype=float32_ref>
    <tf.Variable 'proj_b_out:0' shape=(69,) dtype=float32_ref>
    <tf.Variable 'combined_tied_rnn_seq2seq/tied_rnn_seq2seq/gru_cell/gates/kernel:0' shape=(1093, 2048) dtype=float32_ref>
    <tf.Variable 'combined_tied_rnn_seq2seq/tied_rnn_seq2seq/gru_cell/gates/bias:0' shape=(2048,) dtype=float32_ref>
    <tf.Variable 'combined_tied_rnn_seq2seq/tied_rnn_seq2seq/gru_cell/candidate/kernel:0' shape=(1093, 1024) dtype=float32_ref>
    <tf.Variable 'combined_tied_rnn_seq2seq/tied_rnn_seq2seq/gru_cell/candidate/bias:0' shape=(1024,) dtype=float32_ref>
Contrary to my expectation, these are no different from the basic ones, at least in terms of their dimensions.
Somehow I cannot figure out the difference between the tied and untied architectures from the code.
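For comparison, here is a minimal sketch of what I would have expected an untied encoder/decoder to look like, with two separate cells (assuming TF 1.x; the sizes and scope names are made up, not the repo's code):

    import tensorflow as tf

    input_size, hidden_size, seq_len = 10, 16, 5
    enc_in = [tf.placeholder(tf.float32, [None, input_size]) for _ in range(seq_len)]
    dec_in = [tf.placeholder(tf.float32, [None, input_size]) for _ in range(seq_len)]

    with tf.variable_scope("encoder"):
      enc_cell = tf.contrib.rnn.GRUCell(hidden_size)
      _, enc_state = tf.contrib.rnn.static_rnn(enc_cell, enc_in, dtype=tf.float32)

    with tf.variable_scope("decoder"):
      dec_cell = tf.contrib.rnn.GRUCell(hidden_size)  # a second, separate cell
      outputs, states = tf.contrib.legacy_seq2seq.rnn_decoder(dec_in, enc_state, dec_cell)

    # With two cells, tf.global_variables() would list two sets of GRU
    # kernels/biases: one under "encoder/..." and one under "decoder/...".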

Is this in TensorBoard? What command do you use? Did you change the code?

Please provide some code I can run to reproduce the error.

No, I have not changed the code. I debug it in PyCharm, after the following lines in seq2seq_model.py:

    if architecture == "basic":
      # Basic RNN does not have a loop function in its API, so copying here.
      with vs.variable_scope("basic_rnn_seq2seq"):
        _, enc_state = tf.contrib.rnn.static_rnn(cell, enc_in, dtype=tf.float32) # Encoder
        outputs, self.states = tf.contrib.legacy_seq2seq.rnn_decoder( dec_in, enc_state, cell, loop_function=lf ) # Decoder
    elif architecture == "tied":
      outputs, self.states = tf.contrib.legacy_seq2seq.tied_rnn_seq2seq( enc_in, dec_in, cell, loop_function=lf )
    else:
      raise(ValueError, "Uknown architecture: %s" % architecture )

I use tf.global_variables() to see the variables of the architecture.
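Concretely, this is roughly how I inspect them (a minimal sketch; the scope prefix is just an example):

    import tensorflow as tf

    # Print every variable in the default graph with its shape.
    for v in tf.global_variables():
      print(v.name, v.shape)

    # Or keep only the variables created under the seq2seq scope.
    seq2seq_vars = [v for v in tf.global_variables()
                    if v.name.startswith("basic_rnn_seq2seq")]
    print(len(seq2seq_vars), "variables under basic_rnn_seq2seq")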