carpedm20/ENAS-pytorch

The embedding/encoder may not be working the way you think it is

philtomson opened this issue · 1 comment

In issue #33 I wondered why the Embedding encoder was as large as it is. The response from @carpedm20 was:

You can assume the same activation in a different place has the same semantics (embedding), or not. I assumed it's different, because an activation in a different location may have a separate role.

That makes sense. However, I've been stepping through this section of the code (Controller.sample()) in the debugger trying to understand what's going on... when mode is 0 (the case where the activation function is being selected), sum(self.num_tokens[:mode]) is 0. So the line:

    inputs = utils.get_variable(
        action[:, 0] + sum(self.num_tokens[:mode]),
        requires_grad=False)

is always just the action[:, 0] component, which is a value from 0 to 3 (one of the 4 activation functions in the activation function list), since sum(self.num_tokens[:0]) is 0.

And when mode is 1, sum(self.num_tokens[:mode]) is always 4, so I don't see how you can get anything higher than len(args.shared_rnn_activations) + self.args.num_blocks here. mode can only take on the values 0 or 1. Either I'm missing something or maybe it's a bug?

With self.args.num_blocks = 6, I see that self.num_tokens is [4, 1, 4, 2, 4, 3, 4, 4, 4, 5, 4, 6, 4] and sum(self.num_tokens) = 49.
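For what it's worth, here is a minimal standalone sketch (my own reconstruction, not a quote of the repository's code; the variable names are mine) of how a num_tokens list of that shape can arise, assuming each block picks one of 4 activation functions and, from the second block on, one of block_idx previous nodes:

    # Sketch of how a num_tokens list like the one above can be built.
    num_activations = 4   # stands in for len(args.shared_rnn_activations)
    num_blocks = 6        # stands in for self.args.num_blocks

    num_tokens = [num_activations]
    for idx in range(1, num_blocks + 1):
        num_tokens += [idx, num_activations]

    print(num_tokens)       # [4, 1, 4, 2, 4, 3, 4, 4, 4, 5, 4, 6, 4]
    print(sum(num_tokens))  # 49, the number of rows in the embedding table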

To actually use all 49 entries of the embedding table, I suspect that what we need for inputs is something along the lines of:

    inputs = utils.get_variable(
        action[:, 0] + sum(self.num_tokens[:block_idx]),
        requires_grad=False)
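To make the difference concrete, here is a standalone sketch (again my own simplification, not the repository's loop; the real sample() may iterate a different number of steps) that enumerates which of the 49 embedding rows each offset rule can ever address, assuming mode = block_idx % 2 and one decision per entry of num_tokens:

    num_tokens = [4, 1, 4, 2, 4, 3, 4, 4, 4, 5, 4, 6, 4]

    reachable_mode, reachable_block = set(), set()
    for block_idx, n_choices in enumerate(num_tokens):
        mode = block_idx % 2
        for action in range(n_choices):
            reachable_mode.add(action + sum(num_tokens[:mode]))        # current code
            reachable_block.add(action + sum(num_tokens[:block_idx]))  # proposed fix

    print(sorted(reachable_mode))   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]: only 10 rows ever used
    print(len(reachable_block))     # 49: every row used, one disjoint slice per decision

With the mode-based offset the controller can only ever touch rows 0 through 9 of the table; with the block_idx-based offset each decision step gets its own slice, which is presumably what the large embedding was intended for.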

I agree with you, and I think the original code has a bug. sum(self.num_tokens[:block_idx]) should be right.