question about /maml_rl/policies/categorical_mlp.py

Question

Rui-Chun opened this issue 5 years ago · 1 comment

def forward(self, input, params=None):

    if params is None:
        params = OrderedDict(self.named_parameters())
    output = input
    for i in range(1, self.num_layers):
        output = F.linear(output,
            weight=params['layer{0}.weight'.format(i)],
            bias=params['layer{0}.bias'.format(i)])
        output = self.nonlinearity(output)
    logits = F.linear(output,
        weight=params['layer{0}.weight'.format(self.num_layers)],
        bias=params['layer{0}.bias'.format(self.num_layers)])
    return Categorical(logits=logits)

At the end of categorical_mlp.py, the forward function returns the result above.

But shouldn't it be "return Categorical(logits)", since logits means the probabilities, right?

Answer 1 · 2019-06-09T19:06:51.000Z

Logits are the values you apply a softmax to in order to get the probabilities. They are not probabilities themselves (in particular, they may be negative and need not sum to 1), which is why we use Categorical(logits=logits): Categorical is responsible for applying that softmax operation.
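To make the distinction concrete, here is a minimal sketch in plain Python (no PyTorch) of what the softmax does to a set of logits. The `softmax` helper and the example values are hypothetical, written only for illustration; they are not code from the repository. It also helps to know that the first positional parameter of torch.distributions.Categorical is `probs`, so `Categorical(logits)` would treat the raw scores as probabilities, which is not what is intended here.

```python
import math

def softmax(logits):
    """Turn raw scores (logits) into probabilities that sum to 1."""
    # Subtracting the max does not change the result, but avoids overflow.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical output of the last linear layer: values can be negative
# and do not sum to 1, so they are not probabilities.
logits = [2.0, -1.0, 0.5]

# After the softmax, every value is positive and the values sum to 1.
probs = softmax(logits)
```

Passing `logits=logits` by keyword asks Categorical to perform this normalization itself, so the policy network can output unconstrained real values.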