kmkurn/pytorch-crf

Strange log likelihood tensor (two-element tensor).

contribcode opened this issue · 5 comments

Using the library, I ran into the following error:

ValueError: only one element tensors can be converted to Python scalars.

The error is thrown because of the loss tensor that the CRF layer returns, which is

tensor([340.2023, 304.0254], device='cuda:0', grad_fn=<NegBackward>)

(Since the CRF layer returns the log likelihood, I take the negative log likelihood with loss = -log_likelihood, where log_likelihood is what the CRF layer returns.)

According to the documentation, the loss (or the log likelihood) should be a one-element tensor.
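For context, this ValueError comes from converting a multi-element tensor to a Python scalar (for example via .item() or float()), not from the CRF itself. A minimal, standalone reproduction:

import torch

# A one-element tensor converts to a Python scalar without complaint.
print(torch.tensor([340.2023]).item())

# A two-element tensor raises the error reported above.
try:
    torch.tensor([340.2023, 304.0254]).item()
except ValueError as e:
    print(e)  # only one element tensors can be converted to Python scalars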

I define my model like this (the model is built with the HuggingFace library):

import torch
import torch.nn as nn
from torchcrf import CRF
# HuggingFace library (adjust the import to the installed package name)
from transformers import BertConfig, BertModel


class Bert_Clf(nn.Module):

    def __init__(self, config_arg):
        super(Bert_Clf, self).__init__()
        self.bert_config = BertConfig()
        self.bert = BertModel.from_pretrained(config_arg.model_type)
        self.dropout = nn.Dropout(self.bert_config.hidden_dropout_prob)
        # Project BERT hidden states to per-token tag scores (CRF emissions).
        self.fc = nn.Linear(self.bert_config.hidden_size, config_arg.n_labels)
        self.crf_layer = CRF(num_tags=config_arg.n_labels, batch_first=True)

    def forward(self, input_ids, attention_mask_arg, labels_arg):
        outputs = self.bert(input_ids, attention_mask=attention_mask_arg)
        sequence_output = outputs[0]  # (batch, seq_len, hidden_size)
        sequence_output = self.dropout(sequence_output)
        logits = self.fc(sequence_output)  # (batch, seq_len, n_labels)
        # With the default reduction='sum' this should be a scalar tensor.
        log_probabs = self.crf_layer(logits, labels_arg,
                                     mask=attention_mask_arg.to(dtype=torch.uint8))

        return log_probabs
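For reference, here is a minimal sketch of how this forward pass would typically be used in a training step; the optimizer and batch_* names are hypothetical and not part of the original report:

# Hypothetical training step; assumes config_arg, batch_input_ids,
# batch_attention_mask and batch_labels exist with the expected shapes.
model = Bert_Clf(config_arg)
optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)

log_likelihood = model(batch_input_ids, batch_attention_mask, batch_labels)
loss = -log_likelihood   # the CRF returns a log likelihood, so negate it for a loss
loss.backward()          # requires loss to be a scalar tensor
optimizer.step()
optimizer.zero_grad()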

P.S. I use PyTorch 1.1.0. Could that be the problem?

Update

I ran the same code in Colab and it works fine. There must be a problem with the library versions.

Hmm, that's weird. Your code looks OK. By default the return value should really be a scalar tensor, unless you pass reduction='none' as an argument. Not sure what happens here. You could check whether

return llh.sum()

is actually executed when your code runs.
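For illustration, this is how the reduction argument changes the shape of what the CRF layer returns (the shapes below are made up for the example):

import torch
from torchcrf import CRF

# Toy example: batch of 2 sequences, length 3, 5 tags.
crf = CRF(num_tags=5, batch_first=True)
emissions = torch.randn(2, 3, 5)
tags = torch.randint(0, 5, (2, 3))

print(crf(emissions, tags).shape)                    # default reduction='sum': torch.Size([])
print(crf(emissions, tags, reduction='none').shape)  # per-sequence: torch.Size([2])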

Thank you @kmkurn for your fast reply. I have updated my OP.

Hello @kmkurn, do you have any update on this? What confuses me here is that the tensor has two elements.

By the way, how can I check whether return llh.sum() is executed?

Hi, sorry for the (very) late response. Can you check that the dimensions of logits are as expected? Also, to check whether a line is executed, you can set breakpoints and run Python's debugger, pdb.
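For example (a sketch with toy inputs, not the original setup), a pdb.set_trace() placed just before the CRF call lets you step into CRF.forward and watch which return statement runs:

import pdb

import torch
from torchcrf import CRF

crf = CRF(num_tags=5, batch_first=True)
emissions = torch.randn(2, 3, 5)   # toy shapes: batch 2, length 3, 5 tags
tags = torch.randint(0, 5, (2, 3))

pdb.set_trace()              # at the (Pdb) prompt, type "step" repeatedly to
llh = crf(emissions, tags)   # walk into CRF.forward and see which return runs
print(llh.shape)             # with the default reduction='sum': torch.Size([])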

I updated to the latest versions of the libraries and now it works fine. It was probably something with PyTorch.