kssteven418/I-BERT

(huggingface) The output of I-BERT is float. Am I doing something wrong?

kyoungrok0517 opened this issue · 0 comments

❓ Questions and Help

What is your question?

I'm using the Hugging Face implementation. Even though I set quant_mode=True, the output of IBertModel is still float32. Am I using the model incorrectly, or is this expected?

Code

self.bert = AutoModel.from_pretrained(
    base_model, quant_mode=quant_mode, add_pooling_layer=False
)

...


def forward(
        self,
        input_ids: Tensor,
        attention_mask: Tensor,
        k: int = None,
        return_layers: List[int] = None,
        return_orig: bool = False,
    ):
        bert_out = self.bert(
            input_ids,
            attention_mask=attention_mask,
            output_hidden_states=True,
            return_dict=True,
        )

        # the output dtype is float32!
        print(bert_out.hidden_states[0])
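For context, integer-only pipelines are commonly *simulated* in floating point ("fake quantization"): tensors are snapped to an integer grid but stored as float32, which would produce exactly the behavior observed above. A minimal NumPy sketch of that idea (`fake_quantize` is a hypothetical helper for illustration, not part of transformers):

```python
import numpy as np

def fake_quantize(x: np.ndarray, num_bits: int = 8) -> np.ndarray:
    # Symmetric uniform quantization simulated in floating point:
    # values are rounded onto an integer grid, then immediately
    # rescaled back, so the dtype stays float32 throughout.
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8 bits
    scale = np.abs(x).max() / qmax          # per-tensor scale factor
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return (q * scale).astype(np.float32)

x = np.random.randn(4, 8).astype(np.float32)
y = fake_quantize(x)
print(y.dtype)  # float32, even though the values lie on an 8-bit grid
```

Deploying with real INT8 arithmetic would additionally require integer kernels; the fake-quantized float tensors only emulate their numerics.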

What's your environment?

  • PyTorch Version: 1.7.1
  • OS (e.g., Linux): Ubuntu 20
  • How you installed fairseq (pip, source): not installed
  • Python version: 3.8.5
  • CUDA/cuDNN version: 11.0
  • GPU models and configuration: RTX 3090