`ValueError` occurred with `representation_model`

Question

JINHXu opened this issue 8 months ago · 0 comments

Hello,

The following error occurred when I attempted to obtain sentence embeddings for a list of sentences with `combine_strategy=None`:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[12], line 2
      1 # features = model.encode_sentences(batch, combine_strategy="mean")
----> 2 features = model.encode_sentences(batch, combine_strategy=None)

File ~/.local/lib/python3.9/site-packages/simpletransformers/language_representation/representation_model.py:219, in RepresentationModel.encode_sentences(self, text_list, combine_strategy, batch_size)
    214             token_vectors = self.model(
    215                 input_ids=encoded["input_ids"].to(self.device),
    216                 attention_mask=encoded["attention_mask"].to(self.device),
    217             )
    218     embeddings.append(embedding_func(token_vectors).cpu().detach().numpy())
--> 219 embeddings = np.concatenate(embeddings, axis=0)
    221 return embeddings

File <__array_function__ internals>:180, in concatenate(*args, **kwargs)

ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 512 and the array at index 823 has size 150

This error does not occur with `combine_strategy="mean"`. Could this be a bug?

(I am setting `combine_strategy=None` in order to obtain the [CLS] embedding.)
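
My guess at the cause, as a minimal NumPy sketch: without pooling, each batch seems to yield an array of shape `(batch, seq_len, hidden)`, and `seq_len` differs between batches (512 vs. 150 in the traceback), so `np.concatenate` along axis 0 fails. The batch sizes and hidden size below are made up for illustration, the plain mean only stands in for whatever `combine_strategy="mean"` does internally, and I have not confirmed any of this against the library code.

```python
import numpy as np

hidden = 16  # hypothetical hidden size, kept small for the example

# Two hypothetical batches with different padded sequence lengths
# (512 and 150, as in the traceback); shape is (batch, seq_len, hidden).
batch_a = np.zeros((2, 512, hidden))
batch_b = np.ones((3, 150, hidden))

# No pooling: axis 1 differs, so concatenating along axis 0 raises ValueError.
try:
    np.concatenate([batch_a, batch_b], axis=0)
    concat_failed = False
except ValueError:
    concat_failed = True

# Mean pooling over the sequence axis removes the mismatched dimension.
mean_pooled = np.concatenate(
    [b.mean(axis=1) for b in (batch_a, batch_b)], axis=0
)

# Keeping only the first token position (where [CLS] would be) does the same.
cls_only = np.concatenate(
    [b[:, 0, :] for b in (batch_a, batch_b)], axis=0
)

print(concat_failed, mean_pooled.shape, cls_only.shape)
```

If this is right, slicing out the first token of each batch before concatenating would give the [CLS] embeddings without hitting the error.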

Thank you,
Xu