`ValueError` occurred with `representation_model`
JINHXu opened this issue · 0 comments
Hello,
The following error occurred when I attempted to obtain sentence embeddings for a list of sentences with `combine_strategy=None`:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[12], line 2
1 # features = model.encode_sentences(batch, combine_strategy="mean")
----> 2 features = model.encode_sentences(batch, combine_strategy=None)
File ~/.local/lib/python3.9/site-packages/simpletransformers/language_representation/representation_model.py:219, in RepresentationModel.encode_sentences(self, text_list, combine_strategy, batch_size)
214 token_vectors = self.model(
215 input_ids=encoded["input_ids"].to(self.device),
216 attention_mask=encoded["attention_mask"].to(self.device),
217 )
218 embeddings.append(embedding_func(token_vectors).cpu().detach().numpy())
--> 219 embeddings = np.concatenate(embeddings, axis=0)
221 return embeddings
File <__array_function__ internals>:180, in concatenate(*args, **kwargs)
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 512 and the array at index 823 has size 150
This error does not occur with `combine_strategy="mean"`. Is this a bug?
(I am setting `combine_strategy=None` in order to obtain the [CLS] embedding.)
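
In case it helps with triage, here is a minimal NumPy sketch of what I suspect is happening. It does not call simpletransformers at all. It assumes (I have not verified this in the library code) that with `combine_strategy=None` each batch yields an array of shape `(batch_size, seq_len, hidden_size)`, where `seq_len` depends on the padding of that batch. The array names and the small `hidden_size` are made up for illustration; the 512 and 150 come from the error message above.

```python
import numpy as np

hidden_size = 8  # hypothetical value, chosen small; the real model's size will differ

# Simulated per-batch outputs (assumption): token vectors of shape
# (batch_size, seq_len, hidden_size), with a different seq_len per batch.
batch_a = np.zeros((4, 512, hidden_size))
batch_b = np.zeros((4, 150, hidden_size))

# Concatenating along axis 0 requires all other dimensions to match,
# so differing seq_len values (dimension 1) raise a ValueError.
try:
    np.concatenate([batch_a, batch_b], axis=0)
    concat_failed = False
except ValueError:
    concat_failed = True

# Mean pooling over the token axis removes seq_len, so the shapes agree.
mean_pooled = np.concatenate(
    [b.mean(axis=1) for b in (batch_a, batch_b)], axis=0
)

# Keeping only the first token (the [CLS] position) per sentence
# also yields arrays that concatenate cleanly.
cls_only = np.concatenate(
    [b[:, 0, :] for b in (batch_a, batch_b)], axis=0
)

print(concat_failed, mean_pooled.shape, cls_only.shape)
```

If this assumption holds, it would explain why `"mean"` works while `None` fails, and selecting the first token per batch before concatenating might be one way to get the [CLS] embeddings.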
Thank you,
Xu