declare-lab/Multimodal-Infomax

Dataset loading problem

JunsFu opened this issue · 3 comments

JunsFu commented

Hello, first of all thank you for your hard work!
My question is: after downloading the MOSI dataset, I encountered this problem while loading it:
Traceback (most recent call last):
File "F:/MLP/2023_Summer/Git/MMIM/src/main.py", line 57, in <module>
solver.train_and_eval()
File "F:\MLP\2023_Summer\Git\MMIM\src\solver.py", line 281, in train_and_eval
train_loss = train(model, optimizer_main, criterion, 1)
File "F:\MLP\2023_Summer\Git\MMIM\src\solver.py", line 128, in train
for i_batch, batch_data in enumerate(self.train_loader):
File "F:\Anaconda\envs\self_mm\lib\site-packages\torch\utils\data\dataloader.py", line 435, in __next__
data = self._next_data()
File "F:\Anaconda\envs\self_mm\lib\site-packages\torch\utils\data\dataloader.py", line 475, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "F:\Anaconda\envs\self_mm\lib\site-packages\torch\utils\data\_utils\fetch.py", line 47, in fetch
return self.collate_fn(data)
File "F:\MLP\2023_Summer\Git\MMIM\src\data_loader.py", line 135, in collate_fn
bert_sentences = torch.LongTensor([sample["input_ids"] for sample in bert_details])
ValueError: expected sequence of length 50 at dim 1 (got 39)
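For context on the error: `torch.LongTensor` cannot be built from a ragged list of lists, which is why `collate_fn` fails as soon as one sample tokenizes to 39 ids while another has 50. A minimal pure-Python sketch of what fixed-length padding does before the tensor is constructed (the names `SENT_LEN` and `PAD_ID` are illustrative, matching the repo's 50-token setting and BERT's pad id):

```python
SENT_LEN = 50  # fixed sequence length, as in the repo's SENT_LEN
PAD_ID = 0     # BERT's [PAD] token id

def pad_or_truncate(ids, max_length=SENT_LEN, pad_id=PAD_ID):
    """Cut long sequences to max_length and right-pad short ones with pad_id."""
    ids = ids[:max_length]
    return ids + [pad_id] * (max_length - len(ids))

# One sample tokenized to 39 ids, another to 50 -- the ragged case from the traceback.
batch = [list(range(1, 40)), list(range(1, 51))]
padded = [pad_or_truncate(ids) for ids in batch]
assert all(len(ids) == SENT_LEN for ids in padded)  # safe to stack into a tensor now
```

This is exactly the effect `padding='max_length'` has in recent `transformers` releases; if the installed version ignores or mishandles that keyword, the batch stays ragged and the `ValueError` above appears.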

Can you remove 'truncation=True' from

text, max_length=SENT_LEN, add_special_tokens=True, truncation=True, padding='max_length')

and see how it works?
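A quick way to confirm the diagnosis before the `LongTensor` call is to check the lengths coming out of the tokenizer. A hypothetical debugging sketch (`bert_details` stands in for the list built in `collate_fn`; `check_fixed_length` is not part of the repo):

```python
def check_fixed_length(bert_details, expected_len=50):
    """Return the set of distinct input_ids lengths in a batch.

    If the tokenizer padded correctly, the set is exactly {expected_len};
    otherwise the ragged lengths show which samples were left unpadded."""
    lengths = {len(sample["input_ids"]) for sample in bert_details}
    if lengths != {expected_len}:
        raise ValueError(f"ragged batch, got lengths {sorted(lengths)}")
    return lengths

# Simulated batches: one correctly padded, one with an unpadded 39-token sample.
good = [{"input_ids": [0] * 50}, {"input_ids": [0] * 50}]
bad = [{"input_ids": [0] * 50}, {"input_ids": [0] * 39}]

assert check_fixed_length(good) == {50}
try:
    check_fixed_length(bad)
except ValueError as err:
    print(err)  # ragged batch, got lengths [39, 50]
```

If the lengths differ, the tokenizer call (or the installed transformers version) is the culprit, not the collate logic.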

JunsFu commented

Thank you for your reply. I did what you said, but the same problem still occurs.

JunsFu commented

I found the cause: my installed version of transformers was wrong, and I need to use version 4.0.
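For anyone hitting the same error, it can help to fail fast at startup instead of inside the DataLoader. A hypothetical guard that compares a version string such as `transformers.__version__` against the expected release (pure string comparison; transformers itself is not imported here, and 4.0 is simply the version the issue author reports working):

```python
def version_matches(installed: str, required: str = "4.0") -> bool:
    """True if the installed version string begins with the required
    release prefix, component by component: "4.0.1" matches "4.0",
    but "4.30.2" does not (a naive startswith check would be wrong
    for cases like "4.10" vs "4.1")."""
    parts = installed.split(".")
    req = required.split(".")
    return parts[: len(req)] == req

assert version_matches("4.0.1")
assert not version_matches("4.30.2")
```

In a training script this could be called once before building the loaders, raising a clear message telling the user to `pip install "transformers==4.0.0"` (or whatever release the repo's requirements pin).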