ValueError: not enough values to unpack (expected 4, got 3)
Kailash-Natarajan opened this issue · 3 comments
I get this error while training on Google Colab. It started happening recently. Any suggestions/solutions for fixing it?
Training works fine with POS tagging disabled, because that skips the problematic code path.
If I observed correctly, it happens only for trainset_generator, not for validset_generator, although I am not certain of that.
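For reference, this is Python's standard tuple-unpacking failure, reproducible independently of the project when a line has only three fields but four names are unpacked:

```python
# Minimal reproduction of the error, independent of the project code.
# The field names mirror those in pipeline.py's generate_seq.
l = ("token", "pun", "mask")  # only three values, no POS tag

try:
    token, pun, mask, tag = l  # expects four values
except ValueError as e:
    print(e)  # not enough values to unpack (expected 4, got 3)
```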
Training...
0%| | 0/2237 [00:00<?, ?it/s]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-3-6a92f2de38ba> in <module>()
7 re=Restorer()
----> 8 re.train()
7 frames
/content/drive/MyDrive/ColabNotebooks/funnel/main/train.py in train(self)
230 # training set data loader
231 trainset_generator = tqdm(self.trainset_generator)
--> 232 for data in trainset_generator:
233 raw_data, train_data = data
234 train_data = (torch.LongTensor(i).to(self.config.device) for i in train_data)
/usr/local/lib/python3.7/dist-packages/tqdm/std.py in __iter__(self)
1183
1184 try:
-> 1185 for obj in iterable:
1186 yield obj
1187 # Update and possibly print the progressbar.
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py in __next__(self)
519 if self._sampler_iter is None:
520 self._reset()
--> 521 data = self._next_data()
522 self._num_yielded += 1
523 if self._dataset_kind == _DatasetKind.Iterable and \
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py in _next_data(self)
559 def _next_data(self):
560 index = self._next_index() # may raise StopIteration
--> 561 data = self._dataset_fetcher.fetch(index) # may raise StopIteration
562 if self._pin_memory:
563 data = _utils.pin_memory.pin_memory(data)
/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py in fetch(self, possibly_batched_index)
42 def fetch(self, possibly_batched_index):
43 if self.auto_collation:
---> 44 data = [self.dataset[idx] for idx in possibly_batched_index]
45 else:
46 data = self.dataset[possibly_batched_index]
/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py in <listcomp>(.0)
42 def fetch(self, possibly_batched_index):
43 if self.auto_collation:
---> 44 data = [self.dataset[idx] for idx in possibly_batched_index]
45 else:
46 data = self.dataset[possibly_batched_index]
/content/drive/MyDrive/ColabNotebooks/funnel/main/src/utils/pipeline.py in __getitem__(self, idx)
188 idx = random.choice(list(self.sample_pool))
189 self.sample_pool.remove(idx)
--> 190 return generate_seq(self.raw_data[idx:idx+self.config.max_seq_len-2], self.config)
191 else:
192 if self.config.use_pos:
/content/drive/MyDrive/ColabNotebooks/funnel/main/src/utils/pipeline.py in generate_seq(lines, config)
90 for l in lines:
91 if config.use_pos:
---> 92 token, pun, mask, tag = l
93 else:
94 # token, pun, mask, tag
ValueError: not enough values to unpack (expected 4, got 3)
A possible reason is that the tag item is missing from one line of the data. To confirm that the POS dataset was generated successfully in the first run, I would suggest printing l here to see exactly what it contains, and inspecting the dataset to check the structure and content of a single line.
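The debugging suggestion above could be sketched as a small validation pass over the dataset before training. Note that the field layout (token, pun, mask, tag) is taken from the traceback; the helper name and the example data are hypothetical, not part of the project:

```python
def check_lines(lines, use_pos=True):
    """Report lines that do not have the expected number of fields."""
    expected = 4 if use_pos else 3  # token, pun, mask (+ tag when POS is on)
    bad = []
    for i, l in enumerate(lines):
        if len(l) != expected:
            print(f"line {i}: expected {expected} fields, got {len(l)}: {l!r}")
            bad.append(i)
    return bad

# Hypothetical example: the third line is missing its POS tag.
data = [
    ("hello", 0, 1, "NN"),
    ("world", 1, 1, "NN"),
    ("oops", 0, 1),  # only 3 values -> would trigger the unpacking error
]
bad = check_lines(data)  # reports the malformed line index
```

Running this over self.raw_data before the training loop would pinpoint exactly which lines lack the tag field.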
Issue closed due to no further feedback in one week.
Sorry, I was caught up with other work. Anyway, I just looked through it: I had to regenerate the .json files with the POS tagger enabled. Works fine now. Thanks!