naver/sqlova

pickle problem in train

farsmile opened this issue · 2 comments

Hi guys:
Everything before training goes well. However, when i got in epoches, problem is as following. Have I do something wrong?

Microsoft Windows [版本 10.0.17134.556]
(c) 2018 Microsoft Corporation。保留所有权利。

(venv) C:\PycharmProjects\sqlova>python train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_le
ng 222
BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001
Traceback (most recent call last):
File "train.py", line 591, in
dset_name='train')
File "train.py", line 211, in train
for iB, t in enumerate(train_loader):
File "C:\PycharmProjects\sqlova\venv\lib\site-packages\torch\utils\data\dataloader.py", line 822, in iter
return _DataLoaderIter(self)
File "C:\PycharmProjects\sqlova\venv\lib\site-packages\torch\utils\data\dataloader.py", line 563, in init
w.start()
File "C:\Python36\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Python36\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Python36\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Python36\lib\multiprocessing\popen_spawn_win32.py", line 65, in init
reduction.dump(process_obj, to_child)
File "C:\Python36\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'get_loader_wikisql..'

(venv) C:\PycharmProjects\sqlova>Traceback (most recent call last):
File "", line 1, in
File "C:\Python36\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Python36\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

(venv) C:\PycharmProjects\sqlova>

Hi @farsmile

The problem seems to be caused by lambda function in torch.utils.data.DataLoader

def get_loader_wikisql(data_train, data_dev, bS, shuffle_train=True, shuffle_dev=False):
train_loader = torch.utils.data.DataLoader(
batch_size=bS,
dataset=data_train,
shuffle=shuffle_train,
num_workers=4,
collate_fn=lambda x: x # now dictionary values are not merged!
)

which is, according to this link, the problem between pytorch and Windows.
(Sorry I couldn't test this by myself as I don't have Window machine with GPU).

Using a custom data_loader function may solve this problem.

Thanks.

Wonseok

Hi Wonseok:
it works now and your link is helpful!

Thanks.