Pickle error during training

Hi guys,
Everything before training goes well. However, once the epochs start, I get the following error. Have I done something wrong?

Microsoft Windows [Version 10.0.17134.556]
(c) 2018 Microsoft Corporation. All rights reserved.

(venv) C:\PycharmProjects\sqlova>python train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222
BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001
Traceback (most recent call last):
File "train.py", line 591, in <module>
dset_name='train')
File "train.py", line 211, in train
for iB, t in enumerate(train_loader):
File "C:\PycharmProjects\sqlova\venv\lib\site-packages\torch\utils\data\dataloader.py", line 822, in __iter__
return _DataLoaderIter(self)
File "C:\PycharmProjects\sqlova\venv\lib\site-packages\torch\utils\data\dataloader.py", line 563, in __init__
w.start()
File "C:\Python36\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Python36\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Python36\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Python36\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\Python36\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'get_loader_wikisql.<locals>.<lambda>'

(venv) C:\PycharmProjects\sqlova>Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Python36\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Python36\lib\multiprocessing\spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

(venv) C:\PycharmProjects\sqlova>

Hi @farsmile

The problem seems to be caused by the lambda function passed as collate_fn to torch.utils.data.DataLoader:

sqlova/sqlova/utils/utils_wikisql.py

Lines 91 to 98 in b7ce9ad

    
    def get_loader_wikisql(data_train, data_dev, bS, shuffle_train=True, shuffle_dev=False):
        train_loader = torch.utils.data.DataLoader(
            batch_size=bS,
            dataset=data_train,
            shuffle=shuffle_train,
            num_workers=4,
            collate_fn=lambda x: x  # now dictionary values are not merged!
        )

This is, according to this link, a known issue with PyTorch multiprocessing on Windows.
(Sorry, I couldn't test this myself as I don't have a Windows machine with a GPU.)
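
As far as I can tell from the traceback (popen_spawn_win32.py and ForkingPickler), Windows starts each DataLoader worker as a fresh process and pickles what the worker needs, including collate_fn. A lambda created inside a function has no importable name, so pickling it fails. Here is a minimal sketch of that behavior using only the standard library; the names identity_collate and make_local_lambda are hypothetical and not part of sqlova:

```python
import pickle


def identity_collate(batch):
    """Module-level function: pickle stores a reference to its importable name."""
    return batch


def make_local_lambda():
    # A lambda created inside a function is a "local object" with no importable name.
    return lambda x: x


# Pickling the module-level function succeeds and round-trips to the same function.
restored = pickle.loads(pickle.dumps(identity_collate))

# Pickling the local lambda fails, much like the error in the traceback above.
try:
    pickle.dumps(make_local_lambda())
    lambda_error = None
except (AttributeError, pickle.PicklingError) as exc:
    lambda_error = exc

print("module-level function restored:", restored([1, 2]))
print("local lambda error:", lambda_error)
```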

Replacing the lambda with a custom, named collate function defined at module level may solve this problem.
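
A rough sketch of what that could look like is below. I have not run it on Windows, so please treat it as a starting point. The function name collate_identity, the extra num_workers parameter, and the dev loader with the return statement are my assumptions (the embedded snippet above only shows the train loader). Setting num_workers=0 should also avoid the error, since no worker processes are started in that case.

```python
import torch.utils.data


def collate_identity(batch):
    """Return the batch unchanged, so dictionary values are not merged."""
    return batch


def get_loader_wikisql(data_train, data_dev, bS, shuffle_train=True,
                       shuffle_dev=False, num_workers=4):
    # num_workers is a hypothetical extra parameter; 0 loads data in the main process.
    train_loader = torch.utils.data.DataLoader(
        batch_size=bS,
        dataset=data_train,
        shuffle=shuffle_train,
        num_workers=num_workers,
        collate_fn=collate_identity,  # module-level function, so it can be pickled
    )
    # Assumption: the dev loader mirrors the train loader.
    dev_loader = torch.utils.data.DataLoader(
        batch_size=bS,
        dataset=data_dev,
        shuffle=shuffle_dev,
        num_workers=num_workers,
        collate_fn=collate_identity,
    )
    return train_loader, dev_loader


if __name__ == "__main__":
    # Toy records standing in for the WikiSQL examples.
    toy = [{"id": i} for i in range(4)]
    train_loader, _ = get_loader_wikisql(
        toy, toy, bS=2, shuffle_train=False, num_workers=0
    )
    for batch in train_loader:
        print(batch)
```

The `if __name__ == "__main__":` guard matters on Windows when num_workers is greater than 0, because each worker process re-imports the main script.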

Thanks.

Wonseok

Hi Wonseok,
It works now, and your link was helpful!

Thanks.
