Hi guys:
Everything before training goes well. However, once it gets into the epochs, the problem below occurs. Have I done something wrong?

python --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_length 222
BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001
Hi @farsmile

The problem seems to be caused by the lambda function in

def get_loader_wikisql(data_train, data_dev, bS, shuffle_train=True, shuffle_dev=False):
train_loader =
collate_fn=lambda x: x # now dictionary values are not merged!

which, according to this link, is a known issue between PyTorch and Windows.
(Sorry, I couldn't test this myself as I don't have a Windows machine with a GPU.)
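
A possible workaround, as a minimal sketch only: I am assuming the failure is the usual pickling error that appears when DataLoader worker processes are started on Windows (processes there are spawned, so the `collate_fn` has to be pickled, and a lambda cannot be). The names `identity_collate` and `is_picklable` are made up for illustration and are not part of the repo.

```python
import pickle


def identity_collate(batch):
    """Return the batch unchanged, same behaviour as `lambda x: x`."""
    # Defined at module level so that pickle can find it by name.
    return batch


def is_picklable(obj):
    """Hypothetical helper: report whether obj survives pickle.dumps."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    # A lambda has no importable name, so pickling it is expected to fail.
    print("lambda picklable:", is_picklable(lambda x: x))
    print("named function picklable:", is_picklable(identity_collate))
```

If that assumption holds, passing `collate_fn=identity_collate` to `torch.utils.data.DataLoader` instead of the lambda should avoid the error. Setting `num_workers=0` would be another thing to try, since no worker processes are started in that case.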

Hi Wonseok:
It works now, and your link was helpful!
