naver/sqlova

train.py runs forever: 15 hours on a GTX 960 before I had to abort

Closed this issue · 1 comment

➜ sqlova git:(master) python3 train.py --seed 1 --bS 16 --accumulate_gradients 2 --bert_type_abb uS --fine_tune --lr 0.001 --lr_bert 0.00001 --max_seq_leng 222

BERT-type: uncased_L-12_H-768_A-12
Batch_size = 32
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Seq-to-SQL: the number of final BERT layers to be used: 2
Seq-to-SQL: the size of hidden dimension = 100
Seq-to-SQL: LSTM encoding layer size = 2
Seq-to-SQL: dropout rate = 0.3
Seq-to-SQL: learning rate = 0.001
/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/functional.py:1386: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")

^C
Traceback (most recent call last):
  File "train.py", line 605, in <module>
    dset_name='train')
  File "train.py", line 241, in train
    num_out_layers_n=num_target_layers, num_out_layers_h=num_target_layers)
  File "/home/leftnoteasy/borde/sqlova/sqlova/utils/utils_wikisql.py", line 817, in get_wemb_bert
    nlu_tt, t_to_tt_idx, tt_to_t_idx = get_bert_output(model_bert, tokenizer, nlu_t, hds, max_seq_length)
  File "/home/leftnoteasy/borde/sqlova/sqlova/utils/utils_wikisql.py", line 751, in get_bert_output
    all_encoder_layer, pooled_output = model_bert(all_input_ids, all_segment_ids, all_input_mask)
  File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/leftnoteasy/borde/sqlova/bert/modeling.py", line 396, in forward
    all_encoder_layers = self.encoder(embedding_output, extended_attention_mask)
  File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/leftnoteasy/borde/sqlova/bert/modeling.py", line 326, in forward
    hidden_states = layer_module(hidden_states, attention_mask)
  File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/leftnoteasy/borde/sqlova/bert/modeling.py", line 311, in forward
    attention_output = self.attention(hidden_states, attention_mask)
  File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/leftnoteasy/borde/sqlova/bert/modeling.py", line 272, in forward
    self_output = self.self(input_tensor, attention_mask)
  File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/leftnoteasy/borde/sqlova/bert/modeling.py", line 215, in forward
    mixed_query_layer = self.query(hidden_states)
  File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 92, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/leftnoteasy/miniconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 1408, in linear
    output = input.matmul(weight.t())
KeyboardInterrupt
^C
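
For reference, the "Batch_size = 32" line printed above follows from the two flags in the command: the effective batch is --bS times --accumulate_gradients, i.e. 16 * 2 = 32. The snippet below is a generic PyTorch gradient-accumulation sketch (illustrative names, not sqlova's actual training loop) showing how that effective batch size arises; on a small GPU like a GTX 960, lowering both values is the usual first thing to try.

# Generic gradient-accumulation sketch (illustrative, not sqlova's exact code).
# With --bS 16 and --accumulate_gradients 2 the optimizer steps once every
# 2 mini-batches, i.e. an effective batch of 16 * 2 = 32, matching the
# "Batch_size = 32" line in the log above.
import torch

bS = 16                   # per-step mini-batch size (--bS)
accumulate_gradients = 2  # mini-batches per optimizer step (--accumulate_gradients)

model = torch.nn.Linear(8, 1)  # stand-in for the real model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

optimizer.zero_grad()
for i in range(4):                                # 4 dummy mini-batches
    x = torch.randn(bS, 8)
    loss = model(x).pow(2).mean()
    (loss / accumulate_gradients).backward()      # scale so gradients average over 32 examples
    if (i + 1) % accumulate_gradients == 0:
        optimizer.step()                          # one parameter update per 32 examples
        optimizer.zero_grad()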

Hi, have you figured out the reason for this?
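
Not a confirmed answer, but one common cause of runs this slow is the model silently staying on the CPU. A quick generic PyTorch check, independent of sqlova's code, is:

# Generic sanity check (not sqlova code): verify CUDA is visible to PyTorch.
# If this prints False, or the model parameters report device 'cpu', BERT
# fine-tuning will be orders of magnitude slower than on a GPU.
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))

# Inside train.py the same idea applies to the constructed models, e.g.
# (model_bert is the name that appears in the traceback above):
#   print(next(model_bert.parameters()).device)   # expect cuda:0, not cpu

Even with the GPU in use, a GTX 960 has only 2-4 GB of memory, which is on the small side for fine-tuning BERT-base at an effective batch of 32 and sequence length 222, so very slow epochs would not be surprising.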