naver/sqlova

RuntimeError: CUDA error: out of memory

Closed this issue · 2 comments

BERT-type: uncased_L-12_H-768_A-12
Batch_size = 8
BERT parameters:
learning rate: 1e-05
Fine-tune BERT: True
vocab size: 30522
hidden_size: 768
num_hidden_layer: 12
num_attention_heads: 12
hidden_act: gelu
intermediate_size: 3072
hidden_dropout_prob: 0.1
attention_probs_dropout_prob: 0.1
max_position_embeddings: 512
type_vocab_size: 2
initializer_range: 0.02
Load pre-trained parameters.
Traceback (most recent call last):
File "train.py", line 576, in
model, model_bert, tokenizer, bert_config = get_models(args, BERT_PT_PATH)
File "train.py", line 156, in get_models
args.no_pretraining)
File "train.py", line 126, in get_bert
model_bert.to(device)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 386, in to
return self._apply(convert)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 193, in _apply
module._apply(fn)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 193, in _apply
module._apply(fn)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 199, in _apply
param.data = fn(param.data)
File "/home/ubuntu/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 384, in convert
return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
RuntimeError: CUDA error: out of memory

solved

Coulud you tell us how to decrease cuda memory use?