allanj/pytorch_neural_crf

Segmentation fault

shuu-tatsu opened this issue · 2 comments

Hi,
I was running trainer.py on a GPU and got the following error:

```
device: cpu seed: 42 digit2zero: True dataset: conll2003 embedding_file: data/glove.6B.100d.txt embedding_dim: 100 optimizer: sgd learning_rate: 0.01 momentum: 0.0 l2: 1e-08 lr_decay: 0 batch_size: 10 num_epochs: 100 train_num: -1 dev_num: -1 test_num: -1 model_folder: english_model hidden_dim: 200 dropout: 0.5 use_char_rnn: 1 context_emb: none
reading the pretraing embedding: data/glove.6B.100d.txt
100%| 400000/400000 [00:40<00:00, 9893.97it/s]
Reading file: data/conll2003/train.txt
100%| 217662/217662 [00:01<00:00, 161670.34it/s]
number of sentences: 14041
Reading file: data/conll2003/dev.txt
100%| 54612/54612 [00:00<00:00, 159794.60it/s]
number of sentences: 3250
Reading file: data/conll2003/test.txt
100%| 49888/49888 [00:00<00:00, 186748.58it/s]
number of sentences: 3453
#labels: 20
label 2idx: {'<PAD>': 0, 'S-ORG': 1, 'O': 2, 'S-MISC': 3, 'B-PER': 4, 'E-PER': 5, 'S-LOC': 6, 'B-ORG': 7, 'E-ORG': 8, 'I-PER': 9, 'S-PER': 10, 'B-MISC': 11, 'I-MISC': 12, 'E-MISC': 13, 'I-ORG': 14, 'B-LOC': 15, 'E-LOC': 16, 'I-LOC': 17, '<START>': 18, '<STOP>': 19}
Building the embedding table for vocabulary...
[Info] Use the pretrained word embedding to initialize: 25305 x 100
num chars: 77
num words: 25305
[Info] Building character-level LSTM
[Model Info] Input size to LSTM: 150
[Model Info] LSTM Hidden Size: 200
[Model Info] Final Hidden Size: 200
Using SGD: lr is: 0.01, L2 regularization is: 1e-08
number of instances: 14041
[Shuffled] Shuffle the training instance ids
[Info] The model will be saved to: english_model.tar.gz
learning rate is set to: 0.01
Segmentation fault (core dumped)
```

A segmentation fault can be caused by many different errors. Could you provide the following details as well? (I cloned the repo on a new laptop and it works fine, though.)

  1. You mention you are using a GPU, but the log shows `device: cpu`. You probably need to set the argument `--device` to `cuda:0`. Does it work then?
  2. What PyTorch version are you using? (The latest version should work.)
  3. Does it also happen with a small amount of data? (Set `--train_num` to 100, for example.)
  4. Does it happen right before the first training epoch starts, or in the middle of training? (You can print something in the for loop of the `train_model` function.)
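For point 4, Python's built-in `faulthandler` module is a handy companion: it dumps the Python traceback when the process segfaults, instead of dying silently with "Segmentation fault (core dumped)". The sketch below is not part of the repo; the instrumented `train_model` is a hypothetical stand-in for the real training loop:

```python
import faulthandler

# Dump the Python-level traceback to stderr if the process segfaults.
faulthandler.enable()

def train_model(batches):
    # Hypothetical instrumentation of the training loop (suggestion 4):
    # the last printed batch index localizes where the crash happens.
    for i, batch in enumerate(batches):
        print(f"processing batch {i}", flush=True)
        # ... forward pass / loss.backward() / optimizer.step() ...

train_model(range(3))  # stand-in for the real batch iterator
```

Note that `faulthandler` only reports the Python frames; the actual fault usually sits in native code (e.g. a PyTorch/CUDA mismatch), but the traceback tells you which call triggered it.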

> You mention you are using a GPU, but the log shows `device: cpu`. You probably need to set the argument `--device` to `cuda:0`. Does it work then?

It works, thanks!