Trying to pretrain with longer max sequence length
DarrenAbramson opened this issue · 5 comments
I've tried to extend your ELECTRA OpenWebText examples by running your preprocessing script with a max sequence length of 768, as follows:
export PYTHONPATH=.
python pretraining/openwebtext/preprocess.py \
--max-seq-length 768 \
--trg-dir data/openwebtext_features_768 \
--n-dataset-building-processes 8
I then created custom JSON files for the ELECTRA generator and discriminator, with the following change from small_generator.json and small_discriminator.json:
"embedding_size": 768,
When I try to do pretraining with the new tokenization and the changed specification I get what looks like a dimension mismatch during training. I have been careful to pass the modified data folder with the new, longer feature tensors.
@DarrenAbramson Thanks for submitting the issue! Could you also paste the specific error trace that you encountered?
2020-09-21 14:18:05,966 : PyTorch version 1.6.0 available.
2020-09-21 14:22:54,700 : TensorFlow version 2.0.0 available.
2020-09-21 14:23:55,160 : loading configuration file pretraining/openwebtext/medium_generator.json
2020-09-21 14:23:55,162 : Model config ElectraConfig {
"architectures": [
"ElectraForMaskedLM"
],
"attention_probs_dropout_prob": 0.1,
"embedding_size": 768,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 192,
"initializer_range": 0.02,
"intermediate_size": 768,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 768,
"model_type": "electra",
"num_attention_heads": 3,
"num_hidden_layers": 12,
"pad_token_id": 0,
"summary_activation": "gelu",
"summary_last_dropout": 0.1,
"summary_type": "first",
"summary_use_proj": true,
"type_vocab_size": 2,
"vocab_size": 30522
}
2020-09-21 14:23:56,924 : loading configuration file pretraining/openwebtext/medium_discriminator.json
2020-09-21 14:23:56,925 : Model config ElectraConfig {
"architectures": [
"ElectraForPreTraining"
],
"attention_probs_dropout_prob": 0.1,
"embedding_size": 768,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"model_type": "electra",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pad_token_id": 0,
"summary_activation": "gelu",
"summary_last_dropout": 0.1,
"summary_type": "first",
"summary_use_proj": true,
"type_vocab_size": 2,
"vocab_size": 30522
}
/pytorch/aten/src/THC/THCTensorIndex.cu:272: indexSelectLargeIndex: block: [218,0,0], thread: [32,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
[... the same assertion repeats for threads 32-95 of blocks 218 and 274, 128 lines in total ...]
Traceback (most recent call last):
File "pretraining/openwebtext/pretrain_medium.py", line 321, in <module>
main()
File "pretraining/openwebtext/pretrain_medium.py", line 317, in main
train(rank=args.gpu, args=args)
File "pretraining/openwebtext/pretrain_medium.py", line 201, in train
loss, loss_mlm, loss_disc, acc_gen, acc_disc, disc_labels, disc_pred = model(input_ids, attention_mask=input_mask, token_type_ids=segment_ids)
@DarrenAbramson I think I may have spotted the error. The discriminator config still has "max_position_embeddings": 512, while the generator has 768. It needs to be increased to 768 as well, or you will hit an out-of-bounds index when looking up the positional embedding for any position past the default limit (512).
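To make the failure mode concrete, here is a minimal pure-Python sketch (not the actual ELECTRA or PyTorch code). It assumes position ids for a sequence run from 0 to seq_len - 1 and are used as row indices into a table with max_position_embeddings rows. The function name `out_of_range_positions` is hypothetical.

```python
def out_of_range_positions(seq_len, max_position_embeddings):
    """Return the position ids that have no row in the position table."""
    # Position ids for one sequence are assumed to be 0, 1, ..., seq_len - 1.
    return [p for p in range(seq_len) if p >= max_position_embeddings]


# Values taken from the configs in the log above: 768-token inputs,
# but only 512 position rows in the discriminator.
bad = out_of_range_positions(seq_len=768, max_position_embeddings=512)
print(len(bad), bad[0], bad[-1])  # 256 512 767
```

Under that assumption, every position from 512 through 767 indexes past the end of the table, which is consistent with the `srcIndex < srcSelectDimSize` assertion in the trace.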
Thanks, I should have seen that. Really appreciate the help.
No problem, I'll see if I can add a line of code to guard against that later tonight. Thanks for trying this out!
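A guard of the kind mentioned above might look like the following. This is only a sketch under stated assumptions, not the check that was actually added to the repository: the function name `check_position_capacity` and its arguments are hypothetical, and it assumes the configs are available as plain dicts (for example, loaded with `json.load` from the generator and discriminator JSON files).

```python
def check_position_capacity(configs, max_seq_length):
    """Raise ValueError if any config has too few position embeddings.

    configs: mapping from a label (e.g. "generator") to a config dict.
    max_seq_length: the sequence length used when preprocessing the data.
    """
    for name, config in configs.items():
        limit = config["max_position_embeddings"]
        if limit < max_seq_length:
            raise ValueError(
                f"{name}: max_position_embeddings={limit} is smaller than "
                f"max_seq_length={max_seq_length}"
            )


# The values below mirror the two configs printed in the log above.
configs = {
    "generator": {"max_position_embeddings": 768},
    "discriminator": {"max_position_embeddings": 512},
}

try:
    check_position_capacity(configs, max_seq_length=768)
except ValueError as err:
    print(err)
```

Running a check like this before the first training step would surface the mismatch as a readable Python error instead of a CUDA device-side assertion.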