last_hidden_states being a string instead of a Tensor
movabo opened this issue · 4 comments
movabo commented
21:40:47 INFO diskdict(20):__init__|: loaded DiskDict with 6726 items from knowledge/bingliuopinion/opinion_polarity.ddict
21:40:47 INFO diskdict(20):__init__|: loaded DiskDict with 6886 items from knowledge/mpqasubjectivity/subjclueslen1-HLTEMNLP05.tff.ddict
21:40:47 INFO diskdict(20):__init__|: loaded DiskDict with 6468 items from knowledge/nrcemolex/NRC-Emotion-Lexicon-Wordlevel-v0.92.txt.ddict
21:40:49 INFO infer(25):__init__|: overwriting: own_model_name=None to grutsc
21:40:49 INFO infer(25):__init__|: overwriting: default_lm=bert-base-uncased to roberta-base
21:40:49 INFO infer(25):__init__|: overwriting: state_dict=None to grutsc
21:40:49 INFO infer(25):__init__|: overwriting: knowledgesources=[] to nrc_emotions mpqa_subjectivity bingliu_opinion
21:40:49 INFO train(1045):prepare_and_start_instructor|: set default language model to roberta-base
21:40:49 INFO train(1038):post_process_arguments|: updated total number of categories to 10 with EKS nrc_emotions
21:40:49 INFO train(1038):post_process_arguments|: updated total number of categories to 13 with EKS mpqa_subjectivity
21:40:49 INFO train(1038):post_process_arguments|: updated total number of categories to 15 with EKS bingliu_opinion
21:40:49 INFO train(1064):prepare_and_start_instructor|: set number of polarity classes to 3
21:40:49 INFO train(1071):prepare_and_start_instructor|: no random seed was given, using system time
21:40:49 INFO train(1072):prepare_and_start_instructor|: setting random seed: 1621885249
21:40:49 INFO train(911):_setup_cuda|: cuda information
21:40:49 INFO train(912):_setup_cuda|: scc SGE_GPU: None
21:40:49 INFO train(913):_setup_cuda|: arg: cuda device: None
21:40:49 INFO train(936):_setup_cuda|: using CPU
21:40:49 INFO train(223):create_transformer_model|: creating model for weights name: roberta-base
21:40:49 INFO train(239):create_transformer_model|: using model_path: roberta-base
Some weights of the model checkpoint at roberta-base were not used when initializing RobertaModel: ['lm_head.bias', 'lm_head.decoder.weight', 'lm_head.dense.weight', 'lm_head.layer_norm.bias', 'lm_head.dense.bias', 'lm_head.layer_norm.weight']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
21:40:55 INFO train(121):__init__|: initialized transformer tokenizers and models
21:40:55 INFO train(148):__init__|: loading weights from pretrained_models/state_dicts/grutsc...
21:40:57 INFO train(151):__init__|: done
21:40:57 INFO train(153):__init__|: initialized own model
21:40:57 INFO train(212):_print_args|: n_trainable_params: 153015555, n_nontrainable_params: 0
21:40:57 INFO train(215):_print_args|: > training arguments:
21:40:57 INFO train(217):_print_args|: >>> training_mode: False
21:40:57 INFO train(217):_print_args|: >>> own_model_name: grutsc
21:40:57 INFO train(217):_print_args|: >>> dataset_name: None
21:40:57 INFO train(217):_print_args|: >>> data_format: None
21:40:57 INFO train(217):_print_args|: >>> optimizer: adam
21:40:57 INFO train(217):_print_args|: >>> initializer: xavier_uniform_
21:40:57 INFO train(217):_print_args|: >>> learning_rate: 2e-05
21:40:57 INFO train(217):_print_args|: >>> dropout: 0.1
21:40:57 INFO train(217):_print_args|: >>> l2reg: 0.01
21:40:57 INFO train(217):_print_args|: >>> num_epoch: 10
21:40:57 INFO train(217):_print_args|: >>> batch_size: 64
21:40:57 INFO train(217):_print_args|: >>> log_step: 5
21:40:57 INFO train(217):_print_args|: >>> max_seq_len: 150
21:40:57 INFO train(217):_print_args|: >>> polarities_dim: 3
21:40:57 INFO train(217):_print_args|: >>> device: cpu
21:40:57 INFO train(217):_print_args|: >>> seed: 1621885249
21:40:57 INFO train(217):_print_args|: >>> local_context_focus: cdm
21:40:57 INFO train(217):_print_args|: >>> SRD: 3
21:40:57 INFO train(217):_print_args|: >>> snem: f1_macro
21:40:57 INFO train(217):_print_args|: >>> devmode: False
21:40:57 INFO train(217):_print_args|: >>> experiment_path: ./
21:40:57 INFO train(217):_print_args|: >>> balancing: None
21:40:57 INFO train(217):_print_args|: >>> spc_lm_representation: mean_last
21:40:57 INFO train(217):_print_args|: >>> spc_input_order: text_target
21:40:57 INFO train(217):_print_args|: >>> use_early_stopping: False
21:40:57 INFO train(217):_print_args|: >>> eval_only_after_last_epoch: False
21:40:57 INFO train(217):_print_args|: >>> pretrained_model_name: None
21:40:57 INFO train(217):_print_args|: >>> state_dict: pretrained_models/state_dicts/grutsc
21:40:57 INFO train(217):_print_args|: >>> single_targets: True
21:40:57 INFO train(217):_print_args|: >>> multi_targets: False
21:40:57 INFO train(217):_print_args|: >>> loss: crossentropy
21:40:57 INFO train(217):_print_args|: >>> targetclasses: newsmtsc3
21:40:57 INFO train(217):_print_args|: >>> knowledgesources: ('nrc_emotions', 'mpqa_subjectivity', 'bingliu_opinion')
21:40:57 INFO train(217):_print_args|: >>> is_use_natural_target_phrase_for_spc: False
21:40:57 INFO train(217):_print_args|: >>> default_lm: roberta-base
21:40:57 INFO train(217):_print_args|: >>> run_id: 0
21:40:57 INFO train(217):_print_args|: >>> coref_mode_in_training: ignore
21:40:57 INFO train(217):_print_args|: >>> base_path: /home/moritz/Documents/Hiwi/NewsMTSC
/home/moritz/anaconda3/envs/newsmtsc/lib/python3.7/site-packages/transformers/tokenization_utils_base.py:2110: FutureWarning: The `pad_to_max_length` argument is deprecated and will be removed in a future version, use `padding=True` or `padding='longest'` to pad to the longest sequence in the batch, or use `padding='max_length'` to pad to a max length. In this case, you can give a specific length with `max_length` (e.g. `max_length=45`) or leave max_length to None to pad to the maximal input size of the model (e.g. 512 for Bert).
FutureWarning,
21:40:57 WARNING dataset(194):_create_word_to_wordpiece_mapping|: overlap when mapping tokens to wordpiece (allow overwriting because Roberta is used)
Traceback (most recent call last):
File "/snap/pycharm-educational/38/plugins/python-ce/helpers/pydev/pydevd.py", line 1483, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "/snap/pycharm-educational/38/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/moritz/Documents/Hiwi/NewsMTSC/infer.py", line 155, in <module>
text_right=", you have to admit that he’s an astute reader of politics.",
File "/home/moritz/Documents/Hiwi/NewsMTSC/infer.py", line 88, in infer
outputs = self.model(inputs)
File "/home/moritz/anaconda3/envs/newsmtsc/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/moritz/Documents/Hiwi/NewsMTSC/models/singletarget/grutscsingle.py", line 132, in forward
(last_hidden_states, knowledge_embedded), dim=2
TypeError: expected Tensor as element 0 in argument 0, but got str
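Unrelated to the crash, the FutureWarning above comes from the deprecated `pad_to_max_length` tokenizer argument; a minimal sketch of the replacement the warning itself suggests, assuming `max_length` should match the `max_seq_len` of 150 printed in the arguments:

```python
from transformers import RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

# Old, deprecated form that triggers the FutureWarning:
#   tokenizer("some text", pad_to_max_length=True, max_length=150)

# Replacement suggested by the warning itself:
encoded = tokenizer(
    "some text",
    padding="max_length",  # pad every sequence up to max_length
    truncation=True,
    max_length=150,        # matches max_seq_len from the printed arguments
)
```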
As for the TypeError itself: last_hidden_states indeed holds the string value "last_hidden_states" (i.e., last_hidden_states = "last_hidden_states") after the statement in NewsMTSC/models/singletarget/grutscsingle.py, lines 102 to 106 in aaa358b.
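This failure mode matches the behaviour of newer transformers releases, where models return a dict-like ModelOutput by default, so tuple-unpacking the result assigns the string keys instead of the tensors. A minimal sketch of the suspected cause and two possible fixes; the exact call in grutscsingle.py may differ:

```python
import torch
from transformers import RobertaModel, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base")
inputs = tokenizer("a short example", return_tensors="pt")

# Suspected cause: recent transformers versions return a dict-like
# ModelOutput by default, and iterating over it yields the *keys*,
# so tuple-unpacking assigns strings rather than tensors:
last_hidden_states, _ = model(**inputs)
print(last_hidden_states)  # "last_hidden_state" -- a str, not a Tensor

# Fix 1: access the tensor by attribute on the returned ModelOutput.
outputs = model(**inputs)
last_hidden_states = outputs.last_hidden_state  # Tensor [batch, seq_len, hidden]

# Fix 2: request plain tuples to restore the old unpacking behaviour.
last_hidden_states, pooler_output = model(**inputs, return_dict=False)

# With a real Tensor, the concatenation from the traceback works again
# (knowledge_embedded is a stand-in with a made-up feature dimension):
knowledge_embedded = torch.zeros(last_hidden_states.shape[0], last_hidden_states.shape[1], 5)
combined = torch.cat((last_hidden_states, knowledge_embedded), dim=2)
```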
fhamborg commented
@movabo this should be fixed if you're using the latest version of the repo. If not, please reopen.
movabo commented
Unfortunately, it seems the problem persists.
I just tried the newest version of the repository; the only changes were the fixed imports (of bert_modeling).
fhamborg commented
- Did you delete the conda environment and create a new one using the (updated) instructions from the (updated) readme?
- If yes, does running infer.py as-is in the repo (without any changes) work? If not, please post the stack trace.
movabo commented
Ah, I did not see that you also changed the required PyTorch version. With 1.7.1 it seems to work! 👍
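For anyone landing here later, a quick sanity check that the environment matches the versions discussed in this thread (1.7.1 is the PyTorch version reported to work here; the repo's readme/requirements are authoritative):

```python
# Confirm the installed versions match what the repo now expects.
import torch
import transformers

print("torch:", torch.__version__)          # reported working here: 1.7.1
print("transformers:", transformers.__version__)
```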