mtlmodel.py error

Question

Hi,

I get an error only when running on a list of files.
When running on that same file by itself, it runs OK.

(nlp_env) F:\nlp_project\HebPipe\hebpipe>python heb_pipe.py  "F:\nlp_project\responsa_texts\all files\all files\*.txt"  --dirout "F:\nlp_project\responsa_texts\hebpipe_output\all files"  --cpu
! You selected no processing options
! Assuming you want all processing steps

Running tasks:
====================
o Automatic sentence splitting (neural)
o Whitespace tokenization
o Morphological segmentation
o POS and Morphological tagging
o Lemmatization
o Dependency parsing
o Entity recognition
o Coreference resolution

Downloading https://raw.githubusercontent.com/stanfordnlp/stanza-resources/main/resources_1.5.0.json: 216kB [00:00, ?B/s]
Some weights of BertModel were not initialized from the model checkpoint at onlplab/alephbert-base and are newly initialized: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Some weights of BertModel were not initialized from the model checkpoint at onlplab/alephbert-base and are newly initialized: ['bert.pooler.dense.weight', 'bert.pooler.dense.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Using bos_token, but it is not set yet.
Using eos_token, but it is not set yet.
Processing שו ת אבני נזר חלק אה ע סימן א.txt
C:\Users\msperka\AppData\Local\anaconda3\envs\nlp_env\lib\site-packages\sklearn\base.py:324: UserWarning: Trying to unpickle estimator LabelEncoder from version 0.23.2 when using version 1.0.1. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:
https://scikit-learn.org/stable/modules/model_persistence.html#security-maintainability-limitations
  warnings.warn(
Processing שו ת אבני נזר חלק אה ע סימן ב.txt
Processing שו ת אבני נזר חלק אה ע סימן ג.txt
Processing שו ת אבני נזר חלק אה ע סימן ד.txt
Processing שו ת אבני נזר חלק אה ע סימן ה.txt
Processing שו ת אבני נזר חלק אה ע סימן ו.txt
Processing שו ת אבני נזר חלק אה ע סימן ז.txt
Processing שו ת אבני נזר חלק אה ע סימן ח.txt
Processing שו ת אבני נזר חלק אה ע סימן ט.txt
Traceback (most recent call last):
  File "heb_pipe.py", line 851, in <module>
    run_hebpipe()
  File "heb_pipe.py", line 828, in run_hebpipe
    processed = nlp(input_text, do_whitespace=opts.whitespace, do_tok=dotok, do_tag=opts.posmorph, do_lemma=opts.lemma,
  File "heb_pipe.py", line 613, in nlp
    tagged_conllu, tokenized, morphs, words = mtltagger.predict(tokenized,sent_tag=sent_tag,checkpointfile=model_dir + 'heb.sbdposmorph.pt')
  File "F:\nlp_project\HebPipe\hebpipe\lib\mtlmodel.py", line 1273, in predict
    split_indices, pos_tags, morphs, words = self.inference(no_pos_lemma,sent_tag=sent_tag,checkpointfile=checkpointfile)
  File "F:\nlp_project\HebPipe\hebpipe\lib\mtlmodel.py", line 1015, in inference
    for i in range(0, len(preds)):
TypeError: object of type 'int' has no len()
Elapsed time: 0:57:44.609
========================================

[שו ת אבני נזר חלק אה ע סימן ט.txt](https://github.com/amir-zeldes/DepEdit/files/12388027/default.txt)
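
For reference, the last line of the traceback is the error Python raises whenever `len()` is called on a bare integer. The sketch below only reproduces that error class in isolation. It assumes, without confirmation from the HebPipe source, that `preds` sometimes arrives as a single int instead of a list. `as_list` is a hypothetical helper for illustration and is not part of HebPipe.

```python
def as_list(preds):
    """Hypothetical helper: wrap a bare scalar so len() and iteration work."""
    if isinstance(preds, (list, tuple)):
        return list(preds)
    return [preds]


# Reproduce the error class from the traceback with a plain int.
try:
    len(3)
except TypeError as exc:
    error_message = str(exc)

# Assumption: normalizing before the loop would avoid the TypeError.
normalized = as_list(3)
for i in range(0, len(normalized)):
    pass  # the loop body from mtlmodel.py is not shown in the traceback
```

If that assumption holds, it may explain why the failure depends on how the input is batched and not on the file's content alone.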