Error POS Training
wannaphong opened this issue · 10 comments
I am running
python nlpnet-train.py pos --gold /home/wannaphong/thainlp/nlpnet/thaipostag.txt
but it raises an error:
Reading training data...
Traceback (most recent call last):
File "nlpnet-train.py", line 244, in <module>
text_reader = create_reader(args, md)
File "nlpnet-train.py", line 48, in create_reader
text_reader = pos.POSReader(md, filename=filename)
File "/usr/local/lib/python2.7/dist-packages/nlpnet/pos/pos_reader.py", line 36, in __init__
self._read_conll(filename)
File "/usr/local/lib/python2.7/dist-packages/nlpnet/pos/pos_reader.py", line 84, in _read_conll
word = fields[ConllPos.word]
IndexError: list index out of range
Data used for POS training: https://gist.github.com/wannaphongcom/3c366ef6f7f499df1aaa842a8adae04a
The index out of range means that some line does not have the necessary fields. Can you double check your file to make sure all lines have the three fields?
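One way to run that double check is a short script that reports every non-blank line with too few columns. This is only a sketch: the function name `find_short_lines` is made up for illustration, and the tab separator and UTF-8 encoding are assumptions about the data file, not something nlpnet is confirmed to require.

```python
import io


def find_short_lines(path, min_fields, sep="\t"):
    """Return (line_number, field_count, text) for each non-blank line
    that has fewer than min_fields columns.

    sep="\t" is an assumption; pass sep=None to split on any whitespace.
    """
    problems = []
    with io.open(path, encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            text = line.strip()
            if not text:
                # Blank lines are skipped; CoNLL-style files use them
                # to separate sentences.
                continue
            fields = text.split(sep)
            if len(fields) < min_fields:
                problems.append((number, len(fields), text))
    return problems
```

Calling `find_short_lines("thaipostag.txt", 3)` returns an empty list if every non-blank line has at least three fields. Otherwise it returns the line numbers to inspect.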
All lines have the three fields. https://gist.github.com/wannaphongcom/f699176c44f9abde19080a627da22b6c
I just found the problem: nlpnet expects the POS tag to be in the fourth column, as shown here, while your data have it in the third one. This is the default CoNLL format.
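A minimal sketch of moving the tag from the third column to the fourth is shown here. It assumes the three columns are a token index, the word, and the POS tag, that columns are tab-separated, and that a `_` placeholder is acceptable in the unused third column. None of these assumptions is confirmed in this thread, and the function name is hypothetical.

```python
def move_tag_to_fourth_column(line, sep="\t", filler="_"):
    """Rewrite a three-column line so the tag lands in the fourth column.

    Assumes the input columns are (index, word, tag). Lines that do not
    have exactly three fields, including blank sentence separators, are
    returned unchanged apart from the trailing newline.
    """
    stripped = line.rstrip("\n")
    fields = stripped.split(sep)
    if len(fields) != 3:
        return stripped
    index, word, tag = fields
    # Insert a placeholder so the tag moves from column 3 to column 4.
    return sep.join([index, word, filler, tag])
```

Applied line by line to the training file, this leaves blank separator lines empty and pads every three-column line to four columns.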
@erickrf I tried adding a fourth column, but the error still occurs. https://gist.github.com/wannaphongcom/14bd7805a84a9aa3a974c9714c692d31
Reading training data...
Traceback (most recent call last):
File "nlpnet-train.py", line 244, in <module>
text_reader = create_reader(args, md)
File "nlpnet-train.py", line 48, in create_reader
text_reader = pos.POSReader(md, filename=filename)
File "/usr/local/lib/python2.7/dist-packages/nlpnet/pos/pos_reader.py", line 36, in __init__
self._read_conll(filename)
File "/usr/local/lib/python2.7/dist-packages/nlpnet/pos/pos_reader.py", line 84, in _read_conll
word = fields[ConllPos.word]
IndexError: list index out of range
Ok, I found and fixed a bug in the code for reading data. It is working now.
Hello, I am getting a similar error. Even after adding a 4th column, and even up to a 10th column, the same error appears.
Reading training data...
Traceback (most recent call last):
File "/usr/local/bin/nlpnet-train.py", line 4, in <module>
__import__('pkg_resources').run_script('nlpnet==1.2.3', 'nlpnet-train.py')
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 661, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/local/lib/python2.7/dist-packages/pkg_resources/__init__.py", line 1448, in run_script
exec(script_code, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/nlpnet-1.2.3-py2.7-linux-x86_64.egg/EGG-INFO/scripts/nlpnet-train.py", line 244, in <module>
File "/usr/local/lib/python2.7/dist-packages/nlpnet-1.2.3-py2.7-linux-x86_64.egg/EGG-INFO/scripts/nlpnet-train.py", line 57, in create_reader
File "build/bdist.linux-x86_64/egg/nlpnet/srl/srl_reader.py", line 65, in __init__
File "build/bdist.linux-x86_64/egg/nlpnet/srl/srl_reader.py", line 114, in _read_conll
IndexError: list index out of range
Do you have any ideas what is happening?
@juanmed hard to say without seeing your data. Are you sure there are no flaws, such as lines missing some columns?
I verified the data following your comments at the package webpage and inside the source code, and it seems to be correct. There is definitely a problem but I cannot see it. Can I send you a small .txt file with a sample for you to look at the data? Do you have any email?
Send it to erickrfonseca@gmail.com