Stemmer Not Working as Expected
thedamnedrhino opened this issue · 2 comments
I am executing the script using perl perstem.pl < input > output, where input is a file containing the single line
کتابهاصفحه ها صفحهی صفحه ی کار کن کارکن
I would expect the stemmer to make many changes to this line, but the output file contains exactly the same data as the input (the diff shell command shows no difference between the two files). I have also tried all the relevant options (-s, --irreg-stem, -t 1) to no avail. Running the script as perl perstem.pl < input | cat > output gives the same result.
I have the same problem, and I also get these messages:
Use of the encoding pragma is deprecated at perstem.pl line 133.
Use of the encoding pragma is deprecated at perstem.pl line 137.
Just replace use encoding "utf8" with use utf8 and it should work :)
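For anyone trying the fix above, here is a minimal sketch of what the replacement can look like in a standalone script. One caveat, which is my own reading and not something confirmed in this thread: the old encoding pragma also set the encoding of STDIN and STDOUT, whereas use utf8 only declares that the source file itself is UTF-8. The use open line below is therefore an assumption about what may additionally be needed, and the script name check_utf8.pl is hypothetical; none of this is taken from perstem.pl itself.

```perl
#!/usr/bin/env perl
# check_utf8.pl (hypothetical helper, not part of perstem.pl)
use strict;
use warnings;

# Replaces the deprecated:  use encoding "utf8";
use utf8;                 # the script's own source text is UTF-8
use open qw(:std :utf8);  # assumption: also decode STDIN and encode STDOUT/STDERR as UTF-8

while (my $line = <STDIN>) {
    chomp $line;
    # With the I/O layers above, length() counts characters rather than bytes,
    # which is a quick way to see whether the input is being decoded at all.
    print length($line), "\t", $line, "\n";
}
```

Running perl check_utf8.pl < input and comparing the printed length with the number of characters you see in the line is a rough check that decoding works on your Perl version before changing perstem.pl.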