pattern token with colon characters in the value field
Closed this issue · 1 comments
nicolashernandez commented
The error can be reproduced with the following command line:
python3 pyrata_re.py 'pos="PRO:PER"' "[{'lemma': 'se', 'pos': 'PRO:PER'}]" --pyrata_data --log
The command produces the following traceback:
Traceback (most recent call last):
  File "pyrata_re.py", line 137, in main
    result = compiled_nfa.search(s, mode = mode, pos = pos, endpos = endpos)
  File "/media/hernandez-n/ext4/workspace/17/PyRATA/pyrata/pyrata/compiled_pattern.py", line 563, in search
    an_nfa.step(c, self.lexicons)
  File "/media/hernandez-n/ext4/workspace/17/PyRATA/pyrata/pyrata/nfa.py", line 203, in step
    states_add.update(self.__step_special_state(char, None, cs, lexicons))
  File "/media/hernandez-n/ext4/workspace/17/PyRATA/pyrata/pyrata/nfa.py", line 328, in __step_special_state
    states_add.update(self.__step_special_state(char, state, os, lexicons))
  File "/media/hernandez-n/ext4/workspace/17/PyRATA/pyrata/pyrata/nfa.py", line 237, in __step_special_state
    step_evaluation = state.symbolic_step_expression[0].subs(substitution_list)
AttributeError: 'tuple' object has no attribute 'subs'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pyrata_re.py", line 299, in <module>
    main() # (sys.argv) # FIXME sys ?
  File "pyrata_re.py", line 140, in main
    except pyrata.nfa.CompiledPattern.InvalidRegexPattern as e:
AttributeError: module 'pyrata.nfa' has no attribute 'CompiledPattern'
nicolashernandez commented
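The problem comes from the sympy dependency: its symbols() function treats the colon ':' character as a range operator when parsing symbol names, as described in [1]. A value such as PRO:PER is therefore expanded into a tuple of symbols instead of a single Symbol, which explains the "'tuple' object has no attribute 'subs'" error.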
In syntactic_step_parser.py, the line
var[indice] = symbols(single_constraint_string.replace(' ','\\ '))
is changed to
var[indice] = symbols(single_constraint_string.replace(' ','\\ ').replace(':','\\:'))
To solve the issue, I simply escaped the colon character. The fix will be included in the next push.
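For illustration, the escaping can be isolated in a small helper. This is only a sketch: the names escape_symbol_name and make_symbol are hypothetical and are not part of PyRATA. The helper reproduces the replace chain from the fix, and sympy is imported lazily so the string handling can be checked without it.

```python
def escape_symbol_name(name: str) -> str:
    """Escape the characters that sympy.symbols() treats as syntax.

    Mirrors the replace chain used in the fix: a space separates names and
    a colon introduces a range, so both get a backslash prefix.
    (Hypothetical helper, not PyRATA code.)
    """
    return name.replace(' ', '\\ ').replace(':', '\\:')


def make_symbol(name: str):
    """Build a sympy symbol from an arbitrary constraint string.

    (Hypothetical helper; requires sympy to be installed.)
    """
    from sympy import symbols  # lazy import: only needed when called
    return symbols(escape_symbol_name(name))


# The string part can be checked on its own:
print(escape_symbol_name('pos="PRO:PER"'))  # pos="PRO\:PER"
```

With sympy installed, make_symbol('pos="PRO:PER"') should return a single Symbol rather than a tuple, so the later call to .subs() no longer fails.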
[1] http://docs.sympy.org/1.0/_modules/sympy/core/symbol.html.