TypeError:
Hi,
Thanks for creating this, I'm looking forward to using it to generate NER training data.
However, it is throwing an error when I try to run the annotator, even when I just press "submit" with no entities labeled.
I am using a custom model trained with spacy 2.2.4. Other potentially relevant info:
Python=3.6 (have also tried 3.7)
pandas=1.0.3
re=2.2.1
ipywidgets=7.5.1
This is the traceback:
~/Smith_Scripts/NLP_GeneExpression/spacy-annotator/annotator/active_annotations.py in on_click(btn)
185 btn = Button(description='submit')
186 def on_click(btn):
--> 187 add_annotation(sample, ta.value, regex_flags)
188 ta.value = reset_textarea()
189 btn.on_click(on_click)
~/Smith_Scripts/NLP_GeneExpression/spacy-annotator/annotator/active_annotations.py in add_annotation(df, annotation, regex_flags)
138
139 if item: # This controls for potential input error such as input empty string after comma
--> 140 r = re.compile(f'\\b{item}\\b', flags=regex_flags)
141 spans.extend([(span.start(), span.end(), label) for span in r.finditer(sample[col_text][current_index])])
142
~/anaconda3/envs/NLP2/lib/python3.6/re.py in compile(pattern, flags)
231 def compile(pattern, flags=0):
232 "Compile a regular expression pattern, returning a pattern object."
--> 233 return _compile(pattern, flags)
234
235 def purge():
~/anaconda3/envs/NLP2/lib/python3.6/re.py in _compile(pattern, flags)
299 if not sre_compile.isstring(pattern):
300 raise TypeError("first argument must be string or compiled pattern")
--> 301 p = sre_compile.compile(pattern, flags)
302 if not (flags & DEBUG):
303 if len(_cache) >= _MAXCACHE:
~/anaconda3/envs/NLP2/lib/python3.6/sre_compile.py in compile(p, flags)
560 if isstring(p):
561 pattern = p
--> 562 p = sre_parse.parse(p, flags)
563 else:
564 pattern = None
~/anaconda3/envs/NLP2/lib/python3.6/sre_parse.py in parse(str, flags, pattern)
853
854 try:
--> 855 p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
856 except Verbose:
857 # the VERBOSE flag was switched on inside the pattern. to be
TypeError: unsupported operand type(s) for &: 'NoneType' and 'int'
Hi! Happy to have a look.
Can you provide a minimal reproducible example?
I couldn't reproduce the error.
Thanks :)
Hm, so this error results from an almost exact replica of your example code:
import pandas as pd
import os
import re
from annotator.active_annotations import annotate
resultDirectory = '/home/smith/Spacy_Models/NERtrained_062420/'
file = 'NER_Results_Signaling_0to10M.csv'
df = pd.read_csv(os.path.join(resultDirectory, file), index_col=0, encoding='utf8')
dd = annotate(df,
              col_text='Sentence',
              labels=['REGION', 'PERSON', 'DATE'],
              sample_size=1,
              model='en_core_web_lg',
              regex_flags=re.IGNORECASE
              )
I've since found that it works properly if I omit the regex_flags argument of the annotate function, but passing any value there at all (including regex_flags=None) throws the above error. The regex flags aren't a huge deal for my annotation purposes, but if you happen to know a solution that would be great.
Thanks!
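The traceback above ends in `flags & SRE_FLAG_VERBOSE` with `flags` set to `None`, which suggests that `None` reached `re.compile` somewhere along the way. Below is a minimal sketch of a guard, assuming the span-matching logic resembles lines 140-141 of the traceback. `normalize_regex_flags` and `find_spans` are hypothetical helper names, not part of spacy-annotator.

```python
import re


def normalize_regex_flags(regex_flags):
    """Map None to 0 (no flags) so the value is safe to hand to re.compile."""
    return 0 if regex_flags is None else regex_flags


def find_spans(text, item, label, regex_flags=None):
    """Return (start, end, label) for each whole-word match of item in text."""
    # Same pattern shape as in the traceback; item is interpolated unescaped,
    # so regex metacharacters in item are interpreted as regex syntax.
    pattern = re.compile(f'\\b{item}\\b', flags=normalize_regex_flags(regex_flags))
    return [(m.start(), m.end(), label) for m in pattern.finditer(text)]


# Passing None straight through to re.compile raises a TypeError.
try:
    re.compile(r'\bparis\b', flags=None)
except TypeError as exc:
    print(f'TypeError: {exc}')

# With the guard in place, None and a real flag both work.
print(find_spans('Paris and paris', 'paris', 'REGION', None))
print(find_spans('Paris and paris', 'paris', 'REGION', re.IGNORECASE))
```

This only shows that `flags=None` is enough to trigger the same kind of `TypeError`; it does not establish where the annotator's value becomes `None`.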
Hi,
Glad to hear you found a temporary solution.
To investigate the issue, it would help if you could provide a small toy dataframe that I can use to reproduce the error :)
I suspect you are passing a variable as a number and not as a string. But I might be wrong. It would be better to test it so that I can then improve instructions for everyone.
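One quick way to test that suspicion: flags from the `re` module, such as `re.IGNORECASE`, are integer-like (`re.RegexFlag` is an `enum.IntFlag`), so a type check on the value being passed can rule it in or out. A small sketch follows; `is_valid_regex_flags` is a hypothetical helper, not something the annotator provides.

```python
import re


def is_valid_regex_flags(value):
    """True if value is integer-like, which is what re.compile expects for flags."""
    return isinstance(value, int)


# Real flags and 0 pass the check; None and strings do not.
for candidate in (re.IGNORECASE, re.IGNORECASE | re.MULTILINE, 0, None, 'IGNORECASE'):
    print(repr(candidate), is_valid_regex_flags(candidate))
```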
Closing for lack of engagement/information from the issue author.