ieriii/spacy-annotator

TypeError:

Closed this issue · 5 comments

Hi,
Thanks for creating this, I'm looking forward to using it to generate NER training data.

However, it is throwing an error when I try to run the annotator, even when I just press "submit" with no entities labeled.

I am using a custom model trained with spacy 2.2.4. Other potentially relevant info:
Python=3.6 (have also tried 3.7)
pandas=1.0.3
re=2.2.1
ipywidgets=7.5.1

This is the traceback:

~/Smith_Scripts/NLP_GeneExpression/spacy-annotator/annotator/active_annotations.py in on_click(btn)
    185     btn = Button(description='submit')
    186     def on_click(btn):
--> 187         add_annotation(sample, ta.value, regex_flags)
    188         ta.value = reset_textarea()
    189     btn.on_click(on_click)

~/Smith_Scripts/NLP_GeneExpression/spacy-annotator/annotator/active_annotations.py in add_annotation(df, annotation, regex_flags)
    138 
    139                     if item:   # This controls for potential input error such as input empty string after comma
--> 140                         r = re.compile(f'\\b{item}\\b', flags=regex_flags)
    141                         spans.extend([(span.start(), span.end(), label) for span in r.finditer(sample[col_text][current_index])])
    142 

~/anaconda3/envs/NLP2/lib/python3.6/re.py in compile(pattern, flags)
    231 def compile(pattern, flags=0):
    232     "Compile a regular expression pattern, returning a pattern object."
--> 233     return _compile(pattern, flags)
    234 
    235 def purge():

~/anaconda3/envs/NLP2/lib/python3.6/re.py in _compile(pattern, flags)
    299     if not sre_compile.isstring(pattern):
    300         raise TypeError("first argument must be string or compiled pattern")
--> 301     p = sre_compile.compile(pattern, flags)
    302     if not (flags & DEBUG):
    303         if len(_cache) >= _MAXCACHE:

~/anaconda3/envs/NLP2/lib/python3.6/sre_compile.py in compile(p, flags)
    560     if isstring(p):
    561         pattern = p
--> 562         p = sre_parse.parse(p, flags)
    563     else:
    564         pattern = None

~/anaconda3/envs/NLP2/lib/python3.6/sre_parse.py in parse(str, flags, pattern)
    853 
    854     try:
--> 855         p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
    856     except Verbose:
    857         # the VERBOSE flag was switched on inside the pattern.  to be

TypeError: unsupported operand type(s) for &: 'NoneType' and 'int'

Hi! Happy to have a look.
Can you provide a minimal reproducible example?
I couldn't reproduce the error.

Thanks :)

Hm, so this error results from an almost exact replica of your example code:

import pandas as pd
import os
import re
from annotator.active_annotations import annotate

resultDirectory = '/home/smith/Spacy_Models/NERtrained_062420/'
file = 'NER_Results_Signaling_0to10M.csv'

df = pd.read_csv(os.path.join(resultDirectory, file), index_col=0, encoding='utf8')

dd = annotate(df,
            col_text = 'Sentence',
            labels = ['REGION', 'PERSON', 'DATE'],
            sample_size=1,
            model = 'en_core_web_lg',
            regex_flags=re.IGNORECASE
             )

I've since found that it works properly if I omit the regex_flags argument of the annotate function, but passing any value for it (including regex_flags=None) throws the error above. The regex flags aren't a huge deal for my annotation purposes, but if you happen to know a solution, that would be great.
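For what it's worth, the last frame of the traceback fails on `flags & SRE_FLAG_VERBOSE` with a `NoneType` operand, which suggests that `flags` is `None` by the time it reaches `re.compile`. I don't know where in the annotator that happens, so the following is only a minimal sketch of the failure and one possible guard. `compile_word_pattern` is a hypothetical helper, not a function from spacy-annotator, and the sample sentence is made up:

```python
import re


def compile_word_pattern(item, regex_flags=None):
    """Hypothetical helper (not part of spacy-annotator).

    Builds the same whole-word pattern as the line in the traceback,
    but treats regex_flags=None as "no flags" (0) before compiling.
    """
    flags = 0 if regex_flags is None else regex_flags
    return re.compile(f'\\b{item}\\b', flags=flags)


# Passing None directly as flags raises a TypeError inside the re module.
try:
    re.compile('\\bgene\\b', flags=None)
except TypeError as exc:
    print('re.compile rejected flags=None:', exc)

# With the guard, None falls back to a case-sensitive search,
# and a real flag such as re.IGNORECASE is passed through unchanged.
text = 'Gene expression of the gene'  # made-up sample sentence
print(compile_word_pattern('gene', None).findall(text))
print(compile_word_pattern('gene', re.IGNORECASE).findall(text))
```

If something like this guard were applied just before the `re.compile` call in `add_annotation`, `regex_flags=None` should no longer crash. It would not explain why `re.IGNORECASE` also fails in my case, so the flag may be getting lost somewhere between `annotate` and `add_annotation`.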

Thanks!

Hi,
Glad to hear you found a temporary solution.

To investigate the issue, it would be good if you can provide a small toy dataframe which I can use to reproduce the error :)

I suspect you are passing a variable as a number and not as a string. But I might be wrong. It would be better to test it so that I can then improve instructions for everyone.

Closing for lack of engagement/information from the issue author.