Abhijit-2592/spacy-langdetect

Spacy V3 decorator string name

rennanvoa2 opened this issue · 5 comments

Hello guys,
With the V3 update when I run the example code it complains:

ValueError: [E966] `nlp.add_pipe` now takes the string name of the registered component factory, not a callable component. Expected string, but got <spacy_cld.spacy_cld.LanguageDetector object at 0x7fb8d9051ed0> (name: 'None').

- If you created your component with `nlp.create_pipe('name')`: remove nlp.create_pipe and call `nlp.add_pipe('name')` instead.

- If you passed in a component like `TextCategorizer()`: call `nlp.add_pipe` with the string name instead, e.g. `nlp.add_pipe('textcat')`.

- If you're using a custom component: Add the decorator `@Language.component` (for function components) or `@Language.factory` (for class components / factories) to your custom component and assign it a name, e.g. `@Language.component('your_name')`. You can then run `nlp.add_pipe('your_name')` to add it to the pipeline.

I figured out that we now have to pass a string name to nlp.add_pipe, but which one?

I've tried nlp.add_pipe("langdetect"), nlp.add_pipe("LanguageDetector"), and nlp.add_pipe("languagedetector"), and none of them works.

Can you help me with this?

Hi,

Since I'm new to spaCy and Python, I'm not sure if this is the correct way to implement it, but for Python 3.9 with spaCy 3.0.3 the following worked for me:

import spacy
from spacy.language import Language
from spacy_langdetect import LanguageDetector

# Add LanguageDetector and assign it a string name
@Language.factory("language_detector")
def create_language_detector(nlp, name):
    return LanguageDetector(language_detection_function=None)

# Use a blank pipeline; a trained model also works, e.g. nlp = spacy.load("en_core_web_sm")
nlp = spacy.blank("en")

# Add sentencizer for longer text
nlp.add_pipe('sentencizer')

# Add components using their string names
nlp.add_pipe("language_detector")

# Analyze components and their attributes
text = "This is an English text."
doc = nlp(text)

# Document level language detection.
print(doc._.language)

# See what happened to the pipes
nlp.analyze_pipes(pretty=True)

I got on this track with: Language-specific pipeline

Is this the right way to use it with spaCy 3?

How do I use the result for language-specific processing?
Do I have to load language-specific models, e.g.
nlp_en = spacy.load("en_core_web_sm") and
nlp_de = spacy.load("de_core_news_sm")?
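One pattern (a sketch of my own, not something from the spacy-langdetect docs) is to run the detector first and then dispatch each text to the matching pipeline. The mapping below uses the standard small spaCy model names; the `route` helper is hypothetical, and in real code each value would be replaced by the result of `spacy.load(...)` once the models are downloaded:

```python
# Hypothetical dispatch table: detected language code -> model name.
# In practice you would store loaded pipelines here, e.g.
# {"en": spacy.load("en_core_web_sm"), "de": spacy.load("de_core_news_sm")}.
MODELS = {
    "en": "en_core_web_sm",
    "de": "de_core_news_sm",
}

def route(detected_language: str) -> str:
    """Pick the model for the detected language, falling back to English."""
    return MODELS.get(detected_language, MODELS["en"])
```

With loaded pipelines in the table, you would call the chosen pipeline on the text a second time to get language-specific tagging and parsing.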

Many thanks and best regards,

Cusard

same problem

Hello everybody!
Thanks to @Cusard I got the example code to work with the current spacy version.

import spacy
from spacy.language import Language
from spacy_langdetect import LanguageDetector

@Language.factory("language_detector")
def create_language_detector(nlp, name):
    return LanguageDetector(language_detection_function=None)

nlp = spacy.load("en_core_web_sm")

nlp.add_pipe('language_detector')
text = 'This is an English text.'
doc = nlp(text)
# document level language detection. Think of it like average language of the document!
print(doc._.language)
# sentence level language detection
for sent in doc.sents:
    print(sent, sent._.language)

The output looks like this:

{'language': 'en', 'score': 0.9999983570159962}
This is an english text. {'language': 'en', 'score': 0.9999956329695125}
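Since `doc._.language` is a plain dict with `language` and `score` keys (as the output above shows), downstream code can gate on the detector's confidence. A minimal sketch, with an arbitrary threshold of my own choosing:

```python
def confident_language(result: dict, threshold: float = 0.9):
    """Return the language code only when the score clears the threshold,
    otherwise None (treat the detection as unreliable)."""
    if result.get("score", 0.0) >= threshold:
        return result["language"]
    return None
```

For example, `confident_language(doc._.language)` would return `'en'` for the document above, while a low-score result would yield `None`.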

Thanks for sharing the solution. It worked for me too.

It would be nice if the project home page were updated with this example: https://spacy.io/universe/project/spacy-langdetect

The example provided by @FelixSiegfriedRiedel works for me with v3.3.

I've also raised an issue about updating the documentation: explosion/spaCy#11038