TimSchopf/KeyphraseVectorizers

ValueError: Transformation generated invalid chunkstring

Kowsalya-Mouttouramane opened this issue · 2 comments

from keyphrase_vectorizers import KeyphraseCountVectorizer

test = ["Les voitures autonomes déplacent la responsabilité de l'assurance vers les constructeurs"]
vectorizer_fr = KeyphraseCountVectorizer(spacy_pipeline='fr_dep_news_trf', pos_pattern='<N.*>+', stop_words='french')
vectorizer_fr.fit(test)

This raises a ValueError: Transformation generated invalid chunkstring:
<><><><><><><><><><><><>

The same code works with other languages; the problem occurs only with the French spaCy models (whichever French model I use).
Can anyone help me solve this error, please?

Hi @Kowsalya-Mouttouramane,
Please see issue #2. For the French pipeline, you need to add the transformer pipeline component.
You can use my fork, or create a pull request to enable custom components.
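
For context on the error message itself: the string `<><><>...` suggests that each of the 12 tokens in the sentence reached the chunker with an empty POS tag, which would fit a tagging-related pipeline component not having run. The snippet below is only a minimal stdlib sketch of that idea, not the library's actual code. The regex is a simplified imitation of the kind of validity check a chunk parser applies (assuming each tag must contain at least one character between `<` and `>`), and the helper names `build_chunkstring` and `is_valid` are hypothetical.

```python
import re

# Simplified imitation of a chunk string validity check (assumption:
# each tag needs at least one character between "<" and ">").
VALID_CHUNKSTRING = re.compile(r"^(<[^{}<>]+>)*$")


def build_chunkstring(tagged_tokens):
    """Join the POS tags of (token, tag) pairs into a chunk string."""
    return "".join(f"<{tag}>" for _, tag in tagged_tokens)


def is_valid(chunkstring):
    """Return True if every tag in the chunk string is non-empty."""
    return VALID_CHUNKSTRING.match(chunkstring) is not None


tokens = ["Les", "voitures", "autonomes", "déplacent", "la", "responsabilité",
          "de", "l'", "assurance", "vers", "les", "constructeurs"]

# Hypothetical case 1: no component assigned tags, so every tag is empty.
untagged = [(tok, "") for tok in tokens]
# Hypothetical case 2: placeholder tags for illustration, not real model output.
tagged = [(tok, "NOUN") for tok in tokens]

empty_chunkstring = build_chunkstring(untagged)
print(empty_chunkstring)
print(is_valid(empty_chunkstring))
print(is_valid(build_chunkstring(tagged)))
```

If this reading is right, a quick check is to run the sentence through the spaCy pipeline on its own and inspect the tags of the resulting tokens before passing anything to the vectorizer.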

Closing this as a duplicate of issue #2.