PreProcessing

Question

pratikghanwat7 opened this issue 4 years ago · 2 comments

Hello,
Are you doing any kind of preprocessing on the input text, such as stopword removal, tokenization, lemmatization, or any other text cleaning?

Answer 1 · 2020-11-25T14:18:54.000Z

In the basic setup, I am not doing any preprocessing. In some research over a year ago, I looked into it using different spaCy operations, but the results were largely inconclusive (I also didn't spend much time on it). With the current library, this could be done with a custom SentenceHandler.
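
To make the custom SentenceHandler idea a bit more concrete, here is a rough, standalone sketch of a handler that removes stopwords before returning sentences. Everything in it is an assumption for illustration: the class name `PreprocessingSentenceHandler`, the tiny `STOPWORDS` set, the regex sentence splitter, and the `process(body, min_length, max_length)` interface are hypothetical and not taken from the library, so check the library's own SentenceHandler before adapting it.

```python
import re
from typing import Iterable, List

# Tiny illustrative stopword list (an assumption, not the library's list).
STOPWORDS = {"a", "an", "the", "of", "and", "or", "is", "are", "to", "in"}


class PreprocessingSentenceHandler:
    """Hypothetical handler: splits text into sentences and drops stopwords.

    The method names and defaults are assumptions made for this sketch.
    """

    def __init__(self, stopwords: Iterable[str] = STOPWORDS):
        self.stopwords = set(stopwords)

    def _clean(self, sentence: str) -> str:
        # Drop whitespace-separated tokens whose lowercase form, ignoring
        # surrounding punctuation, is a stopword.
        kept = [
            word
            for word in sentence.split()
            if word.lower().strip(".,;:!?") not in self.stopwords
        ]
        return " ".join(kept)

    def process(self, body: str, min_length: int = 40, max_length: int = 600) -> List[str]:
        # Naive split after sentence-ending punctuation; a real handler
        # would more likely use spaCy's sentence segmentation.
        sentences = re.split(r"(?<=[.!?])\s+", body.strip())
        cleaned = [self._clean(s) for s in sentences]
        # Keep only cleaned sentences within the length bounds.
        return [s for s in cleaned if min_length < len(s) < max_length]

    def __call__(self, body: str, min_length: int = 40, max_length: int = 600) -> List[str]:
        return self.process(body, min_length, max_length)


handler = PreprocessingSentenceHandler()
sentences = handler("The cat is on the mat. Dogs are loud.", min_length=0)
```

How the handler gets passed into the summarizer is left out here, since this thread doesn't spell that out. Note also that stripping stopwords changes the text the model sees, which may or may not help, in line with the inconclusive results mentioned above.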

Answer 2 · 2020-11-25T15:03:51.000Z

I have used your model. It works perfectly for short sentences, but it kind of breaks down on larger documents. I wanted to give you more insight, but right now I am busy with a project, so I will get back to you with a detailed explanation later.