MLBazaar/MLPrimitives

mlprimitives.custom.text.TextCleaner fails if text is empty

csala opened this issue · 0 comments

csala commented

When the collection of texts to clean contains an empty string "", mlprimitives.custom.text.TextCleaner._remove_stopwords crashes with a LangDetectException, because langdetect finds no features in the empty text:

In [1]: from mlprimitives.custom.text import TextCleaner

In [2]: cleaner = TextCleaner()

In [3]: cleaner.produce(['not empty', ''])
---------------------------------------------------------------------------
LangDetectException                       Traceback (most recent call last)
<ipython-input-3-342ec016e729> in <module>
----> 1 cleaner.produce(['not empty', ''])
...
~/.virtualenvs/MLPrimitives/lib/python3.6/site-packages/langdetect/detector.py in _detect_block(self)
    148         ngrams = self._extract_ngrams()
    149         if not ngrams:
--> 150             raise LangDetectException(ErrorCode.CantDetectError, 'No features in text.')
    151 
    152         self.langprob = [0.0] * len(self.langlist)

LangDetectException: No features in text.
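
One possible way to avoid the crash is to guard the language detection call so that texts without features fall back to a default instead of raising. The following is only a minimal sketch of that idea, not the actual TextCleaner code: `safe_detect` and `fake_detect` are hypothetical names introduced here, and `fake_detect` is a stand-in that merely imitates the "No features in text." failure so the example runs without langdetect installed.

```python
def safe_detect(text, detect, errors=(Exception,), default=None):
    """Detect the language of `text`, returning `default` when detection is impossible.

    `detect` is any callable mapping a string to a language code; `errors`
    is the exception type (or tuple of types) it raises when the text has
    no usable features. Both are assumptions of this sketch.
    """
    # Empty or whitespace-only strings have nothing to detect, so skip the call.
    if not text or not text.strip():
        return default
    try:
        return detect(text)
    except errors:
        # Non-empty text may still yield no features for the detector.
        return default


def fake_detect(text):
    """Hypothetical stand-in for a real detector: fails on text without letters."""
    if not any(ch.isalpha() for ch in text):
        raise ValueError('No features in text.')
    return 'en'


texts = ['not empty', '', '   ', '1234']
languages = [safe_detect(t, fake_detect, errors=(ValueError,)) for t in texts]
print(languages)  # ['en', None, None, None]
```

With the real library, the same helper could presumably be called as `safe_detect(text, langdetect.detect, errors=(LangDetectException,))`, with `LangDetectException` imported from `langdetect.lang_detect_exception`. The caller would then need to decide what removing stopwords means for a text whose language is `None`, for example returning the text unchanged.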