Implementation takes really long to give output
raviranjan-innoplexus opened this issue · 1 comments
raviranjan-innoplexus commented
Hi, I am using this library but am getting extremely slow results. For 10k records of text, it has taken more than 16 hours to process 160 of 1920 tasks after repartitioning. I am wondering whether the named entity extraction runs in parallel, or whether the executors queue up one after another for named entity recognition. Non-parallel Python scripts seem to run faster than this. Any suggestions or workarounds would be highly appreciated.
semantiDan commented
I'm seeing the same extreme slowness when benchmarking against NLTK (Vader) and Spark-core (JohnSnow).
For 1 million rows of sentiment analysis:
- Spark-Core NLP (JohnSnow 1.6.3) finishes the job in 4 min 30 secs.
- NLTK (Vader) NLP finishes the job in 6 min 30 secs.
- Stanford-Core NLP never finishes the job; it runs for more than 1 hour.
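
To put the figures reported in this thread on a common scale, here is a small Python sketch that converts them to rows per second and projects the total run time of the 1920-task job. The helper names `rows_per_second` and `projected_total_hours` are made up for illustration, and the projection assumes tasks are of equal size and complete at a constant rate, which may not hold on a real cluster.

```python
def rows_per_second(rows: int, minutes: int, seconds: int = 0) -> float:
    """Average throughput for a job that processed `rows` in the given time."""
    return rows / (minutes * 60 + seconds)


def projected_total_hours(done: int, total: int, elapsed_hours: float) -> float:
    """Linear extrapolation of total run time (assumes a constant task rate)."""
    return elapsed_hours * total / done


if __name__ == "__main__":
    # Benchmark figures quoted above: 1 million rows of sentiment analysis.
    print(f"JohnSnow 1.6.3: {rows_per_second(1_000_000, 4, 30):.0f} rows/s")
    print(f"NLTK (Vader):   {rows_per_second(1_000_000, 6, 30):.0f} rows/s")
    # The Stanford run did not finish within an hour, so this is only an upper bound.
    print(f"Stanford:       < {rows_per_second(1_000_000, 60):.0f} rows/s")

    # Original report: 160 of 1920 tasks done after 16 hours.
    print(f"Projected total: {projected_total_hours(160, 1920, 16):.0f} hours")
```

Under those assumptions the two finished runs come out at roughly 3,700 and 2,560 rows per second, the unfinished run at under 280 rows per second, and the 1920-task job at about 192 hours in total.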