Implementation takes really long to give output

Question

raviranjan-innoplexus opened this issue 8 years ago · 1 comments

raviranjan-innoplexus commented 8 years ago

Hi, I am using this library but am getting extremely slow results. For 10k records of text, it has taken longer than 16 hours to process 160 tasks out of 1920 after re-partitioning. I am wondering whether the name extraction runs in parallel, or whether the executors queue up one after the other for named entity recognition. Non-parallel Python scripts seem to work faster than this. Any suggestions or workarounds would be highly appreciated.
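To put the reported numbers in perspective, here is a rough extrapolation using only the figures from the question. It assumes that all tasks are about the same size and that the completion rate stays constant, neither of which is confirmed; the function and variable names are made up for illustration.

```python
# Back-of-the-envelope extrapolation from the figures in the question.
# Assumption: equal-sized tasks and a constant completion rate.
TOTAL_RECORDS = 10_000
TOTAL_TASKS = 1920
DONE_TASKS = 160
ELAPSED_HOURS = 16


def extrapolate(total_records, total_tasks, done_tasks, elapsed_hours):
    """Return average records per task, wall-clock minutes per completed
    task, and the projected wall-clock hours for the whole job."""
    hours_per_done_task = elapsed_hours / done_tasks
    return {
        "records_per_task": total_records / total_tasks,
        "minutes_per_done_task": hours_per_done_task * 60,
        "projected_total_hours": hours_per_done_task * total_tasks,
    }


estimate = extrapolate(TOTAL_RECORDS, TOTAL_TASKS, DONE_TASKS, ELAPSED_HOURS)
for name, value in estimate.items():
    print(f"{name}: {value:.1f}")
```

Under those assumptions, each task holds about 5 records, one task completes roughly every 6 minutes of wall-clock time, and the full job would need about 192 hours. That is far slower than per-record text processing would normally suggest, so it may be worth checking how many tasks actually run concurrently and how much fixed cost each task pays.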

Answer 1 · 2018-09-25T13:21:45.000Z

I'm experiencing the same extreme slowness when performing a benchmark against NLTK (Vader) and Spark-core (JohnSnow).

For 1 million rows of sentiment analysis:

Spark-Core NLP (JohnSnow 1.6.3) finishes the job in 4 min 30 secs.
NLTK (Vader) NLP finishes the job in 6 min 30 secs.
Stanford-Core NLP never finishes the job; it runs for more than 1 hour.
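For comparison, the timings above can be converted into rows per second. This is only arithmetic on the reported numbers; the labels are copied from the list above, and the 1-hour figure gives an upper bound for the run that did not finish.

```python
# Convert the reported timings for 1 million rows into throughput.
ROWS = 1_000_000


def rows_per_second(rows, minutes, seconds=0):
    """Average throughput for a job that took minutes:seconds."""
    return rows / (minutes * 60 + seconds)


spark_nlp_rate = rows_per_second(ROWS, 4, 30)  # Spark-Core NLP (JohnSnow 1.6.3)
nltk_rate = rows_per_second(ROWS, 6, 30)       # NLTK (Vader)
# The Stanford-Core NLP run did not finish within 1 hour, so its
# throughput was at most this value (an upper bound, not a measurement).
stanford_upper_bound = rows_per_second(ROWS, 60)

print(f"Spark-Core NLP (JohnSnow): {spark_nlp_rate:.0f} rows/s")
print(f"NLTK (Vader): {nltk_rate:.0f} rows/s")
print(f"Stanford-Core NLP: under {stanford_upper_bound:.0f} rows/s")
```

That works out to roughly 3,700 rows/s and 2,560 rows/s for the two runs that finished, and under 280 rows/s for the one that did not.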