aoldoni/tetre

Difference in dependency parsing

Opened this issue · 6 comments

The sentence "A two-tier scheme (Pang and Lee, 2004) where sentences are first classified as subjective versus objective , and then applying the sentiment classifier on only the subjective sentences further improves performance." gives a different tree locally than in displaCy.

Logged the issue on the spaCy GitHub: explosion/spaCy#480

Hi Alisson,

I am away on a trip and will return on 5 Oct. So for this week, I cannot
fix my schedule till Tue. In the worst case, let's communicate by email.

Wei
On 25 Sep 2016 16:35, "Alisson Oldoni" notifications@github.com wrote:

Logged issue in Spacy Github explosion/spaCy#480
explosion/spaCy#480



One thing we could try here: Dr. @DBWangUNSW, could you please parse a sentence locally and check whether your tree is the same as the one in displaCy? That would tell us whether this is an issue with my local computer.

E.g.: this sentence: "A two-tier scheme (Pang and Lee, 2004) where sentences are first classified as subjective versus objective , and then applying the sentiment classifier on only the subjective sentences further improves performance."
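
To make the comparison less error-prone than eyeballing two trees, here is a minimal sketch of how the two outputs could be diffed token by token. It assumes each parse is written down as a list of (token, head index, label) arcs. The helper names `spacy_arcs` and `diff_arcs` are hypothetical, the model name `en_core_web_sm` is an assumption about the local install, and the displaCy side would have to be transcribed by hand.

```python
from typing import List, Tuple

# One arc per token: (token text, index of head token, dependency label).
Arc = Tuple[str, int, str]


def spacy_arcs(sentence: str, model: str = "en_core_web_sm") -> List[Arc]:
    """Parse with a locally installed spaCy model (the model name is an assumption)."""
    import spacy  # imported lazily so diff_arcs works without spaCy installed

    nlp = spacy.load(model)
    return [(tok.text, tok.head.i, tok.dep_) for tok in nlp(sentence)]


def diff_arcs(local: List[Arc], remote: List[Arc]) -> List[Tuple[int, Arc, Arc]]:
    """Return (position, local arc, remote arc) for every token whose arc differs."""
    if [a[0] for a in local] != [a[0] for a in remote]:
        raise ValueError("tokenization differs; arcs cannot be compared by position")
    return [(i, l, r) for i, (l, r) in enumerate(zip(local, remote)) if l != r]


if __name__ == "__main__":
    # Hand-written toy arcs for illustration only, not real parser output.
    local = [("IPL", 1, "nsubj"), ("improves", 1, "ROOT"), ("performance", 1, "dobj")]
    remote = [("IPL", 1, "nsubj"), ("improves", 1, "ROOT"), ("performance", 0, "dobj")]
    for pos, mine, theirs in diff_arcs(local, remote):
        print(pos, mine, theirs)
```

If `diff_arcs` returns an empty list for the same sentence on two machines, the difference is more likely in the displaCy model or version than in the local setup.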

Thanks.

Related to #21: moving to the Stanford parser, for example, could avoid this problem entirely, at the cost of being slower.

After some investigation, it turns out to be quite challenging to integrate the Stanford parser properly in Python, but it is something I can continue to work on.

Other sentences parsed incorrectly:

  • "As remarked earlier, an automated reasoning facility effectively reduces manual effort and improves match quality and efficiency."
  • "A two-tier scheme (Pang and Lee, 2004) where sentences are first classified as subjective versus objective , and then applying the sentiment classifier on only the subjective sentences further improves performance."
  • "Consequently, IPL improves read performance by reducing the number of log pages to read from flash memory when recreating a logical page because log pages do not increase indefinitely (i.e., is bound) due to merging."
  • "It has been shown that using logic-oriented categorical features not only significantly improves the efficiency of sequences matching but also achieves high accuracy in contentbased retrieval of human-motion data."
  • "A subsequent work is PWJoin [7] that improves the performance of PJoin utilizing both the window semantics and punctuation semantics."
  • "(2008) report that just filtering the phrase table by the socalled well-formed target dependency structure does not help, yet adding a target dependency language model improves performance significantly."