Difference in dependency parsing
The sentence "A two-tier scheme (Pang and Lee, 2004) where sentences are first classified as subjective versus objective , and then applying the sentiment classifier on only the subjective sentences further improves performance." produces a different dependency tree locally than in displaCy.
Logged this as an issue on the spaCy GitHub: explosion/spaCy#480
Hi Alisson,
I am away on a trip and will return on 5 Oct. So for this week, I cannot
fix my schedule till Tue. In the worst case, let's communicate by email.
Wei
On 25 Sep 2016 16:35, "Alisson Oldoni" notifications@github.com wrote:
Logged issue in Spacy Github explosion/spaCy#480
One thing we could try: could Dr. @DBWangUNSW please parse a sentence locally and check whether the result matches displaCy? That would tell us whether the issue is specific to my local machine.
For example, this sentence: "A two-tier scheme (Pang and Lee, 2004) where sentences are first classified as subjective versus objective , and then applying the sentiment classifier on only the subjective sentences further improves performance."
Thanks.
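To make the comparison concrete, here is a minimal sketch of one way to line up two parses of the same sentence. The helper names `to_triples` and `diff_parses` are hypothetical, not part of spaCy; the only assumption about the parser is that each token exposes `.i`, `.text`, `.dep_` and `.head.i`, as spaCy tokens do.

```python
from typing import Iterable, List, Tuple

# One row per token: (token index, text, dependency label, head index).
Triple = Tuple[int, str, str, int]


def to_triples(doc: Iterable) -> List[Triple]:
    """Flatten a parsed document into comparable rows.

    Assumes each token has `.i`, `.text`, `.dep_` and `.head.i`
    (true for spaCy tokens; hypothetical for any other parser).
    """
    return [(tok.i, tok.text, tok.dep_, tok.head.i) for tok in doc]


def diff_parses(local: List[Triple],
                reference: List[Triple]) -> List[Tuple[Triple, Triple]]:
    """Return the pairs of rows on which two parses of one sentence disagree."""
    if len(local) != len(reference):
        # Different token counts mean the tokenizers disagree, so a
        # row-by-row comparison of arcs would be misleading.
        raise ValueError("token counts differ; the parses are not comparable")
    return [(a, b) for a, b in zip(local, reference) if a != b]
```

With spaCy installed, the local side would be `to_triples(nlp(sentence))`, where `nlp` is whatever model `spacy.load(...)` returns on that machine. The displaCy side would have to be transcribed into the same row format by hand. An empty result from `diff_parses` on two machines would suggest the discrepancy is not specific to one local setup.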
Related to #21, since moving to the Stanford parser could avoid this problem entirely, at the cost of being slower.
After some investigation, integrating the Stanford parser properly in Python turns out to be quite challenging, but it is something I can continue to work on.
Other sentences parsed incorrectly:
- "As remarked earlier, an automated reasoning facility effectively reduces manual effort and improves match quality and efficiency."
- "A two-tier scheme (Pang and Lee, 2004) where sentences are first classified as subjective versus objective , and then applying the sentiment classifier on only the subjective sentences further improves performance."
- "Consequently, IPL improves read performance by reducing the number of log pages to read from flash memory when recreating a logical page because log pages do not increase indefinitely (i.e., is bound) due to merging."
- "It has been shown that using logic-oriented categorical features not only significantly improves the efficiency of sequences matching but also achieves high accuracy in contentbased retrieval of human-motion data."
- "A subsequent work is PWJoin [7] that improves the performance of PJoin utilizing both the window semantics and punctuation semantics."
- "(2008) report that just filtering the phrase table by the socalled well-formed target dependency structure does not help, yet adding a target dependency language model improves performance significantly."