Error while looking for alignments
Hello Damien,
I'm doing the evaluation using GUI and for some terms an error occurs.
It doesn't explain what went wrong.
The error occurred while processing "résine" (French). The alignments in the screenshot are for a different term. The system didn't find any translations for "résine".
I checked the data and found that the English term "resin" is in the term list:
For some other terms, the system suggests correct translations, so it is not a general issue.
Thank you in advance!
Sincerely yours,
Yuliya
Hi Yuliya,
You should find the detailed stack trace in the logs. The termsuite-*.log files are in your TermSuite install dir, but most of the time the actual stack trace is only found in another log file, located at workspace/.metadata/.log in your TermSuite install (I don't really understand why).
Don't be scared if you see many stack traces in that last log file. Most of them are Eclipse RCP silent bugs, i.e. not TermSuite bugs.
Can you tell me if you find the error stack trace in these files?
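Since that second log mixes TermSuite errors with unrelated Eclipse RCP ones, a small filter can help. The following is only a minimal sketch: the class and method names are hypothetical, and it assumes the usual Eclipse log layout, where each entry starts with a line of the form `!ENTRY <plugin id> ...` and TermSuite entries carry a plugin id starting with `fr.univnantes.termsuite`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of TermSuite.
public class TermSuiteLogFilter {

    /**
     * Returns the log entries whose "!ENTRY" header names a plugin id
     * starting with pluginPrefix. Each entry is returned as one string,
     * header line first, followed by its message and stack trace lines.
     */
    public static List<String> entriesFor(List<String> lines, String pluginPrefix) {
        List<String> entries = new ArrayList<>();
        StringBuilder current = null; // non-null only while inside a matching entry
        for (String line : lines) {
            boolean isEntry = line.startsWith("!ENTRY ");
            boolean isSession = line.startsWith("!SESSION");
            if (isEntry || isSession) {
                // A new header closes whatever entry was being collected.
                if (current != null) {
                    entries.add(current.toString());
                }
                current = null;
                if (isEntry) {
                    String[] fields = line.split("\\s+");
                    if (fields.length > 1 && fields[1].startsWith(pluginPrefix)) {
                        current = new StringBuilder(line);
                    }
                }
            } else if (current != null) {
                current.append('\n').append(line);
            }
        }
        if (current != null) {
            entries.add(current.toString());
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // Default path is relative to the TermSuite install dir (assumption:
        // the program is started from there and the log is UTF-8 readable).
        Path log = Paths.get(args.length > 0 ? args[0] : "workspace/.metadata/.log");
        List<String> lines = Files.readAllLines(log, StandardCharsets.UTF_8);
        for (String entry : entriesFor(lines, "fr.univnantes.termsuite")) {
            System.out.println(entry);
            System.out.println();
        }
    }
}
```

Run it from the install dir, or pass the path to the log file as the first argument; only the entries logged by TermSuite plugins are printed.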
Thx
Aha, I wouldn't have guessed to look there :)
You are right, it's a Java error:
!ENTRY fr.univnantes.termsuite.ui 4 0 2016-11-23 11:18:08.196
!MESSAGE An error occurred during alignment
!STACK 0
java.lang.NullPointerException
at eu.project.ttc.utils.AlignerUtils.translateVector(AlignerUtils.java:75)
at eu.project.ttc.engines.BilingualAligner.translateWithDico(BilingualAligner.java:326)
at eu.project.ttc.engines.BilingualAligner.alignDistributional(BilingualAligner.java:150)
at eu.project.ttc.engines.BilingualAligner.align(BilingualAligner.java:219)
at fr.univnantes.termsuite.ui.services.impl.AlignmentServiceImpl.align(AlignmentServiceImpl.java:213)
at fr.univnantes.termsuite.ui.menu.AlignHandler$1.run(AlignHandler.java:71)
at org.eclipse.core.internal.jobs.Worker.run(Worker.java:55)
OK, that is a bug. Thank you for reporting. Is it a blocking issue for you?
I'll try to fix it soon anyway.
Damien
While comparing my test pipeline and the current one, I found only one difference: in the current (buggy) one, I enabled MWT in contexts. Can it cause the problem? Moreover, it seems that I used "SWT only" and "allow MWT" together :/
Yes, it might, but I have to investigate further and try to reproduce the error before I can say for sure.
Regarding the contextualizer options, some of them will be deprecated because they are error-prone and not efficient at all. Actually, everything regarding MWTs has been tested and proved not relevant for contexts. Please configure your contextualizer this way:
- Do not allow MWT in contexts,
- Do not compute context for MWT, i.e. compute contexts for SWT only.
The problem occurred in 26 cases out of 50, so it was quite an important issue. I fixed it in two steps:
- Modified the pipeline so that only one of the "SWT only" and "allow MWT" options is selected
- Ran on one corpus at a time (previously, I had launched two corpora together)
Hope it was just a user behaviour bug :) I'm not really sure whether it was the fact of running two corpora together or combining two options.
Ooops, sorry. I closed the issue. You might still want to fix it.
I am pretty sure that the MWT configs were the issue.
In principle, running multiple corpora together should not affect anything.
I tried to have a look at it, but I could not find a significant instruction at line 75 in AlignerUtils (cf. your stack trace above) in termsuite-core-2.3.3.jar.
Could you please give me the exact version of TermSuite you are using?
Thanks!
Hello! This is the info from the GUI:
Current version: fr.univnantes.termsuite.ui_2.3.1.201610061343 [123]
Fixed in TermSuite 3.0.
http://termsuite.github.io/#gui
Bilingual alignment has been improved and now supports compound terms, multi-word terms of size > 2, and neoclassical terms.
Best,
Damien