UCDenver-ccp/CRAFT

Exception during conversion to coreference chains format


First of all, thanks for this great contribution to the biomedical community.

After setting up the boot environment, I am unable to convert the corpus to the coreference chains format. I get the exception below when running the following command. Please help me figure out where I am going wrong.

boot coreference convert --conll-coref-ident -o coref_data_craft/

converting (:coreference) annotations to :conll-coref-ident ...
output directory: /home/ashim/CRAFT/coref_data_craft
                                                                  java.lang.Thread.run                        Thread.java:  748
                                    java.util.concurrent.ThreadPoolExecutor$Worker.run            ThreadPoolExecutor.java:  617
                                     java.util.concurrent.ThreadPoolExecutor.runWorker            ThreadPoolExecutor.java: 1142
                                                   java.util.concurrent.FutureTask.run                    FutureTask.java:  266
                                                                                   ...                                         
                                                   clojure.core/binding-conveyor-fn/fn                           core.clj: 1938
                                                                     boot.core/boot/fn                           core.clj: 1031
                                                                   boot.core/run-tasks                           core.clj: 1021
                                          boot.user$eval61$fn__62$fn__67$fn__68.invoke                                   :   73
                                      boot.user$eval325$fn__326$fn__331$fn__332.invoke                                   :  305
                                                                    clojure.core/doall                           core.clj: 3039
                                                                    clojure.core/dorun                           core.clj: 3024
                                                                      clojure.core/seq                           core.clj:  137
                                                                                   ...                                         
                                                                   clojure.core/map/fn                           core.clj: 2646
                              boot.user$eval325$fn__326$fn__331$fn__332$fn__333.invoke                                   :  317
                          edu.ucdenver.ccp.file.conversion.FileFormatConverter.convert           FileFormatConverter.java:  112
                          edu.ucdenver.ccp.file.conversion.FileFormatConverter.convert           FileFormatConverter.java:   77
                             edu.ucdenver.ccp.file.conversion.DocumentWriter.serialize                DocumentWriter.java:   47
edu.ucdenver.ccp.file.conversion.conllcoref2012.CoNLLCoref2012DocumentWriter.serialize  CoNLLCoref2012DocumentWriter.java:  147
          edu.ucdenver.ccp.file.conversion.conllu.CoNLLUDocumentWriter.generateRecords          CoNLLUDocumentWriter.java:  104
    edu.ucdenver.ccp.file.conversion.conllu.CoNLLUDocumentWriter.groupTokensBySentence          CoNLLUDocumentWriter.java:  179
java.lang.IllegalArgumentException: Cannot group tokens by sentence without any sentence annotations.
        clojure.lang.ExceptionInfo: Cannot group tokens by sentence without any sentence annotations.
    line: 515

Thanks in advance,
Ashim

Because the CoNLL Coref format requires tokens and sentences, the conversion process also needs token and sentence annotations as part of its input. The part-of-speech annotations contain both token and sentence boundaries, so adding part-of-speech to your command should fix things. Also, the output directory should be an absolute path. Try the following (replacing /path/to/ with the appropriate path on your system):
boot part-of-speech coreference convert --conll-coref-ident -o /path/to/coref_data_craft/
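
If you would rather not type the full path by hand, something like the following should work as a rough sketch. It assumes you run it from the directory where you want the output folder to live (for example, your CRAFT checkout); OUT_DIR is just an illustrative variable name, not something the build defines. The directory is created first only so that its absolute path can be resolved.

```shell
# Sketch only: resolve the output directory to an absolute path.
# OUT_DIR is a hypothetical variable name used for illustration.
mkdir -p coref_data_craft
OUT_DIR="$(cd coref_data_craft && pwd)"

# Print the resolved path so it can be checked before converting.
echo "output directory: $OUT_DIR"

# Then pass the absolute path to the conversion command:
# boot part-of-speech coreference convert --conll-coref-ident -o "$OUT_DIR/"
```

The boot line is left commented out so the path can be verified first; uncomment it once the printed directory looks right.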

Let me know if you have further issues.

Best,
Bill

Thanks a lot. It works now.