CogComp/cogcomp-nlp

Problem Running NER/Downloading Gazetteer

ShuhengL opened this issue · 0 comments

Hi! I am trying to use the runNER.sh script to annotate data, and the runBenchmark.sh script in the downloaded version of illinois-ner to evaluate the model on the CoNLL-2003 test set. Both scripts fail, apparently because the gazetteer cannot be downloaded. The error log from running runNER.sh is pasted below:

log4j:WARN No appenders could be found for logger (edu.illinois.cs.cogcomp.ner.LbjTagger.Parameters).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Downloading the folder from datastore . . .
GroupId: readonly.org.cogcomp.gazetteers
ArtifactId: 1.6/gazetteers.zip
augmentedGroupId: readonly.org.cogcomp.gazetteers
versionedFileName: 1.6/gazetteers.zip
zippedFileName: ?/.cogcomp-datastore/tmp/1.6/gazetteers.zip
java.net.SocketTimeoutException: connect timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:476)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:218)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:200)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:394)
    at java.net.Socket.connect(Socket.java:606)
    at com.squareup.okhttp.internal.Platform.connectSocket(Platform.java:101)
    at com.squareup.okhttp.internal.io.RealConnection.connectSocket(RealConnection.java:137)
    at com.squareup.okhttp.internal.io.RealConnection.connect(RealConnection.java:108)
    at com.squareup.okhttp.internal.http.StreamAllocation.findConnection(StreamAllocation.java:184)
    at com.squareup.okhttp.internal.http.StreamAllocation.findHealthyConnection(StreamAllocation.java:126)
    at com.squareup.okhttp.internal.http.StreamAllocation.newStream(StreamAllocation.java:95)
    at com.squareup.okhttp.internal.http.HttpEngine.connect(HttpEngine.java:281)
    at com.squareup.okhttp.internal.http.HttpEngine.sendRequest(HttpEngine.java:224)
    at com.squareup.okhttp.Call.getResponse(Call.java:286)
    at com.squareup.okhttp.Call$ApplicationInterceptorChain.proceed(Call.java:243)
    at com.squareup.okhttp.Call.getResponseWithInterceptorChain(Call.java:205)
    at com.squareup.okhttp.Call.execute(Call.java:80)
    at io.minio.MinioClient.execute(MinioClient.java:826)
    at io.minio.MinioClient.executeHead(MinioClient.java:1018)
    at io.minio.MinioClient.statObject(MinioClient.java:1154)
    at io.minio.MinioClient.getObject(MinioClient.java:1343)
    at org.cogcomp.Datastore.getDirectory(Datastore.java:556)
    at edu.illinois.cs.cogcomp.ner.ExpressiveFeatures.TreeGazetteers.init(TreeGazetteers.java:71)
    at edu.illinois.cs.cogcomp.ner.ExpressiveFeatures.TreeGazetteers.<init>(TreeGazetteers.java:50)
    at edu.illinois.cs.cogcomp.ner.ExpressiveFeatures.GazetteersFactory.get(GazetteersFactory.java:50)
    at edu.illinois.cs.cogcomp.ner.LbjTagger.Parameters.readAndLoadConfig(Parameters.java:265)
    at edu.illinois.cs.cogcomp.ner.LbjTagger.Parameters.readConfigAndLoadExternalData(Parameters.java:56)
    at edu.illinois.cs.cogcomp.ner.NERAnnotator.initialize(NERAnnotator.java:119)
    at edu.illinois.cs.cogcomp.annotation.Annotator.doInitialize(Annotator.java:126)
    at edu.illinois.cs.cogcomp.annotation.Annotator.lazyAddView(Annotator.java:201)
    at edu.illinois.cs.cogcomp.annotation.Annotator.getView(Annotator.java:167)
    at edu.illinois.cs.cogcomp.ner.Main.processInputFile(Main.java:544)
    at edu.illinois.cs.cogcomp.ner.Main.execute(Main.java:392)
    at edu.illinois.cs.cogcomp.ner.Main.processCommand(Main.java:168)
    at edu.illinois.cs.cogcomp.ner.AbstractMain.run(AbstractMain.java:97)
java.io.FileNotFoundException: ?/.cogcomp-datastore/tmp/1.6/gazetteers.zip (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at java.io.FileInputStream.<init>(FileInputStream.java:93)
    at org.cogcomp.ZipHelper.unZipIt(ZipHelper.java:71)
    at org.cogcomp.Datastore.getDirectory(Datastore.java:585)
    at edu.illinois.cs.cogcomp.ner.ExpressiveFeatures.TreeGazetteers.init(TreeGazetteers.java:71)
    at edu.illinois.cs.cogcomp.ner.ExpressiveFeatures.TreeGazetteers.<init>(TreeGazetteers.java:50)
    at edu.illinois.cs.cogcomp.ner.ExpressiveFeatures.GazetteersFactory.get(GazetteersFactory.java:50)
    at edu.illinois.cs.cogcomp.ner.LbjTagger.Parameters.readAndLoadConfig(Parameters.java:265)
    at edu.illinois.cs.cogcomp.ner.LbjTagger.Parameters.readConfigAndLoadExternalData(Parameters.java:56)
    at edu.illinois.cs.cogcomp.ner.NERAnnotator.initialize(NERAnnotator.java:119)
    at edu.illinois.cs.cogcomp.annotation.Annotator.doInitialize(Annotator.java:126)
    at edu.illinois.cs.cogcomp.annotation.Annotator.lazyAddView(Annotator.java:201)
    at edu.illinois.cs.cogcomp.annotation.Annotator.getView(Annotator.java:167)
    at edu.illinois.cs.cogcomp.ner.Main.processInputFile(Main.java:544)
    at edu.illinois.cs.cogcomp.ner.Main.execute(Main.java:392)
    at edu.illinois.cs.cogcomp.ner.Main.processCommand(Main.java:168)
    at edu.illinois.cs.cogcomp.ner.AbstractMain.run(AbstractMain.java:97)
zippedFileName: ?/.cogcomp-datastore/tmp/1.6/gazetteers.zip
path: ?/.cogcomp-datastore/readonly.org.cogcomp.gazetteers/1.6/gazetteers
artifactId: gazetteers
java.io.FileNotFoundException: ?/.cogcomp-datastore/readonly.org.cogcomp.gazetteers/1.6/gazetteers/gazetteers/gazetteers-list.txt (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at java.io.FileInputStream.<init>(FileInputStream.java:93)
    at edu.illinois.cs.cogcomp.ner.ExpressiveFeatures.TreeGazetteers.init(TreeGazetteers.java:72)
    at edu.illinois.cs.cogcomp.ner.ExpressiveFeatures.TreeGazetteers.<init>(TreeGazetteers.java:50)
    at edu.illinois.cs.cogcomp.ner.ExpressiveFeatures.GazetteersFactory.get(GazetteersFactory.java:50)
    at edu.illinois.cs.cogcomp.ner.LbjTagger.Parameters.readAndLoadConfig(Parameters.java:265)
    at edu.illinois.cs.cogcomp.ner.LbjTagger.Parameters.readConfigAndLoadExternalData(Parameters.java:56)
    at edu.illinois.cs.cogcomp.ner.NERAnnotator.initialize(NERAnnotator.java:119)
    at edu.illinois.cs.cogcomp.annotation.Annotator.doInitialize(Annotator.java:126)
    at edu.illinois.cs.cogcomp.annotation.Annotator.lazyAddView(Annotator.java:201)
    at edu.illinois.cs.cogcomp.annotation.Annotator.getView(Annotator.java:167)
    at edu.illinois.cs.cogcomp.ner.Main.processInputFile(Main.java:544)
    at edu.illinois.cs.cogcomp.ner.Main.execute(Main.java:392)
    at edu.illinois.cs.cogcomp.ner.Main.processCommand(Main.java:168)
    at edu.illinois.cs.cogcomp.ner.AbstractMain.run(AbstractMain.java:97)
Downloading the folder from datastore . . .
GroupId: readonly.edu.illinois.cs.cogcomp.ner
ArtifactId: 4.0/ner-model-enron-conll-all-data.zip
augmentedGroupId: readonly.edu.illinois.cs.cogcomp.ner
versionedFileName: 4.0/ner-model-enron-conll-all-data.zip
zippedFileName: ?/.cogcomp-datastore/tmp/4.0/ner-model-enron-conll-all-data.zip
java.net.SocketTimeoutException: connect timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:476)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:218)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:200)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:394)
    at java.net.Socket.connect(Socket.java:606)
    at com.squareup.okhttp.internal.Platform.connectSocket(Platform.java:101)
    at com.squareup.okhttp.internal.io.RealConnection.connectSocket(RealConnection.java:137)
    at com.squareup.okhttp.internal.io.RealConnection.connect(RealConnection.java:108)
    at com.squareup.okhttp.internal.http.StreamAllocation.findConnection(StreamAllocation.java:184)
    at com.squareup.okhttp.internal.http.StreamAllocation.findHealthyConnection(StreamAllocation.java:126)
    at com.squareup.okhttp.internal.http.StreamAllocation.newStream(StreamAllocation.java:95)
    at com.squareup.okhttp.internal.http.HttpEngine.connect(HttpEngine.java:281)
    at com.squareup.okhttp.internal.http.HttpEngine.sendRequest(HttpEngine.java:224)
    at com.squareup.okhttp.Call.getResponse(Call.java:286)
    at com.squareup.okhttp.Call$ApplicationInterceptorChain.proceed(Call.java:243)
    at com.squareup.okhttp.Call.getResponseWithInterceptorChain(Call.java:205)
    at com.squareup.okhttp.Call.execute(Call.java:80)
    at io.minio.MinioClient.execute(MinioClient.java:826)
    at io.minio.MinioClient.executeHead(MinioClient.java:1018)
    at io.minio.MinioClient.statObject(MinioClient.java:1154)
    at io.minio.MinioClient.getObject(MinioClient.java:1343)
    at org.cogcomp.Datastore.getDirectory(Datastore.java:556)
    at edu.illinois.cs.cogcomp.ner.ModelLoader.load(ModelLoader.java:104)
    at edu.illinois.cs.cogcomp.ner.NERAnnotator.initialize(NERAnnotator.java:123)
    at edu.illinois.cs.cogcomp.annotation.Annotator.doInitialize(Annotator.java:126)
    at edu.illinois.cs.cogcomp.annotation.Annotator.lazyAddView(Annotator.java:201)
    at edu.illinois.cs.cogcomp.annotation.Annotator.getView(Annotator.java:167)
    at edu.illinois.cs.cogcomp.ner.Main.processInputFile(Main.java:544)
    at edu.illinois.cs.cogcomp.ner.Main.execute(Main.java:392)
    at edu.illinois.cs.cogcomp.ner.Main.processCommand(Main.java:168)
    at edu.illinois.cs.cogcomp.ner.AbstractMain.run(AbstractMain.java:97)
java.io.FileNotFoundException: ?/.cogcomp-datastore/tmp/4.0/ner-model-enron-conll-all-data.zip (No such file or directory)
    at java.io.FileInputStream.open0(Native Method)
    at java.io.FileInputStream.open(FileInputStream.java:195)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at java.io.FileInputStream.<init>(FileInputStream.java:93)
    at org.cogcomp.ZipHelper.unZipIt(ZipHelper.java:71)
    at org.cogcomp.Datastore.getDirectory(Datastore.java:585)
    at edu.illinois.cs.cogcomp.ner.ModelLoader.load(ModelLoader.java:104)
    at edu.illinois.cs.cogcomp.ner.NERAnnotator.initialize(NERAnnotator.java:123)
    at edu.illinois.cs.cogcomp.annotation.Annotator.doInitialize(Annotator.java:126)
    at edu.illinois.cs.cogcomp.annotation.Annotator.lazyAddView(Annotator.java:201)
    at edu.illinois.cs.cogcomp.annotation.Annotator.getView(Annotator.java:167)
    at edu.illinois.cs.cogcomp.ner.Main.processInputFile(Main.java:544)
    at edu.illinois.cs.cogcomp.ner.Main.execute(Main.java:392)
    at edu.illinois.cs.cogcomp.ner.Main.processCommand(Main.java:168)
    at edu.illinois.cs.cogcomp.ner.AbstractMain.run(AbstractMain.java:97)
zippedFileName: ?/.cogcomp-datastore/tmp/4.0/ner-model-enron-conll-all-data.zip
path: ?/.cogcomp-datastore/readonly.edu.illinois.cs.cogcomp.ner/4.0/ner-model-enron-conll-all-data
artifactId: ner-model-enron-conll-all-data
java.lang.IllegalArgumentException: View NER_CONLL not found
    at edu.illinois.cs.cogcomp.core.datastructures.textannotation.AbstractTextAnnotation.getView(AbstractTextAnnotation.java:134)
    at edu.illinois.cs.cogcomp.annotation.Annotator.getView(Annotator.java:168)
    at edu.illinois.cs.cogcomp.ner.Main.processInputFile(Main.java:544)
    at edu.illinois.cs.cogcomp.ner.Main.execute(Main.java:392)
    at edu.illinois.cs.cogcomp.ner.Main.processCommand(Main.java:168)
    at edu.illinois.cs.cogcomp.ner.AbstractMain.run(AbstractMain.java:97)
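To tell whether anything was cached locally, the two paths named in the log can be checked directly. This is only a minimal sketch: it assumes the datastore cache sits under `$HOME/.cogcomp-datastore` (the log prints `?` where the home directory would be, so that prefix is a guess), and `check_cache` is a hypothetical helper name, not part of illinois-ner.

```shell
# Hypothetical helper: report whether the cache entries named in the log exist.
# $1 is the cache root; the relative paths are copied from the log output.
check_cache() {
    cache_root="$1"
    for rel in \
        "readonly.org.cogcomp.gazetteers/1.6/gazetteers/gazetteers/gazetteers-list.txt" \
        "readonly.edu.illinois.cs.cogcomp.ner/4.0/ner-model-enron-conll-all-data"
    do
        if [ -e "$cache_root/$rel" ]; then
            printf 'present: %s\n' "$rel"
        else
            printf 'missing: %s\n' "$rel"
        fi
    done
}

# Assumption: the cache root is under the current user's home directory.
check_cache "$HOME/.cogcomp-datastore"
```

If both entries are reported as missing, nothing was downloaded, which would match the connect timeouts in the log.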

I have already compiled the code as described in the README, and the build succeeded. How can this issue be resolved? Could it be caused by an expired link somewhere? Thank you!