stanfordnlp/CoreNLP

Error with CoreNLP: WARNING: Could not create JAXB context using the current threads context classloader. Defaulting to ObjectFactory classloader. Exception in thread "main" edu.stanford.nlp.util.ReflectionLoading$ReflectionLoadingException: Error creating edu.stanford.nlp.time.TimeExpressionExtractorImpl

JaimeC98 opened this issue · 11 comments

Hi,
I got this error when I tried to run CoreNLP with an extractor of cybersecurity entities:
[screenshot]

Does anyone know how to fix it? Thanks.

It is Java 8. This is the error:

Registering annotator cyberentity with class gov.ornl.stucco.entity.CyberEntityAnnotator
Registering annotator cyberheuristics with class gov.ornl.stucco.entity.heuristics.CyberHeuristicAnnotator
Adding annotator tokenize
TokenizerAnnotator: No tokenizer type provided. Defaulting to PTBTokenizer.
Adding annotator ssplit
edu.stanford.nlp.pipeline.AnnotatorImplementations:
Adding annotator pos
Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [0,7 sec].
Adding annotator cyberheuristics
Loading sw_products list from 'dictionaries/software_info.json'
Loading sw_vendors list from 'dictionaries/software_developers.json'
Loading sw_products (os) list from 'dictionaries/operating_systems.json'
Loading vuln_description list from 'dictionaries/relevant_terms.txt'
Token-to-Label map loaded from 'dictionaries/token_label_map.ser'
Loading regular expresions ...
Adding annotator cyberentity
Loading model from 'models/ORNL-perceptron.bin'
Adding annotator lemma
Adding annotator ner
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [1,8 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [1,2 sec].
Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [1,4 sec].
sutime.binder.1.
Initializing JollyDayHoliday for sutime with classpath:edu/stanford/nlp/models/sutime/jollyday/Holidays_sutime.xml
mar. 27, 2023 9:37:14 A. M. de.jollyday.util.XMLUtil unmarshallConfiguration
WARNING: Could not create JAXB context using the current threads context classloader. Defaulting to ObjectFactory classloader.
Exception in thread "main" edu.stanford.nlp.util.ReflectionLoading$ReflectionLoadingException: Error creating edu.stanford.nlp.time.TimeExpressionExtractorImpl
at edu.stanford.nlp.util.ReflectionLoading.loadByReflection(ReflectionLoading.java:40)
at edu.stanford.nlp.time.TimeExpressionExtractorFactory.create(TimeExpressionExtractorFactory.java:57)
at edu.stanford.nlp.time.TimeExpressionExtractorFactory.createExtractor(TimeExpressionExtractorFactory.java:38)
at edu.stanford.nlp.ie.regexp.NumberSequenceClassifier.<init>(NumberSequenceClassifier.java:79)
at edu.stanford.nlp.ie.NERClassifierCombiner.<init>(NERClassifierCombiner.java:68)
at edu.stanford.nlp.pipeline.AnnotatorImplementations.ner(AnnotatorImplementations.java:99)
at edu.stanford.nlp.pipeline.StanfordCoreNLP$6.create(StanfordCoreNLP.java:627)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:85)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:292)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:129)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:125)
at gov.ornl.stucco.entity.EntityLabeler.<init>(EntityLabeler.java:26)
at gov.ornl.stucco.prueba.Prueba.main(Prueba.java:19)
Caused by: edu.stanford.nlp.util.MetaClass$ClassCreationException: MetaClass couldn't create public edu.stanford.nlp.time.TimeExpressionExtractorImpl(java.lang.String,java.util.Properties) with args [sutime, {customAnnotatorClass.cyberentity=gov.ornl.stucco.entity.CyberEntityAnnotator, annotators=tokenize, ssplit, pos, cyberheuristics, cyberentity, lemma, ner, parse, customAnnotatorClass.cyberheuristics=gov.ornl.stucco.entity.heuristics.CyberHeuristicAnnotator}]
at edu.stanford.nlp.util.MetaClass$ClassFactory.createInstance(MetaClass.java:233)
at edu.stanford.nlp.util.MetaClass.createInstance(MetaClass.java:378)
at edu.stanford.nlp.util.ReflectionLoading.loadByReflection(ReflectionLoading.java:38)
... 12 more
Caused by: java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:64)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:481)
at edu.stanford.nlp.util.MetaClass$ClassFactory.createInstance(MetaClass.java:229)
... 14 more
Caused by: java.lang.RuntimeException: Error initializing binder 1
at edu.stanford.nlp.time.Options.<init>(Options.java:92)
at edu.stanford.nlp.time.TimeExpressionExtractorImpl.init(TimeExpressionExtractorImpl.java:45)
at edu.stanford.nlp.time.TimeExpressionExtractorImpl.<init>(TimeExpressionExtractorImpl.java:39)
... 20 more
Caused by: java.lang.IllegalStateException: Cannot instantiate configuration.
at de.jollyday.impl.XMLManager.init(XMLManager.java:286)
at de.jollyday.HolidayManager.createManager(HolidayManager.java:278)
at de.jollyday.HolidayManager.getInstance(HolidayManager.java:194)
at edu.stanford.nlp.time.JollyDayHolidays.init(JollyDayHolidays.java:52)
at edu.stanford.nlp.time.Options.<init>(Options.java:90)
... 22 more
Caused by: java.lang.IllegalStateException: Cannot parse holidays XML file.
at de.jollyday.util.XMLUtil.unmarshallConfiguration(XMLUtil.java:80)
at de.jollyday.impl.XMLManager.init(XMLManager.java:284)
... 26 more
Caused by: javax.xml.bind.JAXBException: Provider com.sun.xml.internal.bind.v2.ContextFactory not found with linked exception:
[java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory]
at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:148)
at javax.xml.bind.ContextFinder.find(ContextFinder.java:361)
at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:446)
at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:409)
at de.jollyday.util.XMLUtil$JAXBContextCreator.create(XMLUtil.java:172)
at de.jollyday.util.XMLUtil.unmarshallConfiguration(XMLUtil.java:73)
... 27 more
Caused by: java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:606)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:168)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
at javax.xml.bind.ContextFinder.safeLoadClass(ContextFinder.java:573)
at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:145)
... 32 more
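
The last `Caused by` above is a `ClassNotFoundException` for `com.sun.xml.internal.bind.v2.ContextFactory`, the JAXB implementation that used to ship inside the JDK; the Java EE modules it belonged to were removed in Java 11. The `java.base/` prefix on several frames also suggests the program is running on Java 9 or newer rather than Java 8. A minimal sketch to check both points from the same Eclipse run configuration follows; `JaxbProviderCheck` is a hypothetical helper name, not part of CoreNLP or stucco.

```java
public class JaxbProviderCheck {

    /** Returns true if the named class can be found by this application's class loader. */
    static boolean isLoadable(String className) {
        try {
            // 'false' means: locate the class without running its static initializers.
            Class.forName(className, false, JaxbProviderCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Shows which Java runtime this run configuration really uses.
        System.out.println("java.version = " + System.getProperty("java.version"));

        String[] candidates = {
            "javax.xml.bind.JAXBContext",                  // JAXB API
            "com.sun.xml.internal.bind.v2.ContextFactory", // provider named in the stack trace
            "com.sun.xml.bind.v2.ContextFactory"           // standalone JAXB implementation, if added as a dependency
        };
        for (String name : candidates) {
            System.out.println(name + ": " + (isLoadable(name) ? "found" : "NOT found"));
        }
    }
}
```

If the JAXB API is found but neither provider class is, the run is probably not on Java 8. In that case, switching the run configuration to a Java 8 JRE or adding a standalone JAXB implementation to the project would be the things to try.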

The code I use is the one in the screenshot, from the stucco/entity-extractor project, which uses CoreNLP:

package gov.ornl.stucco.prueba;
import java.util.List;
import edu.stanford.nlp.ling.CoreAnnotations.PartOfSpeechAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.trees.TreeCoreAnnotations.TreeAnnotation;
import edu.stanford.nlp.util.CoreMap;
import gov.ornl.stucco.entity.CyberEntityAnnotator.CyberAnnotation;
import gov.ornl.stucco.entity.EntityLabeler;


public class Prueba {

	public static void main (String args[]) {
		EntityLabeler labeler = new EntityLabeler();
		Annotation doc = labeler.getAnnotatedDoc("My Doc", "Hello everyone, this is Fuseki.");

		List<CoreMap> sentences = doc.get(SentencesAnnotation.class);
		for ( CoreMap sentence : sentences) {
			for ( CoreLabel token : sentence.get(TokensAnnotation.class)) {
				System.out.println(token.get(TextAnnotation.class) + "\t" + token.get(PartOfSpeechAnnotation.class) + "\t" + token.get(CyberAnnotation.class));
			}
			
			System.out.println("Parse Tree:\n" + sentence.get(TreeAnnotation.class));			
		}
	}
}

I don't know if there is something from CoreNLP that I've forgotten to import... If anyone knows what to do...

I've followed these instructions to add a new classpath variable with the CoreNLP dependencies: In Eclipse, go to Window > Preferences > Java > Build Path > Classpath Variables:
[screenshot]

I've also added the CoreNLP folder to the project. I don't know if this is how to add CoreNLP to my classpath in Eclipse.
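
One way to check whether CoreNLP is really on the classpath of the Eclipse run, and which jar it is loaded from, is to ask the JVM for the class's code source. This is only a sketch; `ClassLocator` is a hypothetical helper name, and the only CoreNLP name it relies on is the `edu.stanford.nlp.pipeline.StanfordCoreNLP` class that appears in the stack trace.

```java
import java.security.CodeSource;

public class ClassLocator {

    /** Returns the location a class is loaded from, or a short note if it cannot be determined. */
    static String locate(String className) {
        try {
            // Look the class up without running its static initializers.
            Class<?> c = Class.forName(className, false, ClassLocator.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Core JDK classes usually report no code source.
            return src == null ? "JDK built-in" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        String name = "edu.stanford.nlp.pipeline.StanfordCoreNLP";
        System.out.println(name + " -> " + locate(name));
    }
}
```

The printed location should show the jar file name, which also tells you which CoreNLP version the project actually picks up.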

I've tried to use CoreNLP without the stucco project, from the Windows Command Prompt. This is the command I used:

C:\Users\jaime\OneDrive\Escritorio>java edu.stanford.nlp.pipeline.StanfordCoreNLP -file input.txt
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Searching for resource: StanfordCoreNLP.properties ... found.
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
[main] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words-distsim.tagger ... done [0.7 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner
[main] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [1.1 sec].
[main] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [1.1 sec].
[main] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [0.4 sec].
[main] INFO edu.stanford.nlp.time.JollyDayHolidays - Initializing JollyDayHoliday for SUTime from classpath edu/stanford/nlp/models/sutime/jollyday/Holidays_sutime.xml as sutime.binder.1.
[main] INFO edu.stanford.nlp.time.TimeExpressionExtractorImpl - Using following SUTime rules: edu/stanford/nlp/models/sutime/defs.sutime.txt,edu/stanford/nlp/models/sutime/english.sutime.txt,edu/stanford/nlp/models/sutime/english.holidays.sutime.txt
[main] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 580705 unique entries out of 581864 from edu/stanford/nlp/models/kbp/english/gazetteers/regexner_caseless.tab, 0 TokensRegex patterns.
[main] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 4867 unique entries out of 4867 from edu/stanford/nlp/models/kbp/english/gazetteers/regexner_cased.tab, 0 TokensRegex patterns.
[main] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 585572 unique entries from 2 files
[main] INFO edu.stanford.nlp.pipeline.NERCombinerAnnotator - numeric classifiers: true; SUTime: true [no docDate]; fine grained: true
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator depparse
[main] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Loading depparse model: edu/stanford/nlp/models/parser/nndep/english_UD.gz ... Time elapsed: 1.0 sec
[main] INFO edu.stanford.nlp.parser.nndep.Classifier - PreComputed 20000 vectors, elapsed Time: 0.846 sec
[main] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Initializing dependency parser ... done [1.8 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator coref
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.io.ObjectInputStream$HandleTable.grow(Unknown Source)
at java.io.ObjectInputStream$HandleTable.assign(Unknown Source)
at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at java.util.HashMap.readObject(Unknown Source)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at java.io.ObjectStreamClass.invokeReadObject(Unknown Source)
at java.io.ObjectInputStream.readSerialData(Unknown Source)
at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
at java.io.ObjectInputStream.readSerialData(Unknown Source)
at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
at java.io.ObjectInputStream.readObject0(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at java.io.ObjectInputStream.readObject(Unknown Source)
at edu.stanford.nlp.io.IOUtils.readObjectFromURLOrClasspathOrFileSystem(IOUtils.java:310)
at edu.stanford.nlp.coref.statistical.FeatureExtractor.loadVocabulary(FeatureExtractor.java:90)
at edu.stanford.nlp.coref.statistical.FeatureExtractor.<init>(FeatureExtractor.java:75)
at edu.stanford.nlp.coref.statistical.StatisticalCorefAlgorithm.<init>(StatisticalCorefAlgorithm.java:63)
at edu.stanford.nlp.coref.statistical.StatisticalCorefAlgorithm.<init>(StatisticalCorefAlgorithm.java:44)
at edu.stanford.nlp.coref.CorefAlgorithm.fromProps(CorefAlgorithm.java:30)
at edu.stanford.nlp.coref.CorefSystem.<init>(CorefSystem.java:40)
at edu.stanford.nlp.pipeline.CorefAnnotator.<init>(CorefAnnotator.java:69)
at edu.stanford.nlp.pipeline.AnnotatorImplementations.coref(AnnotatorImplementations.java:218)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$getNamedAnnotators$17(StanfordCoreNLP.java:641)
at edu.stanford.nlp.pipeline.StanfordCoreNLP$$Lambda$27/515132998.apply(Unknown Source)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.lambda$null$33(StanfordCoreNLP.java:711)

My classpath is:

C:\Users\jaime\OneDrive\Escritorio>set CLASSPATH
CLASSPATH=C:\Users\jaime\OneDrive\Escritorio\Universidad\Segundo Curso\TFM\Programas\stanford-corenlp-4.5.3/*

It is a problem with memory and CPU. When I run the command, CoreNLP drives my CPU to 100%. I only tested it with a TXT file containing two words. Is this normal?

And how do I do that? Thanks.
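
Assuming the question is about giving the JVM more memory: the `OutOfMemoryError: Java heap space` above means the default heap was too small to load the coref model, and the usual remedy is the JVM's `-Xmx` option, for example `java -Xmx4g edu.stanford.nlp.pipeline.StanfordCoreNLP -file input.txt` (the `4g` value is an assumption, not a measured requirement). A small sketch to see how much heap the JVM is allowed to use follows; `HeapCheck` is a hypothetical helper name.

```java
public class HeapCheck {

    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run once with no options and once with e.g. -Xmx4g to compare the two values.
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

In Eclipse, the same `-Xmx` option can be entered in the VM arguments box of the run configuration.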

CoreNLP works from the command line now, but in Eclipse it still doesn't work. The project imports the dependencies through Maven:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>gov.ornl.stucco</groupId>
  <artifactId>entity-extractor</artifactId>
  <version>1.0.0</version>
  <properties>
  	<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.test.skip>true</maven.test.skip>
  	<main.class>gov.ornl.stucco.entity.EntityLabeler</main.class>
  </properties>
  <dependencies>
  	<dependency>
	    <groupId>edu.stanford.nlp</groupId>
	    <artifactId>stanford-corenlp</artifactId>
	    <version>3.5.1</version>
	</dependency>
	<dependency>
	    <groupId>edu.stanford.nlp</groupId>
	    <artifactId>stanford-corenlp</artifactId>
	    <version>3.5.1</version>
	    <classifier>models</classifier>
	</dependency>
  	<dependency>
  		<groupId>org.apache.opennlp</groupId>
  		<artifactId>opennlp-tools</artifactId>
  		<version>1.6.0</version>
	</dependency>
	<dependency>
  		<groupId>com.fasterxml.jackson.core</groupId>
  		<artifactId>jackson-databind</artifactId>
  		<version>2.7.0</version>
	</dependency>
	<dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.8.1</version>
        <scope>test</scope>
    </dependency>
  </dependencies>
  <build>
    <resources>
		<resource>
			<directory>src/main/resources</directory>
			<includes>
				<include>models/ORNL-perceptron.bin</include>
				<include>dictionaries/operating_systems.json</include>
				<include>dictionaries/relevant_terms.txt</include>
				<include>dictionaries/software_developers.json</include>
				<include>dictionaries/software_info.json</include>
				<include>dictionaries/token_label_map.ser</include>
			</includes>
		</resource>
	</resources>
    <plugins>
		<plugin>
			<groupId>org.apache.maven.plugins</groupId>
			<artifactId>maven-compiler-plugin</artifactId>
            <version>3.1</version>
			<configuration>
				<source>1.7</source>
				<target>1.7</target>
				<showDeprecation>true</showDeprecation>
				<showWarnings>true</showWarnings>
				<fork>true</fork>
			</configuration>
            <executions>
                 <execution>
                     <phase>compile</phase>
                     <goals>
                        <goal>compile</goal>
                     </goals>
                  </execution>
            </executions>
		</plugin>
		<plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <version>1.2.1</version>
        <executions>
          <execution>
            <goals>
              <goal>exec</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <executable>java</executable>
          <includeProjectDependencies>true</includeProjectDependencies>
          <includePluginDependencies>false</includePluginDependencies>
          <classpathScope>compile</classpathScope>
          <mainClass>${main.class}</mainClass>
        </configuration>
      </plugin>
     </plugins>
   </build>
</project>

I think with this it should work...

I don't really have any Eclipse-specific answers, but I think it's odd that you want to use version 3.5.1 from 8 years ago.