ClearTK/cleartk

Exception: ... type Annotation "was not declared in the XML type descriptor."

Closed this issue · 4 comments

Hi all, I am trying to put out a cTAKES release with updated dependencies, mostly for vulnerability remediation but also to bring the project into this decade. Everything was going swimmingly until I updated ClearTK to version 3.0.0.

Describe the bug
CleartkExtractor.extractBetween(..) throws a
CASRuntimeException: JCas type "org.apache.uima.jcas.tcas.Annotation" used in Java code, but was not declared in the XML type descriptor.
...
at org.cleartk.ml.feature.extractor.CleartkExtractor.extractBetween(CleartkExtractor.java:117)

To Reproduce
This involves upgrades to the ctakes 6.0.0-SNAPSHOT (main branch).

  1. Upgrade to Java 17
  2. Upgrade to UIMA and uimaFIT 3.5.0
  3. Upgrade to jcasgen-maven-plugin 3.5.0
  4. Upgrade to ClearTK 3.0.0 (from 2.0.0)
  5. mvn clean compile
  6. Run some code that uses CleartkExtractor

Expected behavior
ClearTK should execute without error, as it does in cTAKES 5.1.0 running Java 8, UIMA 2.4.0 and ClearTK 2.0.0.

Please complete the following information:

  • Windows 10 Pro, IntelliJ 2023.1.7 (Community)

Additional context
The ctakes code:

  private CleartkExtractor tokensBetween = new CleartkExtractor(
      BaseToken.class,
      new NamingExtractor1("BetweenMentions", coveredText),
      new FirstCovered(1),
      new LastCovered(1),
      new Bag(new Covered()));
  ...
  List<Feature> features = new ArrayList<Feature>();
  ...
  features.addAll(this.tokensBetween.extractBetween(jCas, arg1, arg2));

where arg1 and arg2 are declared as org.apache.uima.jcas.tcas.Annotation.

A more complete stack trace is:

Caused by: org.apache.uima.cas.CASRuntimeException: JCas type "org.apache.uima.jcas.tcas.Annotation" used in Java code,  but was not declared in the XML type descriptor.
	at org.apache.uima.cas.impl.TypeSystemImpl.throwMissingUIMAtype(TypeSystemImpl.java:2739)
	at org.apache.uima.cas.impl.TypeSystemImpl.getJCasRegisteredType(TypeSystemImpl.java:2715)
	at org.apache.uima.cas.impl.FeatureStructureImplC.<init>(FeatureStructureImplC.java:227)
	at org.apache.uima.jcas.cas.TOP.<init>(TOP.java:93)
	at org.apache.uima.jcas.cas.AnnotationBase.<init>(AnnotationBase.java:86)
	at org.apache.uima.jcas.tcas.Annotation.<init>(Annotation.java:175)
	at org.cleartk.ml.feature.extractor.CleartkExtractor.extractBetween(CleartkExtractor.java:117)
	at org.apache.ctakes.relationextractor.ae.features.TokenFeaturesExtractor.extract(TokenFeaturesExtractor.java:104)
	at org.apache.ctakes.relationextractor.ae.features.TokenFeaturesExtractor.extract(TokenFeaturesExtractor.java:38)
	at org.apache.ctakes.relationextractor.ae.RelationExtractorAnnotator.process(RelationExtractorAnnotator.java:155)
	at org.apache.ctakes.relationextractor.ae.DegreeOfRelationExtractorAnnotator.process(DegreeOfRelationExtractorAnnotator.java:63)
	at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:50)

Thanks for any help that you can provide.

Most likely your code is using a JCas cover class before it ever creates a (J)CAS object. You can probably work around the issue by creating some dummy CAS object at the start of your application. I had to do this in a ClearTK unit test as well. The underlying problem is tricky and I have no solution for it yet: apache/uima-uimaj#234

Hi Richard, thank you for the guidance. I am a little confused because about 20 annotators run before the ClearTK-using AE without any problem. I will look for something that might be related to the referenced bugs. Sean

  • The two lines before the call to CleartkExtractor.extractBetween(..) both call NamingExtractor1.extract(..) with the same arguments, and they seem to have no problems.

@seanfinan see e4ab3cc#diff-af408d648da8b83b12b599b6697d0625012924e4c420489f501269471065daa6R26-R56

There I had to add this workaround because the test was failing.

I added CasCreationUtils.createCas(); to the constructor of my implementation of JCasCollectionReader_ImplBase and it worked! Thank you!
Now I just need to make sure that it is in all the rest of my readers ...
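
For anyone landing here with the same error, one way to avoid repeating the workaround in every reader is a small once-only helper that each reader calls. The sketch below is only an illustration: CasWarmUp and ensureCasCreated are hypothetical names, not part of UIMA, ClearTK or cTAKES. It looks up CasCreationUtils.createCas() by reflection purely so that the snippet compiles without UIMA on the classpath; in a real reader you would most likely call CasCreationUtils.createCas() directly, as in the comment above. Whether a single throwaway CAS per JVM is enough in every setup is an assumption based on this thread.

```java
/**
 * Hypothetical helper (not part of UIMA, ClearTK or cTAKES).
 * Creates one throwaway CAS per JVM so that the built-in JCas cover
 * classes are initialized before any component touches them.
 */
final class CasWarmUp {

    // null until the first attempt; afterwards holds the cached outcome
    private static Boolean created;

    private CasWarmUp() {
    }

    /**
     * Attempts to create a dummy CAS on the first call only.
     * Later calls return the cached outcome without creating another CAS.
     *
     * @return true if a CAS was created, false if UIMA was not found
     *         on the classpath or the creation call failed
     */
    static synchronized boolean ensureCasCreated() {
        if (created == null) {
            created = tryCreateCas();
        }
        return created;
    }

    private static boolean tryCreateCas() {
        try {
            // Reflection is used only to keep this sketch free of compile-time
            // dependencies; the direct call is CasCreationUtils.createCas().
            Class<?> utils = Class.forName("org.apache.uima.util.CasCreationUtils");
            utils.getMethod("createCas").invoke(null);
            return true;
        } catch (ReflectiveOperationException e) {
            // UIMA missing, method missing, or createCas() itself threw
            return false;
        }
    }
}
```

Each reader's constructor (or a shared base class, if the readers have one) could then call CasWarmUp.ensureCasCreated(); and only the first call would do any work.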