dlcs/elucidate-server

Problem with type list in creator


When the creator of an annotation (WADM) has a list of multiple types, parsing of the type list appears to fail. As a result the creator isn't stored in the database, so it cannot be used for search.

Failing example input for annotation creation:

{
  "id" : "http://example.com/testanno12",
  "type" : "Annotation",
  "created" : "2017-12-15T10:42:27Z",
  "creator" : {
    "type" : ["https://github.com/anno4j/ns#Resource", "Software"],
    "name" : "Some software name"
  },
  "@context": "http://www.w3.org/ns/anno.jsonld"
}

Working example input:

{
  "id" : "http://example.com/testanno12",
  "type" : "Annotation",
  "created" : "2017-12-15T10:42:27Z",
  "creator" : {
    "type" : ["https://github.com/anno4j/ns#Resource"],
    "name" : "Some software name"
  },
  "@context": "http://www.w3.org/ns/anno.jsonld"
}

It works with a type list as long as the list has only one element. With two or more it fails.
Oddly, in my tests no creator is put into the database if at least one of the following is true:

  • creator has a list of multiple types
  • body has a list of multiple types
  • target has a list of multiple types
  • selector has a list of multiple types

The incomplete storage of information about the annotation happens very quietly. Apart from missing search results there is almost no indication of a failure: the server happily responds with a 201 Created, and the annotation itself does indeed get created.

In the postgres.log the following error occurred at some point (I have no timestamps there, so I don't know exactly when, and I am having a hard time reproducing this properly for all the problematic cases listed above):

ERROR:  current transaction is aborted, commands ignored until end of transaction block
STATEMENT:  select 1
ERROR:  invalid input syntax for type json
DETAIL:  The input string ended unexpectedly.
CONTEXT:  JSON data, line 1: ["https://github.com/anno4j/ns#Resource"
STATEMENT:  SELECT * FROM annotation_creator_create($1, $2, $3, $4, $5, string_to_array($6, ','), string_to_array($7, ',')::jsonb[], string_to_array($8, ','), string_to_array($9, ',')::jsonb[], $10, string_to_array($11, ','), string_to_array($12, ',')::jsonb[], string_to_array($13, ','), string_to_array($14, ',')::jsonb[], string_to_array($15, ','), string_to_array($16, ',')::jsonb[])

So my best guess is that when the type list string gets split, the first entry of the array is something like ["https://github.com/anno4j/ns#Resource", which is not valid JSON.
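
To illustrate the guess above, here is a minimal Java sketch that splits a serialized type list on commas, the way string_to_array(..., ',') would. The class and method names are hypothetical, and the exact string the server passes to PostgreSQL is an assumption; only the splitting behaviour is being shown.

```java
import java.util.Arrays;
import java.util.List;

public class TypeListSplitDemo {

    // Hypothetical helper: imitates splitting a serialized list on ','
    // without any awareness of JSON structure.
    static List<String> splitOnComma(String serialized) {
        return Arrays.asList(serialized.split(","));
    }

    public static void main(String[] args) {
        // Assumed serialization of the two-element type list from the failing example.
        String serializedTypes =
                "[\"https://github.com/anno4j/ns#Resource\", \"Software\"]";

        // Each fragment is printed on its own line; neither is a complete JSON value.
        for (String fragment : splitOnComma(serializedTypes)) {
            System.out.println(fragment);
        }
    }
}
```

Under these assumptions the first fragment is ["https://github.com/anno4j/ns#Resource" with an unclosed bracket, which matches the CONTEXT line in the log. A single-element list contains no comma, so it survives the split intact, which is consistent with the working example.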

The error might affect other parts as well; the creator just happened to be the part where I stumbled upon the problem.

Thanks for raising this @GermaineG, I'll have a look.

Looks like this was the result of a simple typo on my part. I've fixed it in 6d415a5.

Hi @GermaineG, this has been resolved in v.1.4.2.

Hi @danielgrant
sorry to bother you again about this issue, but I noticed that this is not completely fixed yet.

A type list in the selector still causes a problem: a NullPointerException is thrown, which leads to missing entries in at least the annotation_selector and annotation_temporal tables.

Find below a more detailed description of the problem. I can also open a new issue for it if preferred, but I think it fits here somehow.

Steps:
POST an annotation with a selector

Example (partial):

"selector" : {
      "id" : "urn:anno4j:c8996e26-7fb0-4ad7-b097-40c42184d335",
      "type" : [ "SvgSelector", "https://github.com/anno4j/ns#Resource", "https://github.com/anno4j/ns#Selector" ],
      "dcterms:conformsTo" : "http://www.w3.org/TR/SVG/",
      "value" : "<svg xmlns=\"http://www.w3.org/2000/svg\"><rect x=\"60\" y=\"15\" width=\"2490\" height=\"3290\"/></svg>"
    }

What happens:

  • The annotation gets created with HTTP status code 201 ( 👍 )
  • the annotation can be retrieved via GET ( 👍 )
  • the annotation does not appear in a temporal search ( 👎 )
  • there is an error in the server logs:
AnnotationExtractorServiceImpl - An error occurred processing W3CAnnotation [com.digirati.elucidate.common.model.annotation.w3c.W3CAnnotation@7add164b]
java.lang.NullPointerException
	at com.digirati.elucidate.service.extractor.impl.AnnotationExtractorServiceImpl.createAnnotationCssSelectors(AnnotationExtractorServiceImpl.java:167)
	at com.digirati.elucidate.service.extractor.impl.AnnotationExtractorServiceImpl.createAnnotationSelectors(AnnotationExtractorServiceImpl.java:155)
	at com.digirati.elucidate.service.extractor.impl.AnnotationExtractorServiceImpl.createAnnotationTargets(AnnotationExtractorServiceImpl.java:147)
	at com.digirati.elucidate.service.extractor.impl.AnnotationExtractorServiceImpl.processAnnotationCreate(AnnotationExtractorServiceImpl.java:60)
	at com.digirati.elucidate.infrastructure.listener.AnnotationExtractorRegisteredListenerImpl.notifyCreate(AnnotationExtractorRegisteredListenerImpl.java:24)
	at com.digirati.elucidate.infrastructure.aspect.AnnotationListenerAspect.lambda$afterCreate$0(AnnotationListenerAspect.java:101)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

Source of the problem:

The selector in the given example has more than one type (as you can see, the additional types are created by the anno4j library in this case).
The function extractSelectors returns null because of that.

The problem in the source code can be found in main.java.com.digirati.elucidate.infrastructure.extractor.selector.AbstractAnnotationSelectorExtractor:

                List<String> types = (List<String>) jsonSelector.get(JSONLDConstants.ATTRIBUTE_TYPE);
                if (types == null || types.size() != 1) {
                    return null;
                }
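
As a rough illustration of how such a null return can surface as the NullPointerException in the stack trace, here is a minimal, self-contained sketch. It is not the Elucidate code: the class and method names are hypothetical, and it assumes the caller iterates over the returned value without a null check.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class NullReturnDemo {

    // Hypothetical stand-in for an extractor that gives up (returns null)
    // unless the selector has exactly one type, like the check quoted above.
    static List<String> extractOrNull(List<String> types) {
        if (types == null || types.size() != 1) {
            return null;
        }
        return types;
    }

    // Null-safe variant: an empty list means "nothing extracted".
    static List<String> extractOrEmpty(List<String> types) {
        List<String> result = extractOrNull(types);
        return result != null ? result : Collections.<String>emptyList();
    }

    public static void main(String[] args) {
        List<String> types = Arrays.asList(
                "SvgSelector", "https://github.com/anno4j/ns#Resource");

        // Safe: the loop body simply never runs.
        for (String type : extractOrEmpty(types)) {
            System.out.println(type);
        }

        // Unsafe: a for-each over a null reference throws NullPointerException.
        try {
            for (String type : extractOrNull(types)) {
                System.out.println(type);
            }
        } catch (NullPointerException e) {
            System.out.println("NullPointerException while iterating a null result");
        }
    }
}
```

Returning an empty list would only avoid the exception. The multi-type selector would still not be stored unless the type check itself is relaxed.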

Hi @GermaineG,

The problem you've got here is that the annotation you are creating is not valid, as per the W3C Annotation Data Model:

https://www.w3.org/TR/annotation-model/#svg-selector

Specifically:

SVG Selectors must have exactly 1 type and the value must include SvgSelector.

This is the reason the code currently looks for a single "type". That said, I don't see any reason why it couldn't support multiple "type" values; it's just a case of adding the additional logic to handle something that is outwith the W3C specification.

I'll re-open this issue for further consideration.

Cheers,
Daniel.

Hi @danielgrant

Thank you for your very quick answer.
If there were only the SVG selector, I think I would disagree with you a bit.

If there can only be one type (field) and it must include SvgSelector, then in my opinion a list of types in one type field is valid, as long as one of the elements in the list is "SvgSelector".
(Not that duplicate keys would really be possible in JSON-LD, I guess.)

However, weirdly enough, the SVG selector is the only selector type where the sentence says "must include".
All other selector descriptions state:

[Selector type] must have exactly 1 type and the value must be [SelectorType]

Strange.
Since we always use SVG selectors, we were under the impression that a list would be OK. The wording for the other selectors, however, indicates that this was not intended by the specification.

Might even be worth raising an issue with the Web Annotation specification itself then.
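
For reference, the two readings discussed in this thread ("the value must be X" versus "the value must include X") can be sketched as two small checks. This is only an illustration with hypothetical names. It compares the compact type name "SvgSelector" from the example, whereas the real server may well compare expanded IRIs after JSON-LD processing.

```java
import java.util.Arrays;
import java.util.List;

public class SelectorTypeReadings {

    // "Must be": exactly one type, and it equals the expected one.
    static boolean mustBe(List<String> types, String expected) {
        return types != null && types.size() == 1 && expected.equals(types.get(0));
    }

    // "Must include": any number of types, as long as the expected one is present.
    static boolean mustInclude(List<String> types, String expected) {
        return types != null && types.contains(expected);
    }

    public static void main(String[] args) {
        // Type list taken from the selector example earlier in this thread.
        List<String> anno4jTypes = Arrays.asList(
                "SvgSelector",
                "https://github.com/anno4j/ns#Resource",
                "https://github.com/anno4j/ns#Selector");

        System.out.println("must be:      " + mustBe(anno4jTypes, "SvgSelector"));
        System.out.println("must include: " + mustInclude(anno4jTypes, "SvgSelector"));
    }
}
```

With the three-element list produced by anno4j, the strict reading rejects the selector and the lenient one accepts it.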