SPARQL-Anything/sparql.anything

`fx:literal` should work for longer lang tags

Closed this issue · 4 comments

According to the docs (https://sparql-anything.readthedocs.io/en/stable/FUNCTIONS_AND_MAGIC_PROPERTIES/#fxliteral), `fx:literal` interprets its second argument ?b as:

  • a datatype, producing a typed literal (if an IRI is given)
  • a lang code (if a string of length two is given).

However, lang tags can be much longer, e.g. `en-US`, `x-byzantin-Latn`, `zh-Latn-pinyin-x-notone`
(see e.g. https://vocab.getty.edu/doc/#IANA_Language_Tags).

I would suggest:

  • interpreting any string ?b as a lang tag (since a datatype must be given as an IRI)
  • validating whether it matches the syntax for a lang tag.
    • rdf4j has functions to check this, e.g. see eclipse-rdf4j/rdf4j#3695
    • I guess Jena has similar checks
    • If not, I can help to write such a check
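
A minimal sketch of what such a syntax check could look like, in plain Java. It assumes the loose `LANGTAG` production from the SPARQL 1.1 and Turtle grammars (`[a-zA-Z]+ ('-' [a-zA-Z0-9]+)*`) rather than full BCP 47 validation, so it accepts some tags that a strict BCP 47 parser would reject. The class and method names are hypothetical and are not part of SPARQL Anything, Jena, or rdf4j.

```java
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class LangTagCheck {

    // Body of the LANGTAG production from the SPARQL 1.1 / Turtle grammars,
    // without the leading '@': letters first, then '-'-separated
    // alphanumeric subtags. This is looser than BCP 47.
    private static final Pattern LANGTAG =
            Pattern.compile("[a-zA-Z]+(-[a-zA-Z0-9]+)*");

    private LangTagCheck() {
    }

    // Returns true if the string is syntactically usable as a lang tag.
    // Null and empty strings are rejected.
    static boolean isWellFormed(String tag) {
        return tag != null && LANGTAG.matcher(tag).matches();
    }
}
```

Under this assumption, all three examples above (`en-US`, `x-byzantin-Latn`, `zh-Latn-pinyin-x-notone`) pass, while an empty string or `en_US` is rejected.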

I don't know whether Jena has a check for the literal (I could not find one in the code).

If not, I can help to write such a check

Sure, feel free to have a go. Here's the pointer to the function:

validate whether it matches the syntax for a lang tag.

I will open a different issue for this

@enridaga, v1.0-DEV already does that: https://github.com/SPARQL-Anything/sparql.anything/blob/v1.0-DEV/sparql-anything-engine/src/main/java/io/github/sparqlanything/engine/functions/Literal.java#L33 checks for `!v2.getString().isEmpty()` but doesn't impose a length limit.

Close then?

Yep, we can close this. I opened #465 in case we want to add lang validation later.