rdfjs/data-model-spec

Support for prefix map in .namedNode()

Closed this issue · 20 comments

It'd be nice to have factory.namedNode('ex:test') be translated to a NamedNode with value http://example.com#test according to the map {ex: 'http://example.com#'}.

We've forked rdf-data-model and added a basic implementation of this in https://github.com/beautifulinteractions/rdf-data-model. The fork adds the following methods:

  • DataFactory .addPrefix(String prefix, String namespace)
  • DataFactory .addPrefixes(Object prefixMap)
  • DataFactory .delPrefix(String prefix)
  • DataFactory .delPrefixes(Object prefixMap)
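
To make the proposal concrete, here is a minimal sketch of how the four methods and the expanding .namedNode() could fit together. It is not the code from the fork: the class name PrefixAwareFactory, the internal Map, and the plain-object term are all assumptions made for illustration.

```javascript
// Hypothetical sketch only; not the implementation from the fork mentioned above.
class PrefixAwareFactory {
  constructor() {
    // prefix -> namespace IRI
    this._prefixes = new Map();
  }

  addPrefix(prefix, namespace) {
    this._prefixes.set(prefix, namespace);
  }

  addPrefixes(prefixMap) {
    for (const [prefix, namespace] of Object.entries(prefixMap)) {
      this.addPrefix(prefix, namespace);
    }
  }

  delPrefix(prefix) {
    this._prefixes.delete(prefix);
  }

  delPrefixes(prefixMap) {
    for (const prefix of Object.keys(prefixMap)) {
      this.delPrefix(prefix);
    }
  }

  // Expands 'ex:test' when 'ex' is registered; otherwise keeps the value as given.
  namedNode(value) {
    const colon = value.indexOf(':');
    if (colon !== -1) {
      const namespace = this._prefixes.get(value.slice(0, colon));
      if (namespace !== undefined) {
        value = namespace + value.slice(colon + 1);
      }
    }
    // Plain object standing in for a real NamedNode term.
    return { termType: 'NamedNode', value };
  }
}

const factory = new PrefixAwareFactory();
factory.addPrefixes({ ex: 'http://example.com#' });
const expanded = factory.namedNode('ex:test');

factory.delPrefix('ex');
const untouched = factory.namedNode('ex:test');

console.log(expanded.value);  // http://example.com#test
console.log(untouched.value); // ex:test
```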

Seems like good functionality to have. Some questions though:

  1. Does this belong in the low-level or high-level interface?
  2. How to decide when to expand? (e.g., what if I've registered the http prefix?)
  3. Why del instead of remove?

Regarding 1., I don't think that this is necessary for the low-level interface (but very useful to offer for high-level interfaces).

  1. I'd say low-level. Personally, I find RDF without prefixes to be highly impractical.
  2. Always expand if whatever comes before the first : matches a known prefix.
  3. No real reason. remove is much better.

  1. This puts a burden on the low-level library, having to check for prefixes every time. I agree on it being impractical, but then again, the low-level library is not designed for that purpose.
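
The "always expand on the first colon" rule and the earlier question about registering an http prefix can be tried out with a small sketch. The function name expandByFirstColon and both prefix mappings are assumptions chosen for illustration; the point is only that the rule, taken literally, also rewrites full IRIs.

```javascript
// Sketch of the rule from the thread: expand when the text before the
// first ':' is a registered prefix. The function name is hypothetical.
function expandByFirstColon(value, prefixes) {
  const colon = value.indexOf(':');
  if (colon === -1) return value;
  const prefix = value.slice(0, colon);
  return Object.prototype.hasOwnProperty.call(prefixes, prefix)
    ? prefixes[prefix] + value.slice(colon + 1)
    : value;
}

// Illustrative map that registers both 'ex' and 'http' as prefixes.
const ambiguousPrefixes = {
  ex: 'http://example.com#',
  http: 'http://www.w3.org/2011/http#',
};

const fromCurie = expandByFirstColon('ex:test', ambiguousPrefixes);
const fromFullIri = expandByFirstColon('http://example.com/a', ambiguousPrefixes);

console.log(fromCurie);   // http://example.com#test
console.log(fromFullIri); // http://www.w3.org/2011/http#//example.com/a
```

With http registered, an already absolute IRI is mangled, and every call pays for the lookup whether or not the caller used a prefix.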

I see your point and I agree, although I'm conflicted as I'd still like to have prefix resolution in .namedNode() based on how I have been using it so far. Nonetheless, you're right.

An alternative implementation for the low-level interface - just throwing some ideas out there - would be to have a dedicated method for expansion String .expand(String curie).

+1 for generic expansion method in any case (in addition to inline expansion in some other cases)

  • +1 for del -> remove
  • How can the prefix map be read? Can we define a property prefixes with a read only object for the key/values?
  • Should we make the distinction between low and high level or just make it optional like variable? I would vote for optional. The property prefixes or the method expand could be used to detect support for that feature.
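
If prefix support were optional, callers could detect it roughly as suggested above. This is a sketch under the assumption that a supporting factory exposes either an expand method or a prefixes object; supportsPrefixes and both sample factories are hypothetical.

```javascript
// Hypothetical feature detection for an optional prefix feature.
function supportsPrefixes(factory) {
  return typeof factory.expand === 'function' ||
    (typeof factory.prefixes === 'object' && factory.prefixes !== null);
}

// Stand-in factories; real ones would also provide literal(), quad(), etc.
const plainFactory = {
  namedNode: (value) => ({ termType: 'NamedNode', value }),
};
const prefixFactory = {
  namedNode: (value) => ({ termType: 'NamedNode', value }),
  prefixes: { ex: 'http://example.com#' },
};

console.log(supportsPrefixes(plainFactory));  // false
console.log(supportsPrefixes(prefixFactory)); // true
```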

Prefix handling could be in low-level API in my view. It is a very general and usable feature, not dependent on specific use cases or ontologies. Why ask every developer to reinvent a way to add and remove prefixes? They should be able to throw data at the factory with as little preparation as possible, and expect every method to handle prefixes instead of doing it themselves.
I don't know if that raises too much difficulty for the implementation on our side...

Prefix handling could be in low-level API in my view.

We're not advocating the opposite; the question is rather whether every method (such as the factory methods) should do this by default.

Why ask every developer to reinvent a way to add and remove prefixes?

That's the wrong question—we're asking every developer already to implement a library anyways 😉 Recall that RDF/JS is a specification; implementers will still need to write this anyway.

The question is not: should libraries implement it? The question is: should the low-level API mandate automatic expansion at every step for all libraries?

They should be able to throw data at the factory with as little preparation as possible, and expect every method to handle prefixes instead of doing it themselves.

Some libraries might very well choose to do so; libraries that favor speed might choose not to.

That's the wrong question—we're asking every developer already to implement a library anyways 😉 Recall that RDF/JS is a specification; implementers will still need to write this anyway.

Got the distinction, and I still think that it is preferable to handle prefixes at the lowest level possible.
Also, from a practical point of view the spec will come with some reference implementations made by the same guys who are in the task force and most people will stick to those ones.

The question is not: should libraries implement it? The question is: should the low-level API mandate automatic expansion at every step?

Yes on the reframed question.

Some libraries might very well choose to do so; libraries that favor speed might choose not to.

You're right, I didn't think about that. The main cons are probably 1. speed and 2. added complexity. The pro is usability.

I vote for a dedicated String .expand(String curie) method at the low-level (in addition to the (add|remove)Prefix(es) methods) and for opening a new issue about prefixes at the high-level. The method should return either the expanded form of the string (if a known prefix is found) or the original string (if no known prefix is found).
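
The behaviour described here (return the expanded form when the prefix is known, the original string otherwise) could look roughly like the sketch below. createPrefixRegistry and the plain-object namedNode are assumptions for illustration, not part of any RDF/JS specification.

```javascript
// Hypothetical registry offering addPrefix/removePrefix plus a dedicated expand().
function createPrefixRegistry() {
  const map = new Map();
  return {
    addPrefix(prefix, namespace) {
      map.set(prefix, namespace);
    },
    removePrefix(prefix) {
      map.delete(prefix);
    },
    // Returns the expanded IRI for a known prefix, or the input unchanged.
    expand(curie) {
      const colon = curie.indexOf(':');
      if (colon === -1) return curie;
      const namespace = map.get(curie.slice(0, colon));
      return namespace === undefined ? curie : namespace + curie.slice(colon + 1);
    },
  };
}

// Stand-in for a factory method that performs no expansion of its own.
const namedNode = (value) => ({ termType: 'NamedNode', value });

const registry = createPrefixRegistry();
registry.addPrefix('ex', 'http://example.com#');

const known = namedNode(registry.expand('ex:test'));
const unknown = namedNode(registry.expand('foaf:name'));

console.log(known.value);   // http://example.com#test
console.log(unknown.value); // foaf:name
```

Expansion is explicit at the call site, so .namedNode() itself never has to inspect its argument.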

  • +1 to a general .expand() method
  • +1 to @bergos's suggestion of a read-only prefixes property to access the prefix map
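
One possible way to expose a read-only prefixes property is a getter that returns a frozen snapshot. This is only a sketch: createFactoryWithPrefixes is a hypothetical name, and a real implementation might prefer a live read-only view over a copy.

```javascript
// Hypothetical factory fragment with a read-only `prefixes` property.
function createFactoryWithPrefixes() {
  const map = new Map();
  return {
    addPrefix(prefix, namespace) {
      map.set(prefix, namespace);
    },
    removePrefix(prefix) {
      map.delete(prefix);
    },
    // Each read returns a fresh frozen copy, so callers cannot alter the map.
    get prefixes() {
      return Object.freeze(Object.fromEntries(map));
    },
  };
}

const readOnlyDemo = createFactoryWithPrefixes();
readOnlyDemo.addPrefix('ex', 'http://example.com#');

const snapshot = readOnlyDemo.prefixes;
console.log(snapshot.ex);               // http://example.com#
console.log(Object.isFrozen(snapshot)); // true
```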

I'd prefer to see any sort of context-specific mappings done at a higher level and keep the low level library simple and clean.

Not everyone needs or wants to use prefixes. I've also found that supporting curies/prefixes seems to promote additional complexity creeping into other places you weren't originally expecting -- in order to make everything work in a consistent and useful way.

Please keep prefixes away from the low-level API. This is a bit like opening Pandora's box because there are tons of shortcuts and "it'd be nice to have this" features that would easily bog down what really should be a skeletal interface for creating, storing, (querying?) and processing RDF in JavaScript.

Furthermore, any notion of using context-sensitive or syntax-specific strings should belong to their own family of methods -- why pollute the .namedNode method? If someone wants to use a prefixed name, it's not like they don't know whether or not it's an expanded IRI; so why make the computer deduce the format of the string every time? So that developers can be haphazard and lazy? I'm not convinced.

Polluting .namedNode is a bad idea. @blake-regalia, @dlongley, @RubenVerborgh - you're absolutely right. A dedicated method is a much better solution.

As for the low-level vs. high-level issue, as an application developer (and thus an end user of the RDF/JS spec) I'd personally like to see support for prefixes as close to my entry point in the RDF world as possible. My impression from working on node-quadstore is that application developers will adopt libraries implementing the low-level spec regardless of whether they will also adopt libraries implementing the high-level spec. If this is the case, then I believe support for prefixes belongs to the low-level spec. It would not make sense to have prefixes as first-class citizens in every major component of the RDF landscape (SPARQL, Turtle, JSON-LD, ...) and have no support for them in RDF/JS. All those components can be used without prefixes, sure. However, in my limited experience as an application developer, doing so quickly diminishes the whole experience.

That said, I'm very new to this world and very happy to outgrow any wrong assumption or downright silly idea I have developed so far. Everything is very much IMHO with emphasis on the H.

Consider the discussions we've had about similar topics in the past and how they've ultimately, more or less, led to the conclusion that consistency and simplicity are paramount for a low-level API.

  • The 'prefix' event: #97
  • Literal's .datatype property (I actually switched camps on this one): #78, #93, and #83

Another thing to keep in mind is that everyone has different use cases, and some of those have no need for prefixes at all. A bulky or over-complicated interface may lead to a general lack of adoption or partial implementations when what we are striving for is universal compatibility. This is a bit of an ongoing discussion about the goals of the low-level vs high-level APIs -- but in my opinion, too many bells and whistles being crammed into a so-called 'low-level' API spells certain doom.

@blake-regalia thank you for pointing me to those discussions. I think I need to do more research on the low-level vs. high-level aspect of RDF/JS before returning to this issue. We'll implement @RubenVerborgh's and @bergos' suggestions in our own fork of rdf-data-model in the meantime and update it to match any consensus reached by the workgroup in the future.

l00mi commented

If I get the gist, this issue is about what should be considered for the low-level API with regard to prefix extension. Please open another issue if you would like to propose more functions for the high-level API.

Noticing this issue in rdflib.js linkeddata/rdflib.js#245
@jacoscaz how do you see using different prefix maps? Would each instance of DataFactory keep a prefix map specific to it, so that to use different prefix maps one would just create different instances of DataFactory? So to serialize the same dataset twice using two different prefix maps, one would simply pass a different instance of DataFactory to a parser each time?
Looking at the issue of passing baseIri to the parser/serializer, https://github.com/rdfjs/representation-task-force/issues/96, it seems to me that the options parameter mentioned there could also handle an optional prefix map. This way seems cleaner to me: passing around an instance of some PrefixMap, rather than having an instance of DataFactory with a particular prefix map attached to it.
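
A rough sketch of the options-based approach: the prefix map travels with each call rather than living on the factory. serializeNamedNode and the shape of the options object are assumptions for illustration; the sketch does not check whether the local part is a valid prefixed-name local part.

```javascript
// Hypothetical serializer helper that takes the prefix map via an options object.
function serializeNamedNode(term, options = {}) {
  const prefixes = options.prefixes || {};
  for (const [prefix, namespace] of Object.entries(prefixes)) {
    if (term.value.startsWith(namespace)) {
      return `${prefix}:${term.value.slice(namespace.length)}`;
    }
  }
  // No matching namespace: fall back to the full IRI in angle brackets.
  return `<${term.value}>`;
}

// The same term serialized with two different prefix maps and with none.
const sampleTerm = { termType: 'NamedNode', value: 'http://example.com#test' };

const withEx = serializeNamedNode(sampleTerm, { prefixes: { ex: 'http://example.com#' } });
const withE = serializeNamedNode(sampleTerm, { prefixes: { e: 'http://example.com#' } });
const withoutMap = serializeNamedNode(sampleTerm);

console.log(withEx);     // ex:test
console.log(withE);      // e:test
console.log(withoutMap); // <http://example.com#test>
```

No state is attached to the factory, so serializing the same data with two different maps needs no second DataFactory instance.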

@elf-pavlik apologies for the delay, I hadn't seen the email notification about your message. Yes, I think using options would be a good way to address this. I had initially thought about attaching to DataFactory but I dislike the resulting state encapsulation of that. +1 for options!

Closed based on the resolution in #136