json-ld/yaml-ld

sigil: change prefix char in context

Closed this issue · 13 comments

As an information architect and developer,
I want to write YAML-LD keywords using identifier chars accepted in YAML and in my programming language.
So that I can access them using dot notation, rather than using "string index" bracketed notation.

Example: if the prefix char is _ :

  • I can have more readable YAML without quoting keys (see Example with context below)
  • I can write this in JavaScript:
label._none // with Language Indexing, if the label has no lang tag
label._value // with langString

instead of

label["@none"]
label["@value"]

(see JSON-LD Language Indexing)


JSON-LD takes a per-keyword approach, i.e. you can define keyword aliases, e.g.:

"@context":{
  "type":"@type",
  "id":"@id",
  "lang":"@language",
  "none":"@none"
}
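A minimal sketch of what these aliases buy you in JavaScript. The `label` term definition (a language map on `rdfs:label`) and the sample data are assumptions made for illustration; only the alias entries come from the context above.

```javascript
// Hypothetical document: the rdfs:label mapping and the data are assumed.
const doc = {
  "@context": {
    id: "@id",
    type: "@type",
    none: "@none",
    label: {
      "@id": "http://www.w3.org/2000/01/rdf-schema#label",
      "@container": "@language",
    },
  },
  id: "http://example.org/resource/bart",
  label: { none: "Bart", en: "Bart Simpson" },
};

// With aliases, keyword-like keys are plain identifiers: dot notation works.
const viaAlias = doc.label.none;

// Without aliases, the same data needs bracket notation.
const raw = {
  "@id": "http://example.org/resource/bart",
  label: { "@none": "Bart" },
};
const viaBracket = raw.label["@none"];
```

The cost is that every keyword has to be aliased one by one, which is what a single prefix option would avoid.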

A more uniform way in YAML-LD could be to specify the prefix char ($ or _ or even empty) with an option.

Example with context (TODO make more)

  "@context":
    "@sigil": $
    $base: http://example.org/resource/
    $vocab: http://example.org/ontology/
  $graph:
    $id: bart
    spouse: marge

TODO: is there any way to avoid the use of @ in the first 2 lines?
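One way to picture the proposal is as a preprocessing step that runs on the parsed YAML before any JSON-LD processing. The sketch below is only an illustration under assumptions: `@sigil` is the option proposed in this issue (not a JSON-LD keyword), `applySigil` and the `KEYWORDS` list are hypothetical names, and only keys are rewritten (keyword values such as `"@container": "@id"` are left alone).

```javascript
// Partial list of keywords, written without their "@" (assumed for the sketch).
const KEYWORDS = new Set([
  "base", "vocab", "graph", "id", "type", "value",
  "language", "none", "container", "context",
]);

// Rewrite keys such as "$id" to "@id", using the sigil declared in the context.
function applySigil(node, sigil = "@") {
  if (Array.isArray(node)) return node.map((item) => applySigil(item, sigil));
  if (node === null || typeof node !== "object") return node;

  // A context may set the sigil for this node and everything below it.
  const ctx = node["@context"];
  if (ctx && typeof ctx === "object" && typeof ctx["@sigil"] === "string") {
    sigil = ctx["@sigil"];
  }

  const out = {};
  for (const [key, value] of Object.entries(node)) {
    if (key === "@sigil") continue; // consumed by this preprocessing step
    const bare = key.startsWith(sigil) ? key.slice(sigil.length) : null;
    const newKey = bare !== null && KEYWORDS.has(bare) ? "@" + bare : key;
    out[newKey] = applySigil(value, sigil);
  }
  return out;
}

// The example above, as it would look after YAML parsing.
const parsed = {
  "@context": {
    "@sigil": "$",
    $base: "http://example.org/resource/",
    $vocab: "http://example.org/ontology/",
  },
  $graph: { $id: "bart", spouse: "marge" },
};

const plainJsonLd = applySigil(parsed);
// plainJsonLd uses "@base", "@vocab", "@graph" and "@id"; "spouse" is untouched.
```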

@ioggstream and @anatoly-scherbakov I tried \ escaping but it doesn't work at https://onlineyamltools.com/convert-yaml-to-json, maybe that convertor is non-conforming?

\@context:
  \@sigil: $
  $base: http://example.org/resource/
  $vocab: http://example.org/ontology/
$graph:
  $id: bart
  spouse: marge

JSON Schema includes some $ keywords:
$schema, $vocabulary, $defs, $ref, $id, $anchor, $comment, $dynamicRef, $dynamicAnchor

  • on one hand, if we come up with some YAML beast to combine YAML-LD Context and YAML Schema (see #19, I think @OR13 uses such things) it would be nice to use the same sigil for keywords
  • on the other hand, we should look out for conflicts
    • @id is a conflict with $id
    • @vocab is a near-conflict with $vocabulary (i.e. could be confusing)
  • But maybe there is no problem if these keywords are localized to the Context vs Schema parts?
    • After all, @id is already "overloaded" in JSON-LD:
"@container": "@id" # Node Identifier Indexing
"@id": "bart"             # Node identifier
"@id": {"@id": "bart", "age": 42}  # triple, for which RDF-star annotations will follow

Related issues:

  • Even if #42 is rejected, this still applies since it affects YAML use in programs
  • I think we can close #9 because it has a lot of great discussion but it's unfocused: @ioggstream do you agree?
OR13 commented

IMO, it's better to conform to string syntax than translate... dot notation isn't worth the complexity.

I prefer to reuse as much JSON-LD syntax as possible with YAML-LD.

IMO, adding another syntax will just make RDF usability even worse.

@VladimirAlexiev I think you are right, the \@ preserves the \ in pyyaml. It is possible that when I tested it I had some code that removed the \.

I will fix the examples in the spec. It is key to add the test code to the repo so that anyone can run the same tests. Sorry for the mistake.

I agree on closing the discussion on $ vs @ since people that want to use $ can do it via the context you propose.
IMO, I wouldn't suggest that approach because of possible clashes with current or future specifications.
When I see "@" I know it's LD, when I see "$" I know it's JSON Schema... Since I travel these boundaries quite often (see https://ioggstream.github.io/draft-polli-restapi-ld-keywords/draft-polli-restapi-ld-keywords.html) I prefer to avoid this kind of workaround.

Q: any luck in unreserving "@" in YAML? We could propose to use another UTF-8 char for that :P @VladimirAlexiev

@OR13

dot notation isn't worth the complexity

It's not only about field access in programming languages, but also about YAML readability. Edited the description.

reuse as much JSON-LD syntax as possible with YAML-LD.

@ is the default.

Adding another syntax will just make RDF usability even worse

We're talking YAML usability.
JSON-LD already allows you to alias keywords, which is crucially important if you want the ability to interpret a given JSON as a good RDF.
This issue just gives a uniform way to alias keywords.

@ioggstream

When I see "@" I know it's LD, when I see "$" I know it's JSON Schema... I travel these boundaries quite often

Wouldn't it be nice to "construct a path" so you don't need to cross any boundaries, and can think about your data model rather than the various modeling mechanisms?

Orie's LD additions use $, Roberto's use x-jsonld-context to escape into JSON-LD, then @

any luck in unreserving "@" in YAML

No movement on yaml/yaml-spec#286

As I just noted in #11, I suggest we use contexts instead of some hard coded option to convert keywords. This is a much more generic mechanism. It seems fair to me to say that YAML-LD is actually a tool to build Semantic DSLs (Domain Specific Languages). Just like YAML is normally used: every configuration file or data file format based on YAML is actually a DSL.

So the developers of YAML-LD aware systems will likely conceive their own syntaxes for these DSLs. Our Convenience Context would be merely a reusable example for such.

Thoughts?

OR13 commented

Technically my additions are to OAS / JSON Schema, which is why we chose $.

Where that is the convention.

@anatoly-scherbakov I agree with you, that option (e.g. @sigil) should be specified in the context. Edited the description.

@gkellogg Can you come up with some twisted examples where the sigil is switched midway in a YAML file?
Can this lead to some ambiguities?

(You can switch @base and @prefix midway in a Turtle file but I never do it, as I value my sanity :-))

@BigBlueHat's idea is to create a "Convenience Context" with bare-word terms for keywords, which is already a common practice. Put it in the best practices document, and it handles most use cases.

See his context at https://github.com/json-ld/convenience-context.

@gkellogg Can you come up with some twisted examples where the sigil is switched midway in a YAML file? Can this lead to some ambiguities?

There are JSON-LD tests for this; generally term definitions are additive, although they can be removed by defining as null, or adding a completely null context. It doesn't pose any issues for proper interpretation. If people are worried about this as an attack vector, Protected Term Definitions provide a way to control this.
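A simplified model of that behaviour, to make the "switched midway" case concrete. This is a sketch only and not the JSON-LD context processing algorithm: `mergeContexts` is a hypothetical helper, it handles simple string definitions and `null`, and it ignores protected terms.

```javascript
// Combine successive contexts in order, the way scoped or later contexts stack up.
function mergeContexts(contexts) {
  let active = {};
  for (const ctx of contexts) {
    if (ctx === null) {
      active = {}; // a null context wipes everything defined so far
      continue;
    }
    for (const [term, definition] of Object.entries(ctx)) {
      // Definitions are additive; a null definition unmaps the term.
      active[term] = definition;
    }
  }
  return active;
}

// Switching from "$id" to "_id" midway: the later context adds "_id"
// and explicitly unmaps "$id".
const switched = mergeContexts([
  { $id: "@id", $type: "@type" },
  { _id: "@id", $id: null },
]);

// A null context in the middle resets all earlier aliases.
const reset = mergeContexts([{ $id: "@id" }, null, { id: "@id" }]);
```

In this model a key is always interpreted against whatever the active context is at that point, so switching sigils is confusing for humans but not ambiguous for a processor.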

@BigBlueHat The "Convenience Context" is a great idea.
But @gkellogg when did you guys decide to strip the special char instead of using $?

  • I, and I think also @anatoly-scherbakov @OR13 @ioggstream, prefer $
  • Looking at the "Convenience Context", there are a great many bare words that people may want to use in their data eg
    id type value version
    • for a related example, see schemaorg/schemaorg#1553: the schema.org context clobbers <geo:51.36824,-0.40229> into <http://schema.org/geo51.36824,-0.40229> :-(
  • JSON-LD uses a special char for all its keywords, and I think we should also use a special char for YAML-LD keywords
  • Using $ would match the JSON Schema practice.
    $schema, $vocabulary, $defs, $ref, $id, $anchor, $comment, $dynamicRef, $dynamicAnchor
    • We only need to decide how to resolve the conflict on $id
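The `geo:` clobbering mentioned above can be reproduced with a naive model of compact IRI expansion, in which any defined term may act as a prefix. This is an assumption about how the problem arises, not the full JSON-LD IRI expansion algorithm; `expandIri` is a hypothetical helper.

```javascript
// Naive compact-IRI expansion: "prefix:suffix" is expanded whenever
// "prefix" is a term with a string definition in the context.
function expandIri(value, context) {
  const colon = value.indexOf(":");
  if (colon > 0) {
    const prefix = value.slice(0, colon);
    const suffix = value.slice(colon + 1);
    // "scheme://..." is treated as an absolute IRI and left alone.
    if (!suffix.startsWith("//") && typeof context[prefix] === "string") {
      return context[prefix] + suffix;
    }
  }
  return value;
}

// A bare-word term "geo" collides with the "geo:" URI scheme.
const clobbered = expandIri("geo:51.36824,-0.40229", {
  geo: "http://schema.org/geo",
});
// clobbered is "http://schema.org/geo51.36824,-0.40229"

// Without a "geo" term the value is left intact.
const intact = expandIri("geo:51.36824,-0.40229", {});
```

Bare-word keyword aliases such as `id` or `type` carry the same kind of risk of colliding with names people want to use in their data, which is the argument for keeping a special char.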

Note: #85 is closely related.

OR13 commented

+1 to starting with $

@VladimirAlexiev said:

But @gkellogg when did you guys decide to strip the special char instead of using $?

This would have been discussed in a meeting, and you can see all the minutes here. The thought was that this feature can be achieved easily using a context; otherwise, it would require more than simply transforming YAML to the internal representation, and could be subject to other errors.

By creating aliases for say $id as an alias for @id in the context, authors can use $id terms. The limitation is that this cannot be done within the term definition in a context, itself, which seemed like a reasonable limitation.

Otherwise, the processing steps are more involved, and potentially would deal with conflicts in overlapping terms (i.e., if $id happened to be defined differently for some odd reason). Simplest thing is to not change the basic JSON-LD syntax, just allow it to be expressed in YAML.
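To illustrate the limitation: aliases work in the data, but the keys inside a term definition still have to be the real `@` keywords. The context below and the `aliasKeysInTermDefinitions` check are hypothetical, written only to show where `$id` may and may not appear.

```javascript
// Aliases are declared at the top level of the context...
const context = {
  $id: "@id",
  $type: "@type",
  // ...but a term definition itself must still use "@id" and "@type".
  spouse: { "@id": "http://example.org/ontology/spouse", "@type": "@id" },
};

// Hypothetical lint: report alias keys used inside term definitions.
function aliasKeysInTermDefinitions(ctx) {
  const aliases = new Set(
    Object.entries(ctx)
      .filter(([, def]) => typeof def === "string" && def.startsWith("@"))
      .map(([term]) => term)
  );
  const problems = [];
  for (const [term, def] of Object.entries(ctx)) {
    if (def === null || typeof def !== "object") continue;
    for (const key of Object.keys(def)) {
      if (aliases.has(key)) problems.push(`${term}.${key}`);
    }
  }
  return problems;
}

const ok = aliasKeysInTermDefinitions(context); // no problems reported

const bad = aliasKeysInTermDefinitions({
  $id: "@id",
  spouse: { $id: "http://example.org/ontology/spouse" },
}); // reports "spouse.$id"
```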

Current Convenience Context in the spec does use $.

@anatoly-scherbakov @gkellogg
The spec now refers to

I think that's enough, so would be happy to close this. Do you agree?

@VladimirAlexiev thank you for pointing attention to this issue! I do agree; I believe we've arrived at a solution (via contexts), and provided example contexts. So I am closing this.