mknd-io/ontologies

Generating a datomic schema from mekanoid.ttl

Closed this issue · 0 comments

I concatenated https://github.com/mknd-io/schema/blob/main/datasets/mekanoid-app.ttl with https://github.com/mknd-io/schema/blob/main/ontologies/mekanoid/d3fend-mknd.ttl and https://github.com/mknd-io/schema/blob/main/ontologies/mekanoid/mekanoid.ttl to create a single TTL file.

The only issue I encountered was a reference to xsd:longString, which does not resolve (it is not one of the XML Schema built-in datatypes). I changed it to xsd:string in my test, and can open a PR with that change if it makes sense.

https://github.com/mknd-io/schema/blob/dd005b061a17655c9e4dcdb45bd30beffe2115ea/ontologies/mekanoid/mekanoid.ttl#L86
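A check like the one that surfaced xsd:longString can be sketched in a few lines of Python. This is not the code I used (that is Clojure); it is a minimal illustration that scans concatenated Turtle text for `xsd:` names outside the XML Schema built-in datatypes. The function name, the `ex:` properties, and the sample text are all made up for the example, and the regex does not try to skip comments or string literals.

```python
import re

# Built-in datatypes from XML Schema Part 2, plus the three added in XSD 1.1.
XSD_BUILTINS = {
    "string", "boolean", "decimal", "float", "double", "duration",
    "dateTime", "time", "date", "gYearMonth", "gYear", "gMonthDay", "gDay",
    "gMonth", "hexBinary", "base64Binary", "anyURI", "QName", "NOTATION",
    "normalizedString", "token", "language", "NMTOKEN", "NMTOKENS", "Name",
    "NCName", "ID", "IDREF", "IDREFS", "ENTITY", "ENTITIES", "integer",
    "nonPositiveInteger", "negativeInteger", "long", "int", "short", "byte",
    "nonNegativeInteger", "unsignedLong", "unsignedInt", "unsignedShort",
    "unsignedByte", "positiveInteger",
    "dateTimeStamp", "dayTimeDuration", "yearMonthDuration",
}


def unknown_xsd_terms(turtle_text):
    """Return {term: [line numbers]} for xsd: names that are not built-ins."""
    found = {}
    for lineno, line in enumerate(turtle_text.splitlines(), start=1):
        # The @prefix declaration ("xsd: <...>") has no name after the colon,
        # so it never matches this pattern.
        for name in re.findall(r"\bxsd:([A-Za-z]\w*)", line):
            if name not in XSD_BUILTINS:
                found.setdefault("xsd:" + name, []).append(lineno)
    return found


# Hypothetical sample text; ex:notes and ex:title are not real Mekanoid terms.
sample = """@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:notes rdfs:range xsd:longString .
ex:title rdfs:range xsd:string .
"""

print(unknown_xsd_terms(sample))
```

Reporting line numbers makes it easy to jump to the offending triple in the concatenated file.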

Here is the resulting Clojure namespace, generated by code, containing all of the resources:
https://gist.github.com/aamedina/4ba6f63d9c7eb3c2c73d4a456af5923f

This file gets loaded (along with d3fend and a bunch of other RDF vocabularies which also exist as Clojure namespaces) when I start the system via a Clojure REPL. Then I generate a schema for Datomic based on all of the loaded RDF models and can play with it interactively.
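To give a rough idea of what "generating a Datomic schema from RDF models" involves, here is a small Python sketch (again, not my actual Clojure code). It turns simplified property descriptions into Datomic-style attribute maps. The XSD-to-Datomic mapping, the choice of cardinality many unless a property is functional, the fallback to string for unknown datatypes, and the `ex:` properties are all assumptions made for illustration.

```python
# Assumed mapping from XSD datatypes to Datomic value types (illustrative only).
XSD_TO_DATOMIC = {
    "xsd:string": ":db.type/string",
    "xsd:boolean": ":db.type/boolean",
    "xsd:long": ":db.type/long",
    "xsd:integer": ":db.type/bigint",
    "xsd:decimal": ":db.type/bigdec",
    "xsd:double": ":db.type/double",
    "xsd:float": ":db.type/float",
    "xsd:dateTime": ":db.type/instant",
    "xsd:anyURI": ":db.type/uri",
}


def ident(curie):
    """Turn a prefixed name such as 'ex:title' into a keyword-like ':ex/title'."""
    prefix, local = curie.split(":", 1)
    return f":{prefix}/{local}"


def property_to_attribute(prop):
    """Build one Datomic-style attribute map from a simplified property dict."""
    if prop["type"] == "owl:ObjectProperty":
        value_type = ":db.type/ref"
    else:
        # Assumption: unknown datatypes fall back to string.
        value_type = XSD_TO_DATOMIC.get(prop.get("range"), ":db.type/string")
    # Assumption: RDF properties are multi-valued unless marked functional.
    cardinality = (
        ":db.cardinality/one" if prop.get("functional") else ":db.cardinality/many"
    )
    attr = {
        ":db/ident": ident(prop["id"]),
        ":db/valueType": value_type,
        ":db/cardinality": cardinality,
    }
    if "comment" in prop:
        attr[":db/doc"] = prop["comment"]
    return attr


# Hypothetical properties; these are not taken from mekanoid.ttl.
properties = [
    {"id": "ex:title", "type": "owl:DatatypeProperty", "range": "xsd:string",
     "functional": True, "comment": "A title for the resource."},
    {"id": "ex:partOf", "type": "owl:ObjectProperty", "range": "ex:Thing"},
    {"id": "ex:notes", "type": "owl:DatatypeProperty", "range": "xsd:longString"},
]

schema = [property_to_attribute(p) for p in properties]
for attr in schema:
    print(attr)
```

In this sketch an unresolvable range such as xsd:longString silently becomes a string attribute, which hides the problem; raising an error there instead would turn the generator into the sanity check described in this issue.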

I have found this helpful as a sanity check for finding missing references, and it is also a nice environment for me to query using Datalog (https://docs.datomic.com/cloud/query/query-data-reference.html). Datalog is very similar to SPARQL, though FWIW I am just getting started learning SPARQL.

Once you have the schemas, you can transact any instances that use those properties and classes into the database, whether it runs in the cloud or locally. (I use Datomic on AWS; you basically just change the config file from "local dev" mode to "cloud" mode, which I find very cool as a solo dev.)

The Clojure data I get from "datafying" an ident (pulling all RDF triples related to a named resource into a map) is what I embed for semantic search using "text-embedding-ada-002". Each vector is stored in Qdrant with a payload containing a label derived from its :db/ident. For example, :mknd/MekaThing could be embedded and queried to find similar things in other vocabularies outside of Mekanoid.
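The "datafy, then embed" step can be pictured roughly as follows. This Python sketch is only an approximation of the idea, not my Clojure implementation: it groups triples by subject into a map and renders that map as the text that would be sent to the embedding model. The triples, the function names, and the use of the ident itself as the payload label are assumptions; the actual embedding call and the Qdrant upload are left out.

```python
from collections import defaultdict


def datafy(ident, triples):
    """Collect every (predicate, object) pair whose subject is `ident` into a map."""
    entity = defaultdict(list)
    for subject, predicate, obj in triples:
        if subject == ident:
            entity[predicate].append(obj)
    return {":db/ident": ident, **entity}


def embedding_record(ident, triples):
    """Build the text to embed plus the payload stored next to the vector."""
    entity = datafy(ident, triples)
    # Render keys in sorted order so the same entity always yields the same text.
    text = "\n".join(f"{key} {entity[key]}" for key in sorted(entity))
    # Assumption: the payload label is simply the ident.
    return {"text": text, "payload": {"label": ident}}


# Hypothetical triples, written with keyword-style names for readability.
triples = [
    (":mknd/MekaThing", ":rdf/type", ":owl/Class"),
    (":mknd/MekaThing", ":rdfs/label", "MekaThing"),
    (":ex/Other", ":rdfs/label", "Something else"),
]

record = embedding_record(":mknd/MekaThing", triples)
print(record["text"])
print(record["payload"])
```

Keeping the label in the payload means a search hit can be mapped straight back to the named resource without a second lookup.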

I curate different collections in Qdrant for myself. Most are RDF models themselves, which I can then search. For example, say I want a property for "a name for something". Is there something out there? Search the vocab in Qdrant... Oh, schema:name, foaf:name, sioc:name, doap:name, etc. For a more entertaining example, I also embed entire datasets (like all legal Magic: The Gathering cards up until the recent LOTR expansion, https://github.com/aamedina/mtg) and can do semantic recommendations and search over them. STIX data from MITRE (https://github.com/mitre-attack/attack-stix-data) also performs well alongside D3FEND embeddings thanks to the ATT&CK technique references. (There is also a STIX ontology.)
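The vocabulary search described above boils down to ranking stored vectors by similarity to a query vector. The toy Python sketch below does that with plain cosine similarity in memory; it does not use the Qdrant client, and the three-dimensional vectors are invented stand-ins, not real "text-embedding-ada-002" embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vector, points, limit=3):
    """Return the `limit` best (score, label) pairs, highest score first."""
    scored = [
        (cosine(query_vector, point["vector"]), point["payload"]["label"])
        for point in points
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Made-up vectors: the first axis loosely stands for "is a name-like property".
points = [
    {"vector": [0.9, 0.1, 0.0], "payload": {"label": ":schema/name"}},
    {"vector": [0.8, 0.2, 0.1], "payload": {"label": ":foaf/name"}},
    {"vector": [0.0, 0.1, 0.9], "payload": {"label": ":schema/price"}},
]

# Stand-in for the embedding of "a name for something".
query = [1.0, 0.0, 0.0]

for score, label in search(query, points, limit=2):
    print(f"{label}  {score:.3f}")
```

A vector database does the same ranking with an approximate index so it scales to whole vocabularies and datasets.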

I hope some of this makes sense to someone else!