neosemantics

Installation

You can either download a prebuilt jar from the releases area or build it from source. If you prefer to build it yourself, check the note on build below.

Copy the jar(s) into the <NEO_HOME>/plugins directory of your Neo4j instance. (Note: if you're going to use the JSON-LD serialisation format for RDF, you'll also need to include APOC.)
Add the following line to your <NEO_HOME>/conf/neo4j.conf:

dbms.unmanaged_extension_classes=semantics.extension=/rdf

Restart the server.
Check that the installation went well by running call dbms.procedures(). The list of procedures should include the ones documented below. You can check that the extension is mounted by running :GET /rdf/ping
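
To narrow down that check, the procedure list can be filtered by name. This is a minimal sketch that assumes a Neo4j 3.x instance, where dbms.procedures() returns name and signature columns. The 'semantics' prefix is taken from the procedure names documented below.

```cypher
// List only the procedures installed by the plugin
// (assumption: all of them are registered under the "semantics" prefix)
CALL dbms.procedures() YIELD name, signature
WHERE name STARTS WITH 'semantics'
RETURN name, signature
ORDER BY name
```

An empty result most likely means the jar was not picked up from the plugins directory or the server was not restarted.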

Note on build

Running

mvn clean package

produces two jars:

A neosemantics-[...].jar, which bundles all the dependencies.
An original-neosemantics-[...].jar, which contains only the neosemantics code. Choose this one if you want to keep the third-party jars separate. In that case you will have to add all third-party dependencies yourself (look at the pom.xml).

What's in this repository

This repository contains a set of stored procedures, user-defined functions and extensions to integrate with RDF from Neo4j.

Stored Procedures and UDFs for RDF Parsing/Previewing/Ingesting

Stored Proc Name	params	Description and example usage
semantics.importRDF	URL of the dataset; serialization format(*); map with zero or more params (see table below)	Imports into Neo4j all the triples in the dataset according to the mapping defined in this post. Note that before running the import procedure, an index needs to be created on the uri property of Resource nodes. Just run `CREATE INDEX ON :Resource(uri)` on your Neo4j DB. Examples: (1) CALL semantics.importRDF("file:///.../myfile.ttl","Turtle", { shortenUrls: false, typesToLabels: true, commitSize: 9000 }) (2) CALL semantics.importRDF("http:///.../donnees.rdf","RDF/XML", { languageFilter: 'fr', commitSize: 5000 , nodeCacheSize: 250000})
semantics.previewRDF	URL of the dataset; serialization format(*); map with zero or more params (see table below)	Parses some RDF and produces a preview in the Neo4j browser. Takes the same parameters as the data import except for periodic commit, since no data is written to the DB. Note that this is only adequate for a preliminary visual analysis of a SMALL dataset. Think about how many nodes you want rendered in your browser. Example: CALL semantics.previewRDF("https://.../clapton.n3","Turtle", {})
semantics.streamRDF	URL of the dataset; serialization format(*); map with zero or more params (see table below)	Parses some RDF and streams the triples as records of the form subject, predicate, object, plus three additional fields: `isLiteral`, a boolean indicating whether the object of the statement is a literal; `literalType`, the datatype of the literal value, if available; and `literalLang`, the language of the literal, if available. This procedure is useful when you want to import fragments of an RDF dataset into your Neo4j graph in a custom way. Example: CALL semantics.streamRDF("https://.../clapton.n3","Turtle", {})
semantics.previewRDFSnippet	An RDF snippet; serialization format(*); map with zero or more params (see table below)	Identical to previewRDF but takes an RDF snippet instead of the URL of the dataset. Again, only adequate for a preliminary visual analysis of a SMALL dataset. Think about how many nodes you want rendered in your browser :) Example: CALL semantics.previewRDFSnippet('[{"@id": "http://indiv#9132", "@type": ... }]', "JSON-LD", { languageFilter: 'en'})
semantics.liteOntoImport	URL of the dataset; serialization format(*)	Imports the basic elements of an OWL or RDFS ontology, i.e. Classes, Properties, Domains, Ranges. Extended description here. Example: CALL semantics.liteOntoImport("http://.../myonto.trig","TriG")
semantics.getIRILocalName	[function] IRI string	Returns the local part of the IRI (stripping out the namespace) Example: RETURN semantics.getIRILocalName('http://schema.org/Person')
semantics.getIRINamespace	[function] IRI string	Returns the namespace part of the IRI (stripping out the local part) Example: RETURN semantics.getIRINamespace('http://schema.org/Person')

(*) Valid formats: Turtle, N-Triples, JSON-LD, TriG, RDF/XML
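
As an illustration of the custom processing that semantics.streamRDF allows, the sketch below filters the streamed triples before anything is written to the graph. It assumes that the output columns are named exactly as the fields listed in the table above (subject, predicate, object, isLiteral, literalType, literalLang) and that predicate is returned as an IRI string. The dataset URL is the elided one from the table example.

```cypher
// Keep only literal values tagged as English and show the local name
// of each predicate instead of its full IRI. Nothing is written to the DB.
CALL semantics.streamRDF("https://.../clapton.n3", "Turtle", {})
YIELD subject, predicate, object, isLiteral, literalType, literalLang
WHERE isLiteral AND literalLang = 'en'
RETURN subject,
       semantics.getIRILocalName(predicate) AS property,
       object,
       literalType
LIMIT 25
```

The RETURN clause could be replaced with MERGE or SET clauses to load just the selected fragment in whatever shape suits your model.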

Param	values(default)	Description
shortenUrls	boolean (true)	when set to true, full urls are shortened using generated prefixes for property names, relationship names and labels
typesToLabels	boolean (true)	when set to true, rdf:type statements are imported as node labels in Neo4j
languageFilter	['en','fr','es',...]	when set, only literal properties with this language tag (or untagged ones) are imported
headerParams	map {}	parameters to be passed in the HTTP GET request. Example: { authorization: 'Basic user:pwd', Accept: 'application/rdf+xml'}
commitSize	integer (25000)	commit a partial transaction every n triples
nodeCacheSize	integer (10000)	keep n nodes in cache to minimize reads from DB

Note on namespace prefixes

If shortenUrls is set to true, prefixes are used to shorten property names, relationship names and labels. You don't need to define your own namespace prefixes: some of the most popular ones are predefined for you (rdf, rdfs, owl, skos, sch, org), and for any other namespace used in the imported dataset the loader automatically generates prefixes of the form ns0, ns1, etc. You can also define your own set of prefixes. To do so, create (or merge, depending on whether it already exists) a NamespacePrefixDefinition node before you load the RDF data, and the loader will use it:

// create the prefix mapping 
CREATE (:NamespacePrefixDefinition {
  `http://www.example.com/ontology/1.0.0#`: 'ex',
  `http://www.w3.org/1999/02/22-rdf-syntax-ns#`: 'rdf'})
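
If a NamespacePrefixDefinition node may already exist, a MERGE-based variant avoids creating a second one. This is a minimal sketch that assumes a single definition node per database and reuses the example namespace and prefix from the statement above.

```cypher
// reuse the existing prefix definition node, or create it if it is missing
MERGE (n:NamespacePrefixDefinition)
// add (or overwrite) the prefix for the example namespace
SET n.`http://www.example.com/ontology/1.0.0#` = 'ex'
RETURN n
```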

Stored Procedures for Schema (Ontology) Mapping

Stored Proc Name	params	Description and example usage
semantics.mapping.addSchema	URL of the schema/vocabulary/ontology prefix to be used in serialisations	Creates a reference to a vocabulary. Needed to define mappings. Examples: call semantics.mapping.addSchema("http://schema.org/","sch")
semantics.mapping.dropSchema	URL of the schema/vocabulary/ontology	Deletes a vocabulary reference and all associated mappings. Examples: call semantics.mapping.dropSchema("http://schema.org/")
semantics.mapping.listSchemas	[optional] search string to list only schemas containing the search string in their uri or in the associated prefix	Returns all vocabulary references. Examples: (1) call semantics.mapping.listSchemas() (2) call semantics.mapping.listSchemas('schema') (3) Combining list and drop to delete a set of schemas by name: CALL semantics.mapping.listSchemas("fibo") YIELD node AS schemaDef WITH schemaDef, schemaDef._ns AS schname CALL semantics.mapping.dropSchema(schemaDef._ns) YIELD output RETURN schname, output
semantics.mapping.addCommonSchemas		Creates references to a number of popular vocabularies including schema.org, Dublin Core, SKOS, OWL, etc. Examples: call semantics.mapping.addCommonSchemas()
semantics.mapping.addMappingToSchema	The mapping reference node (can be retrieved by addSchema or listSchemas); the Neo4j DB schema element, which can be a label, property key or relationship type; the local name of the element in the selected schema (Class name, DataTypeProperty name or ObjectProperty name)	Creates a mapping for an element in the Neo4j DB schema to a vocabulary element. Examples: Getting a schema reference using listSchemas and creating a mapping for it: call semantics.mapping.listSchemas("http://schema.org") yield node as sch call semantics.mapping.addMappingToSchema(sch,"Movie","Movie") yield node as mapping return mapping
semantics.mapping.dropMapping	mapped DB element name to remove the mapping for	Returns an output text message indicating success/failure of the deletion. Examples: call semantics.mapping.dropMapping("Person")
semantics.mapping.listMappings	[optional] search string to list only mappings containing the search string in the DB element name	Returns a list with all the mappings. Examples: call semantics.mapping.listMappings()
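
The procedures above can be chained to register a vocabulary and map an element to it in a single statement. This is a sketch under one assumption: that semantics.mapping.addSchema returns the schema reference in a column called node, as listSchemas does in the examples above. The table only states that the reference "can be retrieved by addSchema", so check the procedure signature before relying on it.

```cypher
// register schema.org under the prefix "sch"
// (assumption: addSchema yields the reference as `node`, like listSchemas)
CALL semantics.mapping.addSchema("http://schema.org/", "sch") YIELD node AS sch
// map the Neo4j label "Movie" to the schema.org class "Movie"
CALL semantics.mapping.addMappingToSchema(sch, "Movie", "Movie") YIELD node AS mapping
RETURN sch, mapping
```

Running call semantics.mapping.listMappings() afterwards should show the new mapping.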

Extensions

Extension	params	Description and example usage
/rdf/describe/id	nodeid: the id of a node; excludeContext: (optional) if present, output will not include connected nodes, just the selected one.	Produces an RDF serialization of the selected node. The format will be determined by the accept parameter in the header. Default is JSON-LD. Example: :GET /rdf/describe/id?nodeid=0&excludeContext
/rdf/describe/uri	nodeuri: the uri of a node; excludeContext: (optional) if present, output will not include connected nodes, just the selected one.	Produces an RDF serialization of the selected node. It works on a model either imported from an RDF dataset via semantics.importRDF or built in a way that nodes are labeled as :Resource and have a uri property. This property is the one used by this extension to look up a node. [NOTE: URIs should be urlencoded. It's normally not a problem unless there are hash signs in it (escape them in the Neo4j browser with %23)] Example: :GET /rdf/describe/uri?nodeuri=http://dataset.com#id_1234
/rdf/cypher	JSON map with the following keys: cypher: the cypher query to run; showOnlyMapped: (optional, default is false) if present, output will exclude unmapped elements (labels, attributes, relationships)	Produces an RDF serialization of the nodes and relationships returned by the query. Example: :POST /rdf/cypher { "cypher" : "MATCH (n:Person { name : 'Keanu Reeves'})-[r]-(m:Movie) RETURN n,r,m " , "showOnlyMapped" : true }
/rdf/cypheronrdf	JSON map with the following keys: cypher: the cypher query to run	Produces an RDF serialization of the nodes and relationships returned by the query. It works on a model either imported from an RDF dataset via semantics.importRDF or built in a way that nodes are labeled as :Resource and have a uri property. Example: :POST /rdf/cypheronrdf { "cypher":"MATCH (a:Resource {uri:'http://dataset/indiv#153'})-[r]-(b) RETURN a, r, b"}

Contributing

neosemantics code formatting follows the Google Java Style Guide.

In order to contribute to this project, it is advisable to install the code formatting configuration in your preferred IDE.

Please make sure you format your code before committing changes. Thanks!
