pchampin/sophia_rs

Issue serializing to TriG with `with_pretty` when the dataset contains blank nodes

Closed this issue · 2 comments

Hi, I ran into problems when serializing TriG RDF that contains blank nodes with with_pretty enabled.

I used this code:

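// Imports assume the module layout of sophia 0.8.
use sophia::api::serializer::{QuadSerializer, Stringifier};
use sophia::api::source::QuadSource;
use sophia::inmem::dataset::LightDataset;
use sophia::turtle::parser::trig;
use sophia::turtle::serializer::trig::{TrigConfig, TrigSerializer};

// `rdf` holds the TriG input shown below; `prefixes` is a prefix map defined elsewhere (not shown).
let dataset: LightDataset = trig::parse_str(rdf)
    .collect_quads()
    .expect("Failed to parse RDF");

let trig_config = TrigConfig::new()
    .with_pretty(true)
    .with_prefix_map(&prefixes[..]);
let mut trig_stringifier = TrigSerializer::new_stringifier_with_config(trig_config);

trig_stringifier
    .serialize_dataset(&dataset)
    .expect("Unable to serialize dataset to trig")
    .to_string()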

The input is this TriG document:

@prefix : <http://purl.org/nanopub/temp/mynanopub#> .
@prefix drugbank: <http://identifiers.org/drugbank/> .
@prefix np: <http://www.nanopub.org/nschema#> .
@prefix pav: <http://purl.org/pav/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix infores: <https://w3id.org/biolink/infores/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix orcid: <https://orcid.org/> .
@prefix biolink: <https://w3id.org/biolink/vocab/> .
@prefix pmid: <http://www.ncbi.nlm.nih.gov/pubmed/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix npx: <http://purl.org/nanopub/x/> .

:Head {
    : a np:Nanopublication ;
        np:hasAssertion :assertion;
        np:hasProvenance :provenance;
        np:hasPublicationInfo :pubInfo .
}

:assertion {
    drugbank:DB10771 a biolink:Drug .

    <http://purl.obolibrary.org/obo/OMIM_130000> a biolink:Disease .

    :association rdf:object <http://purl.obolibrary.org/obo/OMIM_130000>;
        rdf:predicate biolink:treats;
        rdf:subject drugbank:DB10771;
        biolink:context [a biolink:Context];
        biolink:target [a biolink:Target];
        biolink:date [a biolink:Date];
        a biolink:ChemicalToDiseaseOrPhenotypicFeatureAssociation .

    [] a biolink:Thing .
    [] a biolink:Date .
    [] a biolink:Context .
    :_1 a biolink:Thing .
    :__2 a biolink:Thing .
}

But when I re-serialize it to TriG with sophia and pretty-printing enabled, I get the following:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX schema: <http://schema.org/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX biolink: <https://w3id.org/biolink/vocab/>
PREFIX np: <http://www.nanopub.org/nschema#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX npx: <http://purl.org/nanopub/x/>
PREFIX nptemp: <http://purl.org/nanopub/temp/mynanopub#>

GRAPH nptemp:Head {
  <http://purl.org/nanopub/temp/mynanopub#> a np:Nanopublication;
    np:hasAssertion nptemp:assertion;
    np:hasProvenance nptemp:provenance;
    np:hasPublicationInfo nptemp:pubInfo.
}

GRAPH nptemp:assertion {
  <http://identifiers.org/drugbank/DB10771> a biolink:Drug.

  <http://purl.obolibrary.org/obo/OMIM_130000> a biolink:Disease.

  nptemp:_1 a biolink:Thing.

  nptemp:__2 a biolink:Thing.

  nptemp:association a biolink:ChemicalToDiseaseOrPhenotypicFeatureAssociation;
    rdf:object <http://purl.obolibrary.org/obo/OMIM_130000>;
    rdf:predicate biolink:treats;
    rdf:subject <http://identifiers.org/drugbank/DB10771>;
    biolink:context [];
    biolink:date [ a biolink:Target];
    biolink:target [ a biolink:Context].

  [ a biolink:Date] a biolink:Thing.

   a biolink:Date.

   a biolink:Context.
}

There are two separate issues:

  • The blank nodes are wrong: their contents appear to be shifted by one position in the pretty-printing step of the serializer (for example, biolink:date ends up with [ a biolink:Target]).
  • The serializer omits the [] for some blank nodes, leaving statements without a subject, cf. a biolink:Context.

When serializing without pretty-printing, the blank nodes come out correctly:

<http://purl.org/nanopub/temp/mynanopub#Head> {
	<http://purl.org/nanopub/temp/mynanopub#> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.nanopub.org/nschema#Nanopublication> ;
		<http://www.nanopub.org/nschema#hasAssertion> <http://purl.org/nanopub/temp/mynanopub#assertion> ;
		<http://www.nanopub.org/nschema#hasProvenance> <http://purl.org/nanopub/temp/mynanopub#provenance> ;
		<http://www.nanopub.org/nschema#hasPublicationInfo> <http://purl.org/nanopub/temp/mynanopub#pubInfo> .
}
<http://purl.org/nanopub/temp/mynanopub#assertion> {
	<http://identifiers.org/drugbank/DB10771> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/biolink/vocab/Drug> .
	<http://purl.obolibrary.org/obo/OMIM_130000> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/biolink/vocab/Disease> .
	<http://purl.org/nanopub/temp/mynanopub#association> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/biolink/vocab/ChemicalToDiseaseOrPhenotypicFeatureAssociation> ;
		<http://www.w3.org/1999/02/22-rdf-syntax-ns#object> <http://purl.obolibrary.org/obo/OMIM_130000> ;
		<http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate> <https://w3id.org/biolink/vocab/treats> ;
		<http://www.w3.org/1999/02/22-rdf-syntax-ns#subject> <http://identifiers.org/drugbank/DB10771> ;
		<https://w3id.org/biolink/vocab/context> _:riog00000001 ;
		<https://w3id.org/biolink/vocab/target> _:riog00000002 ;
		<https://w3id.org/biolink/vocab/date> _:riog00000003 .
	_:riog00000001 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/biolink/vocab/Context> .
	_:riog00000002 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/biolink/vocab/Target> .
	_:riog00000003 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/biolink/vocab/Date> .
	_:riog00000004 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/biolink/vocab/Thing> .
	_:riog00000005 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/biolink/vocab/Date> .
	_:riog00000006 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/biolink/vocab/Context> .
	<http://purl.org/nanopub/temp/mynanopub#_1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/biolink/vocab/Thing> .
	<http://purl.org/nanopub/temp/mynanopub#__2> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/biolink/vocab/Thing> .
}

Narrowing this down to a minimal example, I reproduced the bug with the following dataset:

@prefix : <https://example.org/> .

:s :p :o.

:g {
  _:b :p2 :o2.
}

which (pretty-)serializes back to

<https://example.org/s>
  <https://example.org/p> <https://example.org/o>.

GRAPH <https://example.org/g> {
  
    <https://example.org/p2> <https://example.org/o2>.
}

which is a syntax error (missing subject in the second-to-last line).
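
For anyone who wants to spot this symptom in generated output without a full parser, here is a rough, standard-library-only sketch in Rust. It is a line-based heuristic and not a TriG parser: it assumes pretty-printed output in which each statement ends with `.` at the end of a line, and it flags a statement whose first line starts with the keyword `a` or that is complete with fewer than three terms. The function name `lines_missing_subject` and the `WELL_FORMED_OUTPUT` sample are hypothetical, written by hand for illustration; they are not part of sophia and do not claim to match the output of the fixed serializer.

```rust
/// Pretty-serialized output of the minimal example above (buggy).
const BUGGY_OUTPUT: &str = "\
<https://example.org/s>
  <https://example.org/p> <https://example.org/o>.

GRAPH <https://example.org/g> {

    <https://example.org/p2> <https://example.org/o2>.
}
";

/// Hand-written well-formed variant (hypothetical, for comparison only).
const WELL_FORMED_OUTPUT: &str = "\
<https://example.org/s>
  <https://example.org/p> <https://example.org/o>.

GRAPH <https://example.org/g> {
  _:b
    <https://example.org/p2> <https://example.org/o2>.
}
";

/// Returns the 1-based line numbers where a new statement appears to start
/// without a subject. Heuristic only: literals spanning several lines or
/// several statements on one line are not handled.
fn lines_missing_subject(trig: &str) -> Vec<usize> {
    let mut flagged = Vec::new();
    // True when the next non-empty line is expected to begin a new statement.
    let mut at_statement_start = true;

    for (idx, raw) in trig.lines().enumerate() {
        let line = raw.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Directives and graph delimiters reset the state.
        let upper = line.to_ascii_uppercase();
        let is_directive = ["PREFIX ", "@PREFIX ", "BASE ", "@BASE "]
            .iter()
            .any(|d| upper.starts_with(d));
        if is_directive || line == "}" || line.ends_with('{') {
            at_statement_start = true;
            continue;
        }
        if at_statement_start {
            let first = line.split_whitespace().next().unwrap_or("");
            let term_count = line.split_whitespace().count();
            let complete = line.ends_with('.');
            // `a` can only be a predicate; a complete statement needs at
            // least subject, predicate and object (unless it starts with `[`).
            let too_short = complete && term_count < 3 && !first.starts_with('[');
            if first == "a" || too_short {
                flagged.push(idx + 1);
            }
        }
        // A line ending with `.` closes the current statement.
        at_statement_start = line.ends_with('.');
    }
    flagged
}
```

Calling `lines_missing_subject(BUGGY_OUTPUT)` flags line 6, the second-to-last line mentioned above, while the hand-written well-formed variant yields no findings. A real regression test would rather re-parse the serialized output with a TriG parser.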

Thanks for spotting that bug.
FYI, the fix has been propagated to SoWasm. See example.