ruby-rdf/rdf

Query not returning expected value

typhoon2099 opened this issue · 4 comments

Given the following html:

<div itemscope="" itemType="http://schema.org/Product">
  <meta itemProp="name" content="Product Name"/>
</div>

and the following Query:

RDF::Query.new do
  pattern [:product, RDF.type, RDF::URI("http://schema.org/Product")]
  pattern [:product, RDF::URI("http://schema.org/name"), :name]
  pattern [:product, RDF::URI("http://schema.org/offers"), :offer], optional: true

  pattern [:offer, RDF.type, RDF::URI("http://schema.org/Offer")], optional: true
  pattern [:offer, RDF::URI("http://schema.org/offeredBy"), :seller], optional: true

  pattern [:seller, RDF.type, RDF::URI("http://schema.org/Organisation")], optional: true
  pattern [:seller, RDF::URI("http://schema.org/name"), :seller_name], optional: true
end

I am given the following solution:

{
  "product": "_:g47363778091060",
  "name": "Product Name",
  "seller": "_:g47363778091060",
  "seller_name": "Product Name"
}

I would expect both seller and seller_name to not be present in the returned solution.

I'll need to look into this further over the weekend.

Have you been able to reproduce this issue?

Sorry, other commitments, along with dealing with power outages from fires, kept me from this.

I can reproduce the issue and, looking at it, the query is actually doing the right thing. It's important to note that each optional pattern is evaluated on its own: when one optional pattern fails to bind, that has no effect on whether another one matches.

In your case, the non-optional patterns bind as follows:

{
  product: _:bn,
  name: "Product Name"
}

Then, each subsequent (optional) pattern is run against this.

  • ?offer rdf:type schema:Offer has no solutions and does not add anything to the solution.
  • ?offer schema:offeredBy ?seller also has no solutions.
  • ?seller rdf:type schema:Organization has no solutions.
  • ?seller schema:name ?seller_name does have a solution, the same as for the product, so it is added to the solution set, giving the following:
{
  product: _:bn,
  name: "Product Name",
  seller: _:bn,
  seller_name: "Product Name"
}

Clearly, you would expect that ?seller wouldn't bind, as it should be of type schema:Organization, but these are independent patterns.
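
To make the pattern-by-pattern evaluation concrete, here is a minimal sketch in plain Ruby that mimics it without the rdf gem. It is a toy model, not RDF.rb's implementation: the TRIPLES array, the prefixed strings such as "schema:name", and the helpers unify and apply are all hypothetical names made up for this illustration. Symbols play the role of variables.

```ruby
# Toy triple store: the single product node from the HTML above.
TRIPLES = [
  ["_:bn", "rdf:type",    "schema:Product"],
  ["_:bn", "schema:name", "Product Name"]
].freeze

# Try to extend +solution+ so that +pattern+ matches +triple+.
# Symbols are variables; any other term must match exactly.
# Returns the extended solution, or nil when the triple does not fit.
def unify(pattern, triple, solution)
  pattern.zip(triple).reduce(solution) do |bound, (term, value)|
    if term.is_a?(Symbol)
      return nil if bound.key?(term) && bound[term] != value
      bound.merge(term => value)
    else
      return nil unless term == value
      bound
    end
  end
end

# Apply one pattern to every current solution. An optional pattern keeps a
# solution unchanged when nothing matches; a required pattern drops it.
def apply(pattern, solutions, optional: false)
  solutions.flat_map do |solution|
    matches = TRIPLES.map { |triple| unify(pattern, triple, solution) }.compact
    matches.empty? && optional ? [solution] : matches
  end
end

# The seven patterns from the question, each paired with its optional flag.
PATTERNS = [
  [[:product, "rdf:type",         "schema:Product"],      false],
  [[:product, "schema:name",      :name],                 false],
  [[:product, "schema:offers",    :offer],                true],
  [[:offer,   "rdf:type",         "schema:Offer"],        true],
  [[:offer,   "schema:offeredBy", :seller],               true],
  [[:seller,  "rdf:type",         "schema:Organization"], true],
  [[:seller,  "schema:name",      :seller_name],          true]
].freeze

# Start from one empty solution and fold each pattern over it in order.
result = PATTERNS.reduce([{}]) do |solutions, (pattern, optional)|
  apply(pattern, solutions, optional: optional)
end

p result
```

In this model the last pattern is the only optional one that matches. Because :seller is still unbound when it runs, it binds :seller to the product node and :seller_name to the product's name, which is the same surprising solution reported in the question.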

What you really want is to do a left-join of different BGP queries, which is how SPARQL does it. In RDF.rb, you could do this as follows:

query1 = RDF::Query.new do
  pattern [:product, RDF.type, RDF::URI("http://schema.org/Product")]
  pattern [:product, RDF::URI("http://schema.org/name"), :name]
  pattern [:product, RDF::URI("http://schema.org/offers"), :offer], optional: true
end

query2 = RDF::Query.new do
  pattern [:offer, RDF.type, RDF::URI("http://schema.org/Offer")], optional: true
  pattern [:offer, RDF::URI("http://schema.org/offeredBy"), :seller], optional: true
end

query3 = RDF::Query.new do
  pattern [:seller, RDF.type, RDF::URI("http://schema.org/Organization")], optional: true
  pattern [:seller, RDF::URI("http://schema.org/name"), :seller_name], optional: true
end

solutions1 = query1.execute(graph)
solutions2 = query2.execute(graph)
solutions3 = query3.execute(graph)

solutions12 = RDF::Query::Solutions()
solutions1.each do |s1|
  solutions2.each do |s2|
    solutions12 << s2.merge(s1) if s1.compatible?(s2)
  end
end

solutions = RDF::Query::Solutions()
solutions12.each do |s12|
  solutions3.each do |s3|
    solutions << s3.merge(s12) if s12.compatible?(s3)
  end
end

This is basically what SPARQL::Algebra::Operator::LeftJoin does. But, really, you're better off using the SPARQL gem for complex queries.
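
For reference, here is a minimal sketch of left-join semantics over solution sets, using plain Ruby hashes in place of RDF::Query::Solution objects so that it runs without any gems. The helper names compatible? and left_join are hypothetical and only illustrate the idea; this is not the code of SPARQL::Algebra::Operator::LeftJoin. The sketch assumes the right-hand sides come from groups whose patterns are required, so they yield no solutions for the sample data.

```ruby
# Two solutions are compatible when every variable they share has the same value.
def compatible?(a, b)
  (a.keys & b.keys).all? { |var| a[var] == b[var] }
end

# Left join: merge each left solution with every compatible right solution,
# and keep the left solution unchanged when no right solution is compatible.
def left_join(left, right)
  left.flat_map do |l|
    merged = right.select { |r| compatible?(l, r) }.map { |r| l.merge(r) }
    merged.empty? ? [l] : merged
  end
end

# Solutions of the product group for the sample data.
products = [{ product: "_:bn", name: "Product Name" }]

# Assumed results of the offer and seller groups when their patterns are
# required: nothing in the sample data is an Offer or an Organization.
offers  = []
sellers = []

solutions = left_join(left_join(products, offers), sellers)
p solutions

# Illustrative data only: a seller that shares no conflicting variable is merged in.
with_seller = left_join(
  [{ product: "_:p", offer: "_:o", seller: "_:s" }],
  [{ seller: "_:s", seller_name: "ACME" }, { seller: "_:other", seller_name: "Nope" }]
)
p with_seller
```

With the sample data the product solution passes through both joins unchanged, so neither seller nor seller_name is bound. In the second call only the right-hand solution that agrees on seller is merged.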

Right, I think I'm getting it now. Based on what you've told me, I'll have to look into SPARQL. I've been avoiding adding gems to our project unless necessary, as performance and memory efficiency are a concern, but it looks like it will save a lot of headaches.

Thanks!