Cross-product of links between instances
namedgraph opened this issue · 18 comments
Issue type: Bug
sources:
  Smth:
    access: input.ndjson
    referenceFormulation: jsonpath
    iterator: "$.key[*]"
mappings:
  Concept:
    sources: Smth
    graph: smth:$(@)
    s: smth:$(@)#this
    po:
      - [ a, skos:Concept ]
      - [ skos:prefLabel, $(@) ]
  Document:
    sources: Smth
    graph: smth:$(@)
    s: smth:$(@)
    po:
      - [ a, foaf:Document ]
      - p: foaf:primaryTopic
        o:
          - mapping: Concept
I am getting a cross-product in the output, i.e. if there are N rows, I'm getting this Document output:
<instance1> foaf:primaryTopic <instance1#this> .
<instance1> foaf:primaryTopic <instance2#this> .
...
<instance1> foaf:primaryTopic <instanceN#this> .
<instance2> foaf:primaryTopic <instance1#this> .
<instance2> foaf:primaryTopic <instance2#this> .
...
<instance2> foaf:primaryTopic <instanceN#this> .
...
<instanceN> foaf:primaryTopic <instance1#this> .
<instanceN> foaf:primaryTopic <instance2#this> .
...
<instanceN> foaf:primaryTopic <instanceN#this> .
Whereas I simply want to "pair" the respective Document and Concept instances:
<instance1> foaf:primaryTopic <instance1#this> .
<instance2> foaf:primaryTopic <instance2#this> .
...
<instanceN> foaf:primaryTopic <instanceN#this> .
Is there a way to express what I need with YARRRML?
Hi @namedgraph, what does your input data look like?
One row looks like this:
{"key":["aaaaaa","bbbbbbbb","cccc","ddddddd"]}
I added sources to the mapping BTW.
It's only possible if there is a unique way to identify a row, so that you only link the rows that are the same. For example, if every row has an index, you can add a condition to your mapping so that it only links rows whose indexes are equal.
I see... I might need to add that.
These are annoying shortcomings IMO. It would not be a problem using XSLT, for example.
But wait... I would need a different iterator then?
No, that is not needed.
But in the mapping I'm using the array item (e.g. "cccc") as the value: $(@)
It's those values I need to compare, not row IDs. If I was to add IDs for those values, I would need to change the whole JSON structure within the array?
No, that's not needed. The following works for me
sources:
  Smth:
    access: data.json
    referenceFormulation: jsonpath
    iterator: "$.key[*]"
mappings:
  Concept:
    sources: Smth
    s: ex:$(@)#this
    po:
      - [ a, skos:Concept ]
      - [ skos:prefLabel, $(@) ]
  Document:
    sources: Smth
    s: ex:$(@)
    po:
      - [ a, foaf:Document ]
      - p: foaf:primaryTopic
        o:
          - mapping: Concept
            condition:
              function: equal
              parameters:
                - [str1, $(@)]
                - [str2, $(@)]
Thanks! Will try.
I considered this, but it wasn't obvious to me how comparing $(@) to $(@) could ever be false?
We compare every element in the array (via Concept) with every element in the array (via Document). So we have
- aaa and aaa: this is what you want, so true.
- aaa and bbb: you don't want this link, so false.
- aaa and ccc: again false.
- bbb and aaa: false.
- bbb and bbb: true, we want to link these.
That part I understand. But doesn't that mean that $(@) in str1 refers to a different value than $(@) in str2?
In str1 we refer to the rows in Document and in str2 we refer to the rows in Concept.
That explains the result, and it will be useful in my case. But my point is that it's counter-intuitive and unusual for the same variable ($(@)) to refer to different values in the same context.
I just tried your suggestion with condition and it doesn't work for me -- the result is the same as without it.
Are you sure you tested with more than one row of JSONL?
Well, it's not the same context actually, but for equal the context has a default if the user doesn't provide one. This is explained here:
But when a condition is used an extra value can be given to a parameter of a function. This is either s or o. s means that the value of the parameter is coming from the subject of the relationship, while o means that the value is coming from the object of the relationship. The default value is s. In this example it would result in relationships between every person and their projects.
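For illustration, the origin of each parameter described above can be spelled out explicitly. This is a sketch only, assuming the third element of a parameter (s or o) is accepted in the form the quoted documentation describes, and that str1 should come from the Document rows (subject side) and str2 from the Concept rows (object side), as stated earlier in this thread:

```yaml
mappings:
  Document:
    sources: Smth
    s: ex:$(@)
    po:
      - [ a, foaf:Document ]
      - p: foaf:primaryTopic
        o:
          - mapping: Concept
            condition:
              function: equal
              parameters:
                # assumption: "s" = value is read from the Document row (subject of the relationship)
                - [str1, $(@), s]
                # assumption: "o" = value is read from the Concept row (object of the relationship)
                - [str2, $(@), o]
```

Written this way, the two $(@) references are visibly evaluated against different rows, which is why the comparison can be false for non-matching pairs.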
Regarding JSONL, by default only standard JSON is supported.
Disregard the JSONL comment...
I managed to reproduce your condition-based results using the rmlio/rmlmapper-java Docker image, but not in the Java code (using be.ugent.rml:rmlmapper:6.1.3)
Turns out it's a bug in our custom executor. Sorry for the noise.