Duplicate entities in graph. Link using @id instead
jose-gomez-evinex opened this issue · 8 comments
I guess this is a question instead of an issue. Is there any way to handle this automatically?
For example, I'm adding two entities to my graph (BlogPosting and Organization):
$graph->add( $post, 'my_blog_post' );
$graph->add( $organization, 'my_organization' );
$graph->blogPosting( 'my_blog_post' )
->author($graph->organization('my_organization'))
->publisher($graph->organization('my_organization'));
The output generates three instances of the Organization entity: one at the top level of the graph and two as properties of BlogPosting. I guess the desired output would be one Organization entity at the top level and two @id references to it as properties:
{
"@type" : "BlogPosting",
"@id" : "https://mysite.com/my_post_title/#blog_posting",
"url" : "https://mysite.com/my_post_title/",
"author" : { "@id":"https://mysite.com/#organization" },
"publisher" : { "@id":"https://mysite.com/#organization" }
}
Am I missing something?
Thanks for your excellent project and support!
Hey,
you aren't missing anything, and you're 100% right. Right now we don't support auto-linking, as there's really too much going on for us to decide which entity should be linked and which one is the original.
One idea I have would be a method so the user can decide which instance is only linked.
$graph = new Graph();
$graph->blogPosting( 'my_blog_post' )
->identifier('https://mysite.com/my_post_title/#blog_posting');
$graph->organization('my_organization')
->identifier('https://mysite.com/#organization');
$graph->blogPosting( 'my_blog_post' )
->author($graph->organization('my_organization')->linked())
->publisher($graph->organization('my_organization')->linked());
I will have to read a bit deeper into what is needed for a linked instance, but the idea would be a new class LinkedType which receives the instance it has to link (Organization, for example) and on serialization renders only the @id instead of the whole object.
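A minimal, self-contained sketch of that idea follows. It does not use the library's real classes: SimpleType, its set()/get() methods, and this LinkedType are hypothetical stand-ins, and the name value is made up for illustration. The point is only that a wrapper can hold the original instance and emit nothing but the @id when serialized.

```php
<?php

// Hypothetical stand-in for a schema.org type; NOT the library's BaseType.
final class SimpleType implements JsonSerializable
{
    /** @var array<string, mixed> */
    private $properties = [];

    /** @var string */
    private $type;

    public function __construct(string $type)
    {
        $this->type = $type;
    }

    public function set(string $name, $value): self
    {
        $this->properties[$name] = $value;

        return $this;
    }

    public function get(string $name)
    {
        return $this->properties[$name] ?? null;
    }

    public function jsonSerialize(): array
    {
        $out = ['@type' => $this->type];

        foreach ($this->properties as $name => $value) {
            // Render the identifier property as the JSON-LD @id keyword.
            $out[$name === 'identifier' ? '@id' : $name] = $value;
        }

        return $out;
    }
}

// Hypothetical wrapper: keeps the full instance, serializes only its @id.
final class LinkedType implements JsonSerializable
{
    /** @var SimpleType */
    private $target;

    public function __construct(SimpleType $target)
    {
        $this->target = $target;
    }

    public function jsonSerialize(): array
    {
        return ['@id' => $this->target->get('identifier')];
    }
}

$organization = (new SimpleType('Organization'))
    ->set('identifier', 'https://mysite.com/#organization')
    ->set('name', 'Example Org'); // illustrative value only

$post = (new SimpleType('BlogPosting'))
    ->set('identifier', 'https://mysite.com/my_post_title/#blog_posting')
    ->set('author', new LinkedType($organization))
    ->set('publisher', new LinkedType($organization));

// author and publisher each serialize to {"@id":"https://mysite.com/#organization"}
$json = json_encode($post, JSON_UNESCAPED_SLASHES);
```

The Organization's other properties (here, name) never appear inside the BlogPosting; they would only be emitted where the full Organization object itself is serialized.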
We will have to check for naming and so on to prevent conflicts. But would this help you? If so: I will happily accept a PR doing this. I also think that it should be integrated on type level and not on Graph level as you can also link entities in normal types and not only in graphs.
So this should also work:
$organization = Schema::organization()->identifier('https://mysite.com/#organization');
Schema::blogPosting( 'my_blog_post' )
->author($organization->linked())
->publisher($organization->linked());
Thank you for the quick reply.
This definitely could work for me, and I guess for most developers. It's super simple and straightforward!
I agree with you on having this done at type level.
I can help and I'll be happy to, but I'm afraid that this might be out of my league. If I could get some guidance and/or examples I could give it a try, I have some available time this week.
I can try doing it on my own if you don't mind a non-perfect and probably not-looking-good PR.
Hello again,
I think I have a working alternative. At least it's working on my end. Rather than creating a new Type/Class I've tweaked a little bit the BaseType Class:
abstract class BaseType implements Type, ArrayAccess, JsonSerializable
{
/** @var array */
protected $properties = [];
protected bool $linked = false;
I've added a tiny var that indicates if that object is meant to be linked or not.
public function toArray(): array
{
$this->serializeIdentifier();
$properties = $this->serializeProperty($this->getProperties());
return [ '@context' => $this->getContext() ]
+ ($this->linked ? [] : ['@type' => $this->getType()])
+ $properties;
}
I've tweaked the toArray() method a little so that it only includes the @type property if the object is not linked.
public function setLinked(bool $linked)
{
$this->linked = $linked;
}
public function linked()
{
$class = get_class($this);
$linkedType = new $class();
$linkedType->identifier($this->getProperty('identifier'));
$linkedType->setLinked(true);
return $linkedType;
}
}
Finally, I've included a setter method for the bool variable and your proposed function, which returns a copy of the current object with two differences:
- The copy has only one property (identifier).
- The copy has the linked variable set to true.
I know this probably is not the prettiest way to handle this but let me know if this could work for you guys and I can send a PR.
Thanks!
This is my working code:
// Graph
$graph->add($webpage, 'my_webpage');
$graph->add($organization, 'my_organization');
$graph->organization('my_organization')
->mainEntityOfPage($webpage->linked())
->subjectOf($webpage->linked());
First of all, thanks for your working example!
Some thoughts:
The benefit of a new class LinkedType, which holds a reference to the original type, is that the original type doesn't have to be finished at the moment linked() is called.
$graph = new Graph();
$graph->organization();
$graph->blogPosting()->author($graph->organization()->linked());
$graph->organization()->identifier('my-organization');
The new class will also allow us to encapsulate the logic in a dedicated class instead of doing it in the base type.
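To make the timing argument concrete, here is a small self-contained sketch that contrasts the two approaches. Entity, snapshotLink() and LazyLink are hypothetical names invented for this illustration and are not part of the library: the first approach copies the identifier when the link is created, the second reads it only when the link is serialized.

```php
<?php

// Hypothetical minimal entity that only carries an identifier.
final class Entity
{
    /** @var string|null */
    public $identifier = null;
}

// Snapshot approach: copies the identifier at the moment of the call.
function snapshotLink(Entity $entity): array
{
    return ['@id' => $entity->identifier];
}

// Reference approach: keeps the entity and reads the identifier on serialization.
final class LazyLink implements JsonSerializable
{
    /** @var Entity */
    private $entity;

    public function __construct(Entity $entity)
    {
        $this->entity = $entity;
    }

    public function jsonSerialize(): array
    {
        return ['@id' => $this->entity->identifier];
    }
}

$organization = new Entity();

// Both links are created before the identifier is known.
$snapshot = snapshotLink($organization);
$lazy = new LazyLink($organization);

// The identifier is set afterwards, as in the Graph example above.
$organization->identifier = 'my-organization';

$snapshotJson = json_encode($snapshot); // {"@id":null}
$lazyJson = json_encode($lazy);         // {"@id":"my-organization"}
```

Only the wrapper that holds a reference picks up the identifier that was assigned after the link was created.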
The only disadvantage I see is that instanceof checks don't work. The IDE may therefore complain, as linked() will always return a LinkedType, but this can probably be solved by adding a doc tag like @return LinkedType|self.
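A short sketch of that trade-off, using simplified stand-in classes rather than the library's generated types (this Organization and LinkedType are assumptions made for illustration): the wrapper fails an instanceof check against the wrapped class, while the union doc tag is only a hint for IDEs and static analysers and has no runtime effect.

```php
<?php

// Hypothetical wrapper that stores the instance it links to.
final class LinkedType
{
    /** @var object */
    private $type;

    public function __construct(object $type)
    {
        $this->type = $type;
    }
}

// Hypothetical stand-in for a generated schema.org type.
class Organization
{
    /**
     * @return LinkedType|self Union tag so IDEs keep offering Organization methods.
     */
    public function linked()
    {
        return new LinkedType($this);
    }
}

$organization = new Organization();
$link = $organization->linked();

$isOrganization = $link instanceof Organization; // false: the wrapper is not an Organization
$isLinkedType = $link instanceof LinkedType;     // true
```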
The new class/type should be something like:
class LinkedType implements Type
{
/** @var Type */
protected $type;
public function __construct($type)
{
$this->type = $type;
}
}
And implementing the methods required by the interface.
One piece of advice: please add the class to https://github.com/spatie/schema-org/tree/master/generator/templates/static and use composer generate to publish the file. Also, we can't accept a PR without a unit test/test case.
Agree!
Thanks for the info. I would appreciate it if someone more experienced could take this one. If not, I could give it a try next week. PHP is certainly not my expertise.
@jose-gomez-evinex can you check the PR.
https://github.com/spatie/schema-org/pull/155/files#diff-c10a6952ecaac53b8fd72acae582248c919bad0a44732be50aac2120be1d6fc9R215
I've changed the naming, as the official word seems to be reference instead of link.
Looks great! Thanks!