semarglproject/semargl

RDFa Property Copying support

levkhomich opened this issue · 7 comments

At the moment this feature is still in the editor's draft, so there is no hurry here.

scor commented

FYI, this editor's draft will become "Last Call" on January 31st, for a period of three weeks. You're welcome to provide feedback on this new feature if you like (during its implementation).

Thank you for the information.

This feature looks good to me and has many use cases. The only thing affecting streaming parsers is the possibility that rdfa:copy appears before the corresponding rdfa:Pattern. This can greatly increase memory complexity (it's currently O(h), where h is the height of the XML tree).

scor commented

Thanks for this quick feedback! Pinging @gkellogg, @niklasl and @msporny, who might be interested in this feedback.

I outlined an implementation for streaming processors to @manu a bit ago, he may be able to repeat that. But, essentially, if you keep all triples associated with an rdfa:Pattern, along with the rdfa:copy triples, until the end of processing, you should have a fairly small amount of memory to iterate over after parsing is complete.
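A minimal Python sketch of the deferred approach described above, for illustration only. The class name `DeferredCopySink`, its methods, and the prefixed-string representation of terms are hypothetical and not taken from semargl or librdfa. It also assumes that a pattern's `rdf:type rdfa:Pattern` triple arrives before the pattern's other triples, and it does not handle patterns that copy other patterns.

```python
from collections import defaultdict

# Terms are plain prefixed strings here purely for readability (assumption).
RDF_TYPE = "rdf:type"
RDFA_COPY = "rdfa:copy"
RDFA_PATTERN = "rdfa:Pattern"


class DeferredCopySink:
    """Hypothetical triple sink that buffers only copy-related triples."""

    def __init__(self, emit):
        self.emit = emit                          # downstream callback (s, p, o)
        self.patterns = set()                     # subjects typed rdfa:Pattern
        self.pattern_triples = defaultdict(list)  # pattern -> [(p, o), ...]
        self.copies = []                          # (subject, target) pairs

    def triple(self, s, p, o):
        if p == RDF_TYPE and o == RDFA_PATTERN:
            self.patterns.add(s)
        elif p == RDFA_COPY:
            self.copies.append((s, o))
        elif s in self.patterns:
            # Held back until the end of the document.
            self.pattern_triples[s].append((p, o))
        else:
            # Everything else streams straight through.
            self.emit(s, p, o)

    def end_document(self):
        # Resolve the buffered rdfa:copy references once parsing is complete.
        for subject, target in self.copies:
            if target in self.patterns:
                for p, o in self.pattern_triples[target]:
                    self.emit(subject, p, o)
            else:
                # Target never turned out to be a pattern: pass it through.
                self.emit(subject, RDFA_COPY, target)
```

In this sketch the extra memory is proportional to the number of rdfa:copy triples plus the triples attached to patterns, not to the whole document.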

The rdfa:copy feature /could/ significantly increase memory complexity in some corner-case documents, but in most cases it won't result in too much more memory overhead and it is implementable in a streaming fashion. I plan to implement the feature in librdfa (a SAX-based streaming RDFa processor) whenever I get the time to do so.

No doubt this feature can be easily implemented.

I only meant that memory complexity will be a function of the list size (tending toward the document size as the list grows) whenever an HTML page contains a list of elements (the proposed use case) whose rdfa:copy refers to an rdfa:Pattern defined later. The benefits surely outweigh the overhead in such corner cases.

Also, step 1 of paragraph 3.5.1 may need some explanation for cases of circular dependencies in rdfa:copy / rdfa:Pattern chains.
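To illustrate the circular case, here is one possible reading of the copying step, sketched in Python as a fixpoint over a set of triples. This is an assumption about how the rule could be applied, not a restatement of the spec text. The function name `expand_copies` and the prefixed-string terms are hypothetical, and the sketch copies every triple of the target (including its type triple) without performing any later cleanup step.

```python
# Terms are plain prefixed strings here purely for readability (assumption).
RDF_TYPE = "rdf:type"
RDFA_COPY = "rdfa:copy"
RDFA_PATTERN = "rdfa:Pattern"


def expand_copies(triples):
    """Apply the copy rule repeatedly until no new triple is added."""
    graph = set(triples)
    changed = True
    while changed:
        changed = False
        patterns = {s for s, p, o in graph
                    if p == RDF_TYPE and o == RDFA_PATTERN}
        copies = [(s, o) for s, p, o in graph
                  if p == RDFA_COPY and o in patterns]
        for subject, target in copies:
            # Snapshot the graph so it can be extended while iterating.
            for s, p, o in list(graph):
                if s == target and (subject, p, o) not in graph:
                    graph.add((subject, p, o))
                    changed = True
    return graph
```

Because the graph is a set that only grows and the number of possible triples over the existing terms is finite, the loop terminates even when two patterns copy each other.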

Implemented in c063c81