Fix backcompat hfeed parsing
gRegorLove opened this issue · 1 comments
Using the example in microformats/microformats2-parsing#11 (comment), running just the hfeed element through the parser seems to incorrectly parse the entry-title and entry-content without an intervening hentry:
<div id="page" class="hfeed site wrap">
  <h1 class="entry-title"><span class='p-name'>title</span></h1>
  other content
  <div class="entry-content">
    <div class="e-content">this is a test for indieweb post </div> <span class="syn-text">Also on:</span>
    <!--syndication links -->
  </div>
</div>
This currently parses as:
{
  "items": [
    {
      "type": [
        "h-feed"
      ],
      "properties": {
        "name": [
          "title"
        ],
        "content": [
          {
            "html": "this is a test for indieweb post ",
            "value": "this is a test for indieweb post"
          }
        ]
      }
    }
  ],
  "rels": {},
  "debug": {
    "package": "https://packagist.org/packages/mf2/mf2",
    "version": "v0.3.2",
    "note": [
      "This output was generated from the php-mf2 library available at https://github.com/indieweb/php-mf2",
      "Please file any issues with the parser at https://github.com/indieweb/php-mf2/issues"
    ]
  }
}
I would expect:
- The e-content to be ignored
- entry-title and entry-content to be ignored
- no implied property parsing
  - http://microformats.org/wiki/microformats2-parsing##Imply+properties+only
  - Imply properties only on explicit h-x class name root microformat element (no backcompat roots)

So properties should be empty in the parsed result.
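The expectation above can be written down as a small check. This is only a sketch in Python (not php-mf2 code): `current_output` is the result quoted above with the debug block left out, `expected_items` is my reading of "properties should be empty", and `unexpected_properties` is a hypothetical helper name.

```python
import json

# The result quoted above, minus the "debug" block.
current_output = json.loads("""
{
  "items": [
    {
      "type": ["h-feed"],
      "properties": {
        "name": ["title"],
        "content": [
          {
            "html": "this is a test for indieweb post ",
            "value": "this is a test for indieweb post"
          }
        ]
      }
    }
  ],
  "rels": {}
}
""")

# Assumed shape of the expected result: the h-feed item stays,
# but nothing is parsed into its properties.
expected_items = [{"type": ["h-feed"], "properties": {}}]


def unexpected_properties(result):
    """Map item index -> sorted property names for items that have any."""
    return {
        index: sorted(item["properties"])
        for index, item in enumerate(result["items"])
        if item["properties"]
    }


print(unexpected_properties(current_output))  # {0: ['content', 'name']}
print(current_output["items"] == expected_items)  # False
```

An empty dict from `unexpected_properties` would mean the parser output matches the expectation.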
Looking at this more closely, it's an issue I ran into while improving the backcompat parsing (#111). Ideally, the parser needs to distinguish between 1) mf2 properties that were explicitly authored inside mf1 roots and 2) mf1 properties that have been upgraded to mf2.
Currently php-mf2 doesn't do that. After running the backcompat algorithm and finding no hfeed properties to upgrade, it adds the h-feed class to the hfeed root and continues to parse it as mf2. Thus it parses the p-name and e-content even though it shouldn't. It is not aware of whether those elements were upgraded or authored that way.
I punted on it at the time because it seemed complex to solve and I was not aware of any examples of it causing issues.