microformats/php-mf2

Fix <img> handling in implied p-name

gRegorLove opened this issue · 0 comments

The parsing spec was updated to change how <img> is handled in implied p-name parsing: the src attribute should no longer be part of the implied p-name.

  • else use the textContent of the .h-x for name after:
    • dropping any nested <script> & <style> elements;
    • replacing any nested <img> elements with their alt attribute, if present;
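
To make the quoted rule concrete, here is a rough sketch in Python using only the standard library's html.parser. It is not php-mf2's or mf2py's actual code: the names ImpliedNameText and implied_name are made up for illustration, and trimming leading/trailing whitespace from the result is an assumption on my part.

```python
from html.parser import HTMLParser


class ImpliedNameText(HTMLParser):
    """Hypothetical helper: collects text, skipping <script>/<style>
    and substituting <img> with its alt attribute when present."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
        elif tag == "img" and self.skip_depth == 0:
            # Only alt contributes to the name; src is ignored entirely.
            alt = dict(attrs).get("alt")
            if alt is not None:
                self.parts.append(alt)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def implied_name(html_fragment):
    """Return the implied name text for an HTML fragment (sketch only)."""
    parser = ImpliedNameText()
    parser.feed(html_fragment)
    parser.close()
    # Assumption: surrounding whitespace is trimmed from the result.
    return "".join(parser.parts).strip()


if __name__ == "__main__":
    print(implied_name('<p class="h-card">My Name <img src="http://xyz" /></p>'))
```

With the h-card from this issue, the <img> has no alt, so it contributes nothing and the src never reaches the name.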

HTML:

<p class="h-card">My Name <img src="http://xyz" /></p>

Parsed:

"type": [
    "h-card"
],
"properties": {
    "name": [
        "My Name http:\/\/xyz"
    ],
    "photo": [
        "http:\/\/xyz"
    ]
}

Expected:

"type": [
    "h-card"
], 
"properties": {
    "photo": [
        "http://xyz"
    ], 
    "name": [
        "My Name"
    ]
}

Note that 0.4.3 does not have this issue, but 0.4.4-alpha does. The new text parsing algorithm includes the src because it was written against the spec as it stood before this update.

mf2py has a PR supporting this: microformats/mf2py#106