Fix <img> handling in implied p-name
gRegorLove opened this issue · 0 comments
The parsing spec was updated for handling <img> in implied p-name parsing: src should no longer be part of the implied p-name.
- else use the textContent of the .h-x for name after:
  - dropping any nested <script> & <style> elements;
  - replacing any nested <img> elements with their alt attribute, if present;
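
To illustrate the two rules quoted above, here is a minimal sketch using Python's standard-library html.parser. It is not the php-mf2 or mf2py implementation; the class name ImpliedNameText and the function implied_name_text are hypothetical, and trimming the result with strip() is an assumption made for this example.

```python
from html.parser import HTMLParser


class ImpliedNameText(HTMLParser):
    """Hypothetical helper: collects text for an implied name."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1
        elif tag == "img" and self.skip_depth == 0:
            # Use alt only, if present; src is deliberately ignored.
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Drop any text nested inside <script> or <style>.
        if self.skip_depth == 0:
            self.parts.append(data)


def implied_name_text(html):
    """Return the implied-name text for an HTML fragment (assumed trimmed)."""
    parser = ImpliedNameText()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


print(implied_name_text('<p class="h-card">My Name <img src="http://xyz" /></p>'))
```

With the h-card from this issue, the sketch yields the name without the src value; an <img> carrying an alt attribute would contribute that alt text instead.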
HTML:
<p class="h-card">My Name <img src="http://xyz" /></p>
Parsed:
"type": [
"h-card"
],
"properties": {
"name": [
"My Name http:\/\/xyz"
],
"photo": [
"http:\/\/xyz"
]
}
Expected:
"type": [
"h-card"
],
"properties": {
"photo": [
"http://xyz"
],
"name": [
"My Name"
]
}
Note that 0.4.3 does not have this issue but 0.4.4-alpha does. The new text parsing algorithm includes the src
because it was written against the spec before this update.
mf2py has a PR supporting this: microformats/mf2py#106