Tags: text that looks like a hashtag in code blocks and link text should not be added to a post
Currently o2 is zealous in grabbing all tags in #hashtag format and adding them as post tags in the standard "post" post type taxonomy.
I'm pretty sure (could be wrong) that the design intent was to ignore anything inside a code snippet, a link href or link text, or pre-formatted HTML.
Evidence here:
Line 277 in 609e3f5
This doesn't seem to be working, though. :/
Steps to repeat
Start a new post, and use post content such as:
This one is in a link: <a href="https://www.google.com/">#tag-google</a>.
This one is in a code block: <code>#tag-code-what</code>
This one is plain text: #tag-plain
Publish the post, then check which post tags were added (verify in the wp-admin post editor screen).
What I expected
I'd expect only #tag-plain to be added to the post as a WordPress post tag.
What happened instead
All three items that are vaguely hashtag-looking are added as post tags.
The same thing happens in comment text.
What does appear to work is if the tag is contained inside the [code] ... [/code] shortcode. Those tags are not added to the post.
Wow, that's been broken for a long time.
It's caused by the htmlentities() call here, which is obviously incorrect in retrospect. :-)
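To illustrate why escaping before parsing defeats the exclusion, here is a minimal standalone sketch (not o2's actual code; the sample string and variable names are made up for the demo). Once the content has been through htmlentities(), the parser sees the markup as literal text, so there is no `<code>` element left to skip:

```php
<?php
// Hypothetical sample content, modelled on the repro steps in this issue.
$content = 'In a code block: <code>#tag-code-what</code> and plain: #tag-plain';

// Parse the raw markup: the <code> element exists in the DOM,
// so a later pass could skip the text inside it.
$raw = new DOMDocument();
$raw->loadHTML( '<div>' . $content . '</div>' );
$raw_code_count = $raw->getElementsByTagName( 'code' )->length;

// Escape first, then parse: "<code>" becomes "&lt;code&gt;", which the parser
// decodes back into plain text. No <code> element is created.
$escaped = new DOMDocument();
$escaped->loadHTML( '<div>' . htmlentities( $content ) . '</div>' );
$escaped_code_count = $escaped->getElementsByTagName( 'code' )->length;

// The whole string, tags included, now lives in one text node.
$escaped_text = $escaped->getElementsByTagName( 'div' )->item( 0 )->textContent;
```

With the escaped input, a hashtag scan over text nodes would see `#tag-code-what` as ordinary text, which matches the behaviour reported above.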
It was originally added to avoid warnings that DOMDocument would raise on some HTML. I'm inclined to add LIBXML_NOWARNING | LIBXML_NOERROR to the loadHTML() call when WP_DEBUG is disabled. I don't recall the old behaviour causing data loss, just irrelevant messages.
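A rough sketch of what that could look like, under some assumptions: the function name `sketch_extract_hashtags()`, the wrapping `<div>`, the XPath filter, and the hashtag regex are all illustrative and not taken from o2. Only the libxml flags, loadHTML(), and the WP_DEBUG check come from the comment above.

```php
<?php
// Hypothetical helper for illustration; not o2's actual function.
function sketch_extract_hashtags( string $content ): array {
	// Assumption: outside WordPress WP_DEBUG may be undefined; treat that as disabled.
	$debug   = defined( 'WP_DEBUG' ) && WP_DEBUG;
	$options = $debug ? 0 : ( LIBXML_NOWARNING | LIBXML_NOERROR );

	$dom = new DOMDocument();
	// Parse the raw markup (no htmlentities()), so <a>, <code> and <pre> stay elements.
	$dom->loadHTML( '<div>' . $content . '</div>', $options );

	// Select only text nodes that are not inside a link, code, or pre element.
	$xpath = new DOMXPath( $dom );
	$nodes = $xpath->query( '//text()[not(ancestor::a or ancestor::code or ancestor::pre)]' );

	$tags = array();
	foreach ( $nodes as $node ) {
		// Assumed hashtag pattern: "#" not preceded by a word character.
		if ( preg_match_all( '/(?<!\w)#([A-Za-z][\w-]*)/', $node->nodeValue, $matches ) ) {
			$tags = array_merge( $tags, $matches[1] );
		}
	}
	return array_values( array_unique( $tags ) );
}
```

Fed the three lines from the steps to repeat, this sketch should return only `tag-plain`, since the other two hashtags sit inside `<a>` and `<code>` elements.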