pfefferle/wordpress-webmention

<br /> not converted to \n from Bridgy

Closed this issue · 11 comments

edent commented

This comment - https://brid.gy/comment/mastodon/@edent@mastodon.social/110491395940578642/110495628806946433 contains the following:

<a href="https://mastodon.social/@dracos" class="u-url mention">@<span>dracos</span></a></span><br />Lovely!</p>

But it appears in my comments without the <br />, and the <br /> isn't converted to a \n either.

Screenshot of the comment.

We do have a sanitize function, but it includes br: https://github.com/pfefferle/wordpress-webmention/blob/main/includes/functions.php#L382

Your link shows rel="nofollow ugc", so let's go through everything WordPress does to filter comment content, as an exercise to decide which filters we may wish to override.

Both of these are hooked to pre_comment_content, which runs when a comment is added to the database.

  • wp_rel_ugc - Adds rel="nofollow ugc" string to all HTML A elements in content.
  • wp_filter_post_kses - Runs it through the post filter, rather than our custom filter.

Considering we filter, I don't think there's an issue with removing the default WordPress filtering, but it also allows br.

@pfefferle Thoughts here?

I've been seeing this for a while now too, on the pre-merge Webmention (i.e. 4.x) and Semantic Linkbacks plugins, and it's not Bridgy specific. Here's an example from today: https://kandr3s.co/responses/2023-06-07-yiozj => https://snarfed.org/2023-06-02_bridgy-stats-update-8#comment-2864421 .

On a related note, I'd encourage you all to think about this in terms of "how do we preserve the original reply's whitespace and formatting intact?" instead of "which tags do we sanitize or not?" Whitespace handling is (maybe obviously) both surprisingly difficult and surprisingly important; I've spent way more time on it in Bridgy etc. than I ever expected to.

I think my starting point was: should we stop letting WordPress sanitize it after we already have? That's a double cleanup.

I'm guessing that what happens is this: https://github.com/WordPress/wordpress-develop/blob/ba9a2f8b83211fbdee1621fd4f39dcb11908b817/src/wp-includes/kses.php#L2202. Or this: https://github.com/WordPress/wordpress-develop/blob/ba9a2f8b83211fbdee1621fd4f39dcb11908b817/src/wp-includes/comment.php#L3618. :-)

Either way, it looks like, because there is no logged-in user for webmention comments, the filter applied by core is not the post one but the default one. (And br is not in the default allowed tags: https://github.com/WordPress/wordpress-develop/blob/ba9a2f8b83211fbdee1621fd4f39dcb11908b817/src/wp-includes/kses.php#L391, while it is in the allowed post tags.)
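
For anyone who wants to check this locally, here is a minimal sketch of the difference between the two allowlists. It has to run inside WordPress (for example from a small test plugin), because wp_kses_data() and wp_kses_post() are core functions. The $html string is a simplified version of the markup from the report, and the variable names are only for illustration.

```php
<?php
// Sketch only: run inside a loaded WordPress environment.

// Simplified version of the markup from the original report.
$html = '<a href="https://mastodon.social/@dracos" class="u-url mention">@dracos</a><br />Lovely!';

// Default allowlist ($allowedtags): it has no br entry, so the tag should be dropped.
$as_default = wp_kses_data( $html );

// Post allowlist ($allowedposttags): br is listed, so the tag should survive.
$as_post = wp_kses_post( $html );

// Report whether each result still contains a br tag.
// A theme or plugin that modifies the allowlists will change these results.
var_dump( false !== strpos( $as_default, '<br' ) );
var_dump( false !== strpos( $as_post, '<br' ) );
```

On a stock install the first check should report that the br is gone and the second that it is still there, which would match the "Lovely!" text being glued onto the mention.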

edent commented

So, in theory, I can add the following to my theme:

// Add br to the default kses allowlist ($allowedtags).
function allow_tags() {
	global $allowedtags;
	$new_tags    = array(
		'br' => array(), // no attributes allowed on br
	);
	$allowedtags = array_merge( $allowedtags, $new_tags );
}
add_action( 'init', 'allow_tags', 11 );

And that should allow WebMentions to have <br> elements?

I already use this function to allow pre, code, etc.

I'd just suggest we remove kses_post entirely when we add a Webmention, because we do our own filtering. I think I'll write that PR, because core is just doing extra work.

@edent Did we address this with #408?
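
A rough sketch of what that could look like, not the actual PR: the helper name example_insert_presanitized_comment() is hypothetical, and it assumes the caller has already sanitized comment_content with the plugin's own filter. It detaches core's kses callbacks from pre_comment_content only for the duration of one insert and then puts them back.

```php
<?php
/**
 * Hypothetical helper, not taken from the Webmention plugin.
 * Assumes $commentdata['comment_content'] was already sanitized by the caller.
 */
function example_insert_presanitized_comment( array $commentdata ) {
	// Core attaches one of these two callbacks to pre_comment_content.
	$core_filters = array( 'wp_filter_kses', 'wp_filter_post_kses' );
	$removed      = array();

	foreach ( $core_filters as $callback ) {
		// has_filter() returns the priority if the callback is attached, false otherwise.
		$priority = has_filter( 'pre_comment_content', $callback );
		if ( false !== $priority ) {
			remove_filter( 'pre_comment_content', $callback, $priority );
			$removed[ $callback ] = $priority;
		}
	}

	try {
		// Second argument asks for a WP_Error instead of wp_die() on failure.
		return wp_new_comment( $commentdata, true );
	} finally {
		// Restore core filtering for every other comment in this request.
		foreach ( $removed as $callback => $priority ) {
			add_filter( 'pre_comment_content', $callback, $priority );
		}
	}
}
```

Restoring the filters in the finally block keeps ordinary visitor comments going through core's sanitization, so the exemption applies only to content the plugin has already cleaned.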

edent commented

@dshanske looks like it. But I've altered my theme to explicitly allow <br>

Then I'll close until someone else comments.

Confirmed, looks like this is fixed in 5.1.4, probably actually sometime before that. I tried with the HTML in the original description, and it's now correctly rendered with a newline. Thanks guys!