aarongustafson/jekyll-webmention_io

Tweet that contains what looks to Markdown like an ordered list generates flawed HTML

Closed this issue · 3 comments

dltj commented

This is an interesting edge case. There is a tweet that references my blog post that contains a character sequence that corresponds to Markdown markup for emphasis (e.g. the *any* part of the tweet):

5. My point being not that major releases are often handled badly, and are disastrous for that reason; but the much more fundamental point that *any* major release is a confession of failure. Failure to keep the promise of compatibility, which the poor user depends on.

— Mike Tⓐylor 🏴󠁧󠁢󠁥󠁮󠁧󠁿 🇬🇧 🇪🇺 (@MikeTaylor) June 25, 2021

What is stored in my received.yml file is this:

  '1408449494241493000':
    id: '1408449494241493000'
    uri: https://twitter.com/MikeTaylor/status/1408449494241493000
    source: twitter
    pubdate: 2021-06-25 11:38:21.000000000 -04:00
    raw:
      source: https://brid.gy/comment/twitter/DataG/1407879695811723267/1408449494241493000
      verified: true
      verified_date: '2021-06-25T16:41:38+00:00'
      id: 1201321
      private: false
      data:
        author: &16
          name: "Mike Tⓐylor \U0001F3F4\U000E0067\U000E0062\U000E0065\U000E006E\U000E0067\U000E007F
            \U0001F1EC\U0001F1E7 \U0001F1EA\U0001F1FA"
          url: https://twitter.com/MikeTaylor
          photo: https://webmention.io/avatar/pbs.twimg.com/f925d364e8cedcdfc50f4c883ea366fea1f25d22e4660cd7bd66f686c488514c.jpg
        url: https://twitter.com/MikeTaylor/status/1408449494241493000
        name:
        content: |-
          5. My point being not that major releases are often handled badly, and are disastrous for that reason; but the much more fundamental point that *any* major release is a confession of failure. Failure to keep the promise of compatibility, which the poor user depends on.
          <a class="u-mention" href="https://dltj.org/"></a>
          <a class="u-mention" href="https://twitter.com/DataG"></a>
          <a class="u-mention" href="https://twitter.com/bryjbrown"></a>
          <a class="u-mention" href="https://twitter.com/ruthbrarian"></a>
          <a class="u-mention" href="https://www.mypronouns.org/he-him"></a>
        published: '2021-06-25T15:38:21+00:00'
        published_ts: 1624635501
      activity:
        type: reply
        sentence: "Mike Tⓐylor \U0001F3F4\U000E0067\U000E0062\U000E0065\U000E006E\U000E0067\U000E007F
          \U0001F1EC\U0001F1E7 \U0001F1EA\U0001F1FA replied '5. My point being not
          that major releases are often handled badly, and are disas...' to a tweet
          https://dltj.org/article/digital-preservation-software/"
        sentence_html: "<a href=\"https://twitter.com/MikeTaylor\">Mike Tⓐylor \U0001F3F4\U000E0067\U000E0062\U000E0065\U000E006E\U000E0067\U000E007F
          \U0001F1EC\U0001F1E7 \U0001F1EA\U0001F1FA</a> replied '5. My point being
          not that major releases are often handled badly, and are disas...' to a
          tweet <a href=\"https://dltj.org/article/digital-preservation-software/\">https://dltj.org/article/digital-preservation-software/</a>"
      target: https://dltj.org/article/digital-preservation-software/
    author: *16
    type: reply
    content: |-
      <p>
        <li>My point being not that major releases are often handled badly, and are disastrous for that reason; but the much more fundamental point that <em>any</em> major release is a confession of failure. Failure to keep the promise of compatibility, which the poor user depends on.
      <a class="u-mention" href="https://dltj.org/"></p>
      <a class="u-mention" href="https://twitter.com/DataG"></a>
      <a class="u-mention" href="https://twitter.com/bryjbrown"></a>
      <a class="u-mention" href="https://twitter.com/ruthbrarian"></a>
      <a class="u-mention" href="https://www.mypronouns.org/he-him"></a></li>
      </ol>

Note the third line of the content value, which ends with a closing </p> tag where the closing </a> tag should be:

<a class="u-mention" href="https://dltj.org/"></p>

This, of course, is really messing with how the browser is rendering the HTML markup.

Two points that I would mark as "unfortunate":

  1. I can't point to a live version. I'm just getting started with Webmentions and jekyll-webmention_io, and I haven't deployed it to my live site yet.
  2. I don't know Ruby as a programming language, so I don't know where to begin diagnosing the problem.

dltj commented

So I've been digging further, and I found that lines 58-60 of lib/jekyll/webmention_io/webmention_item.rb are the source of the problem. kramdown (the Markdown converter I have set in my Jekyll configuration) turns the text of that tweet into an <li> because the string starts with "5." followed by a space, which Markdown reads as an ordered-list marker. Lines 58-60 then munge part of the resulting list markup into a paragraph tag. The git blame for those three lines doesn't offer clues as to what they are meant to do. Do you remember why you put that code in there?
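One possible workaround, sketched here only as an illustration: escape the period of a leading "N." before the text reaches the Markdown converter, so it is read as literal text rather than a list marker. The helper name `escape_leading_list_marker` is hypothetical and is not part of jekyll-webmention_io; the sketch also assumes the content can be pre-processed before conversion, which I have not confirmed in the plugin code.

```ruby
# Hypothetical helper, NOT part of jekyll-webmention_io.
# Backslash-escapes the period in a line-leading "N. " so that a Markdown
# converter treats it as literal text instead of an ordered-list marker.
def escape_leading_list_marker(text)
  # ^ anchors at the start of each line; up to three leading spaces are
  # allowed, matching the usual Markdown indentation limit for list items.
  text.gsub(/^( {0,3}\d+)\.(?=\s)/) { "#{Regexp.last_match(1)}\\." }
end

tweet = "5. My point being not that major releases are often handled badly."
puts escape_leading_list_marker(tweet)
```

Text that does not begin a line with digits and a period is returned unchanged, so ordinary replies would not be affected.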

aarongustafson commented

I kept having Ruby/Jekyll issues when upgrading my Mac, so I have moved off of Jekyll and will no longer be working on this project. I am going to flag this as "won't fix" but leave it open in case someone else wants to pick up the project from here.

dltj commented

Hello Aaron,

Ruby issues on macOS? That sounds familiar. Thanks for your efforts in getting the Webmentions plugin this far; it is a good piece of work. (Embedding and then rendering the templates in JavaScript for the real-time updates is inspired.) I wish I knew Ruby. If I did, I'd likely volunteer to pick this up. Best of luck with your future endeavors.