kurtmckee/feedparser

FR: For items with no link but a single enclosure with an href, use that href for entries[i].link

melyux opened this issue · 1 comments

For example, Buzzsprout, a popular podcast hosting site, uses RSS for its feeds. Each item has no <link>, only an <enclosure>.

Here's an example feed:

...
  <item>
    <itunes:title>With Nick van der Kolk from Love + Radio #17</itunes:title>
    <title>With Nick van der Kolk from Love + Radio #17</title>
    <description>...</description>
    <content:encoded>...</content:encoded>
    <itunes:author>Andy Clark en Richard den Haring</itunes:author>
    <enclosure url="https://www.buzzsprout.com/99850/2968933-with-nick-van-der-kolk-from-love-radio-17.mp3" length="20004212" type="audio/mpeg" />
    <guid isPermaLink="false">Buzzsprout-2968933</guid>
    <pubDate>Tue, 10 Mar 2020 02:00:00 -0400</pubDate>
    <itunes:duration>1665</itunes:duration>
    <itunes:keywords></itunes:keywords>
    <itunes:episodeType>bonus</itunes:episodeType>
    <itunes:explicit>false</itunes:explicit>
  </item>
...

Since the item has no <link> but exactly one enclosure with a URL, that URL could be used as the definitive entries[i].link for the entry. This would parallel how <link> elements and enclosures are already combined to produce entries[i].links.

If there is more than one enclosure with a URL, feedparser can fall back to its current behavior, since there is no way of knowing which URL is the primary one. With a single URL, though, the choice is unambiguous.
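
To make the proposed rule concrete, here is a minimal sketch written as a standalone helper rather than as a change to feedparser itself. The name `pick_entry_link` is hypothetical, and the sketch assumes an entry shaped like feedparser's output: a dict-like object whose `links` list holds dicts with `rel` and `href` keys, with enclosures carrying `rel` set to `enclosure`.

```python
def pick_entry_link(entry):
    """Hypothetical helper: return the entry's link, or fall back to the
    href of its only enclosure. Returns None when the choice is ambiguous."""
    # Prefer an explicit link if the parser already found one.
    link = entry.get("link")
    if link:
        return link

    # Collect the hrefs of all enclosure-type links (assumed shape, see lead-in).
    enclosure_hrefs = [
        item["href"]
        for item in entry.get("links", [])
        if item.get("rel") == "enclosure" and item.get("href")
    ]

    # Fall back only when exactly one enclosure URL exists.
    if len(enclosure_hrefs) == 1:
        return enclosure_hrefs[0]
    return None


# An entry resembling the Buzzsprout item above: no link, one enclosure.
entry = {
    "title": "With Nick van der Kolk from Love + Radio #17",
    "links": [
        {
            "rel": "enclosure",
            "href": "https://www.buzzsprout.com/99850/2968933-with-nick-van-der-kolk-from-love-radio-17.mp3",
            "type": "audio/mpeg",
            "length": "20004212",
        }
    ],
}
fallback_link = pick_entry_link(entry)
```

With two or more enclosures the helper returns None, which matches the "fall back to the current behavior" case described above.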

P.S. I noticed that a <link> becomes an entries[i].links element with rel set to alternate even when the source link has no rel attribute. I'm not very familiar with the Atom spec; is this the intended behavior, or should it default to another rel type such as self?