kurtmckee/feedparser

`itunes:summary` overwrites `description` field in feed items when parsing

neilius opened this issue · 0 comments

When a feed item has both an <itunes:summary> tag and a <description> tag, the <itunes:summary> value takes precedence: it overwrites the <description> value and is what ends up under the summary key of the entry's dict.

Example:

<?xml version="1.0" encoding="UTF-8"?>
<rss
  version="2.0"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
>
  <channel>
    <item>
      <title>A title</title>
      <description><![CDATA[<p>The description field</p>]]></description>
      <link>https://example.com</link>
      <content:encoded><![CDATA[<p>The content</p>]]></content:encoded>
      <itunes:summary>Itunes summary</itunes:summary>
    </item>
  </channel>
</rss>

When the above is parsed with feedparser.parse(), the summary of the entry is set to the value of <itunes:summary>:

>>> import feedparser
>>> parsed_feed = feedparser.parse("the-above-feed.xml")
>>> parsed_feed.entries[0].summary == 'Itunes summary'
True

My expectation is that the <itunes:summary> value would be available at the itunes_summary key, much like the other values in the iTunes namespace, and that the <description> value would be available at summary, as outlined in the documentation. Instead, the iTunes summary takes precedence and is assigned to the summary key, as shown above. Even when <itunes:summary> is an empty tag, summary is an empty string rather than the value of the <description> tag.
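
To make the expected mapping concrete, here is a minimal sketch that uses only the standard library. It is not feedparser's implementation and does not call feedparser at all; `FEED`, `ITUNES_NS`, and `expected_entry` are hypothetical names introduced for illustration. It reads the two tags independently, so neither value can overwrite the other.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the example feed above.
ITUNES_NS = "http://www.itunes.com/dtds/podcast-1.0.dtd"

# The example feed from this issue, inlined so the sketch is self-contained.
FEED = """<?xml version="1.0" encoding="UTF-8"?>
<rss
  version="2.0"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
>
  <channel>
    <item>
      <title>A title</title>
      <description><![CDATA[<p>The description field</p>]]></description>
      <link>https://example.com</link>
      <content:encoded><![CDATA[<p>The content</p>]]></content:encoded>
      <itunes:summary>Itunes summary</itunes:summary>
    </item>
  </channel>
</rss>
"""


def expected_entry(item):
    """Hypothetical helper: map one <item> element to the keys this issue expects.

    <description> goes to "summary" and <itunes:summary> goes to
    "itunes_summary"; the two are looked up separately.
    """
    return {
        "summary": item.findtext("description"),
        "itunes_summary": item.findtext(f"{{{ITUNES_NS}}}summary"),
    }


# Parse from bytes so the XML declaration's encoding is honoured.
root = ET.fromstring(FEED.encode("utf-8"))
entry = expected_entry(root.find("channel/item"))
print(entry["summary"])
print(entry["itunes_summary"])
```

With this mapping, an empty <itunes:summary> tag would yield an empty itunes_summary while leaving summary set to the <description> value.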

This seems to be very similar to both #314 and #316. Is this expected behavior or is this a bug?