kurtmckee/feedparser

Incorrect parsing of subelements

Animus-Surge opened this issue · 1 comment

Hello,

I have noticed that some keys are empty in the dict returned by parse. Certain fields in the RSS feed I'm using are deliberately blank, but the ones I'm concerned with are fields that have subelements. Each subelement gets placed at the root of the entry instead of inside an object that is the value of its parent element. In addition, when different elements contain a subelement with the same name, its value is overwritten by each subsequent instance of that subelement.

To give an example, let's say I have an entry that looks something like this:

<element1>
  <subelement>Hi</subelement>
</element1>
<!--...-->
<element2>
  <subelement>Hello there</subelement>
</element2>

I would expect parse to return a dict that looks something like this:

{
  //...
  "element1":{
    "subelement":"Hi"
  },
  "element2":{
    "subelement":"Hello there"
  }
}

However, it seems like I get something like this instead:

{
  //...
  "subelement":"Hello there", //"Hi" gets overwritten with "Hello there"
  "element1":"",
  "element2":""
}
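
For comparison, here is a minimal sketch of the nesting I would expect, written with the standard library's xml.etree.ElementTree rather than feedparser. It assumes the raw XML of the entry is available as a string; the wrapping <entry> element, the SAMPLE string, and the element_to_dict helper are hypothetical names made up for this illustration, not part of feedparser.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample entry mirroring the example above.
SAMPLE = """<entry>
  <element1>
    <subelement>Hi</subelement>
  </element1>
  <element2>
    <subelement>Hello there</subelement>
  </element2>
</entry>"""


def element_to_dict(element):
    """Recursively convert an element into nested dicts (hypothetical helper).

    A leaf element becomes its stripped text. Repeated child tags under
    the same parent are not handled here: the last one would win.
    """
    children = list(element)
    if not children:
        return (element.text or "").strip()
    return {child.tag: element_to_dict(child) for child in children}


result = element_to_dict(ET.fromstring(SAMPLE))
print(result)
```

This keeps each subelement under its own parent, so the two values no longer collide. It ignores attributes and namespaces, so it may only be usable as a stopgap for simple feeds.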

Minor edit: I'm using feedparser version 6.0.10 and Python 3.10.0.

I am having the same problem. Is there any solution?