ruby/rss

Parsing enc:enclosure


I was attempting to parse an RSS document whose items contain enclosure elements such as:

<enc:enclosure resource="http://image_url" type="image/jpeg"/>
<item rdf:about="https://dallas.craigslist.org/dal/cto/d/mckinney-2003-acura-cl-type/7109117586.html">
<title><![CDATA[2003 ACURA CL TYPE-S (MCKINNEY) &#x0024;850]]></title>
<link>https://dallas.craigslist.org/dal/cto/d/mckinney-2003-acura-cl-type/7109117586.html</link>
<description><![CDATA[SELLING MY BELOVED HONDA FOR PARTS OR PERSONAL PROJECT. ENGINE IN GREAT SHAPE HAS 165,000 ML MAINTATINED REALLY WELL, CLEAN TITLE. BRAND NEW FRONT SUSPENSION, GOOD BREMBOO BRAKES, TIRES IN GOOD SHAPE, EVERYTHING WORKS INSIDE CAR, FRONT SEAT BIT TORN. ...]]></description>
<dc:date>2020-04-16T10:51:41-05:00</dc:date>
<dc:language>en-us</dc:language>
<dc:rights>copyright 2020 craigslist</dc:rights>
<dc:source>https://dallas.craigslist.org/dal/cto/d/mckinney-2003-acura-cl-type/7109117586.html</dc:source>
<dc:title><![CDATA[2003 ACURA CL TYPE-S (MCKINNEY) &#x0024;850]]></dc:title>
<dc:type>text</dc:type>
<enc:enclosure resource="https://images.craigslist.org/00h0h_fyqO7icY0vm_300x300.jpg" type="image/jpeg"/>
<dcterms:issued>2020-04-16T10:51:41-05:00</dcterms:issued>
</item>

Even when passing ignore_unknown_element to the parse method, I could not find these elements in the parsed results. Does this library not support these enclosures? If not, is there a plan or willingness to add support? Enclosures seem to be a standard RSS feature: https://en.wikipedia.org/wiki/RSS_enclosure

kou commented

Could you provide a full RSS document?

Sure! Just visit this link for example: https://newyork.craigslist.org/search/cta?format=rss
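
For anyone who needs the enclosure URLs in the meantime, one possible workaround is to read the enc:enclosure elements directly with REXML instead of going through the RSS parser. This is only a sketch: the `enc` namespace URI, the sample feed, and the `extract_enclosures` helper are assumptions made for illustration, not part of this library, so check the xmlns declarations in the real feed before relying on it.

```ruby
require "rexml/document"

# Namespace URIs assumed for illustration; verify them against the
# xmlns declarations at the top of the actual feed.
NAMESPACES = {
  "rdf" => "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
  "enc" => "http://purl.oclc.org/net/rss_2.0/enc#",
}

# Hypothetical minimal feed shaped like the item quoted above.
SAMPLE_FEED = <<~XML
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns="http://purl.org/rss/1.0/"
           xmlns:enc="http://purl.oclc.org/net/rss_2.0/enc#">
    <item rdf:about="https://example.com/post/1">
      <title>Example item</title>
      <link>https://example.com/post/1</link>
      <enc:enclosure resource="https://example.com/image.jpg" type="image/jpeg"/>
    </item>
  </rdf:RDF>
XML

# Hypothetical helper: returns one hash per enc:enclosure element,
# with the rdf:about value of the enclosing item.
def extract_enclosures(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//enc:enclosure", NAMESPACES).map do |enclosure|
    {
      item: enclosure.parent.attributes["rdf:about"],
      resource: enclosure.attributes["resource"],
      type: enclosure.attributes["type"],
    }
  end
end

extract_enclosures(SAMPLE_FEED).each do |enclosure|
  puts "#{enclosure[:item]} -> #{enclosure[:resource]} (#{enclosure[:type]})"
end
```

Matching on the namespace URI rather than the literal `enc:` prefix keeps the lookup working even if a feed binds the same namespace to a different prefix.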