Elfeed unable to get creator information from ACM Journal feeds
swflint opened this issue · 2 comments
swflint commented
Consider the feed https://dl.acm.org/action/showFeed?type=etoc&feed=rss&jc=pacmpl
This feed provides creator/author information, but Elfeed does not seem to parse it.
skeeto commented
I wondered why that might be, but as soon as I saw the first tag I knew
the reason: This is an RSS 1.0 or "RDF" feed. This is the worst version of
RSS, and that's saying something. Such feeds tend to have the most broken
and incomplete semantic structure. After all, if they knew better they
wouldn't be using RSS 1.0. Because of this trend, I decided not to bother
extracting extra information from such feeds.
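For anyone who wants to check a feed the same way, here is a minimal Python sketch of telling the formats apart from the root element. It is only an illustration, not how Elfeed does it; the function name `feed_kind` and the returned labels are made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the RDF and Atom specifications.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ATOM_NS = "http://www.w3.org/2005/Atom"


def feed_kind(xml_text):
    """Classify a feed by its root element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    if root.tag == "{%s}RDF" % RDF_NS:
        return "rss-1.0"   # RSS 1.0 feeds are wrapped in <rdf:RDF>
    if root.tag == "rss":
        return "rss"       # RSS 0.9x / 2.0 feeds start with <rss>
    if root.tag == "{%s}feed" % ATOM_NS:
        return "atom"      # Atom feeds start with <feed> in the Atom namespace
    return "unknown"
```

Calling `feed_kind` on the text of a feed whose first tag is `<rdf:RDF ...>` would give `"rss-1.0"` under these assumptions.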
Case in point: This feed from a professional computing society has worse
structure than a typical Harry Potter fan fiction blog feed. Your interest
is the Dublin Core "creator" element, which may be used multiple times to
express a list of creators, but ACM decided to ambiguously concatenate all
the authors' names into one element. Had Elfeed extracted this data, it
would be gobbledygook since it lacks structure. The empty, self-closing
copyright element is a nice touch, too.
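To make the difference concrete, here is a rough Python sketch that reads Dublin Core `creator` elements from an RSS 1.0 document. The `SAMPLE` feed is invented for illustration (it is not copied from the ACM feed, and the comma separator in the second item is an assumption), and `creators_by_item` is a hypothetical helper, not anything Elfeed provides.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RDF, RSS 1.0, and Dublin Core elements.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample: the first item repeats dc:creator once per author,
# the second squeezes every name into a single element.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <item rdf:about="https://example.org/a">
    <title>Structured</title>
    <link>https://example.org/a</link>
    <dc:creator>Ada Lovelace</dc:creator>
    <dc:creator>Alan Turing</dc:creator>
  </item>
  <item rdf:about="https://example.org/b">
    <title>Concatenated</title>
    <link>https://example.org/b</link>
    <dc:creator>Ada Lovelace, Alan Turing</dc:creator>
  </item>
</rdf:RDF>"""


def creators_by_item(xml_text):
    """Map each item title to the list of its dc:creator values."""
    root = ET.fromstring(xml_text)
    result = {}
    # In RSS 1.0, <item> elements are direct children of <rdf:RDF>.
    for item in root.findall("rss:item", NS):
        title = item.findtext("rss:title", default="", namespaces=NS)
        result[title] = [
            c.text.strip() for c in item.findall("dc:creator", NS) if c.text
        ]
    return result
```

With the sample above, the "Structured" item yields a two-element list of names, while the "Concatenated" item yields a single string that a reader cannot reliably split back into individual authors.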
I can't complain *too* much. The titles, dates, and links are in perfect
order, and these are the important elements.
swflint commented
Cool. I'll send them an email. I've been thinking about doing that anyway, to see if they'd add abstracts too.