kurtmckee/feedparser

NASDAQ RSS feed not parsed

dionmes opened this issue · 2 comments

The following RSS feed cannot be parsed.

https://www.nasdaq.com/feed/rssoutbound?category=Options

The call returns no result and never times out.

I checked the feed with the W3C validator: https://validator.w3.org/feed/check.cgi?url=https%3A%2F%2Fwww.nasdaq.com%2Ffeed%2Frssoutbound%3Fcategory%3DOptions

It is a valid RSS feed, although the validator reports some issues.

This isn't a feedparser bug. The nasdaq.com server is using the User-Agent header to tie up resources. It's not a good practice, but they're doing it anyway.

I recommend using the requests package to customize the user agent, then passing the response body to feedparser.

import feedparser
import requests

response = requests.get(
    "https://www.nasdaq.com/feed/rssoutbound?category=Options",
    headers={"User-Agent": "whatever works"},
)
result = feedparser.parse(response.text)

If you want, you can change "whatever works" to other words and identify some keywords to avoid. For example, the request never completes if I include the word "github".
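To find which User-Agent values get stalled, something like the following rough sketch may help. The helper name `probe_user_agents`, the candidate strings, and the 10-second timeout are illustrative assumptions, not part of feedparser or requests. The `timeout` argument to `requests.get` is what keeps a stalled request from hanging indefinitely.

```python
import requests

FEED_URL = "https://www.nasdaq.com/feed/rssoutbound?category=Options"


def probe_user_agents(candidates, url=FEED_URL, timeout=10, get=requests.get):
    """Hypothetical helper: map each User-Agent string to True if the
    server answered with a successful status, or False otherwise."""
    results = {}
    for user_agent in candidates:
        try:
            # The timeout (an assumed value) makes a stalled request raise
            # requests.exceptions.Timeout instead of blocking forever.
            response = get(url, headers={"User-Agent": user_agent}, timeout=timeout)
            results[user_agent] = response.ok
        except requests.exceptions.RequestException:
            # Timeouts and connection errors are both treated as "blocked".
            results[user_agent] = False
    return results
```

Calling `probe_user_agents(["whatever works", "my-github-bot"])` returns a dict showing which strings got a response. The `get` parameter exists only so the function can be exercised without touching the network.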

Awesome, thanks!