manolomartinez/greg

Not following 302 redirects

battis opened this issue · 1 comment

It appears that greg isn't following 302 redirects in feeds. The particular case in point that I'm seeing is The Bugle. For the purposes of illustration, let's take the most reasonable, but also the most redirected version of the situation.

Subscribe to http://feeds.feedburner.com/thebuglefeed. All of the media attachments are actually hosted by Acast. For example, episode 4162 has a media enclosure link of https://feeds.acast.com/public/streams/5e7b777ba085cbe7192b0607/episodes/5f341b083120c8645db8bc9b.mp3, which generates an error from greg:

Downloading 4162 - Bond, Boris and Boats -- 5f341b083120c8645db8bc9b.mp3
... something went wrong. Are you connected to the internet?

I did some digging with curl, and I observe that Acast seems to be doing load-balancing of some sort, since this request:

curl -v https://feeds.acast.com/public/streams/5e7b777ba085cbe7192b0607/episodes/5f341b083120c8645db8bc9b.mp3

generates the following response:

> GET /public/streams/5e7b777ba085cbe7192b0607/episodes/5f341b083120c8645db8bc9b.mp3 HTTP/2
> Host: feeds.acast.com
> User-Agent: curl/7.54.0
> Accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 302
< content-length: 0
< location: https://assets-do.pippa.io/shows/5e7b777ba085cbe7192b0607/1597249367402-89f22c0f843255c028c73b988db40a2b.mp3
< access-control-allow-methods: GET,OPTIONS
< access-control-allow-origin: *
< cache-control: no-cache
< date: Sat, 15 Aug 2020 14:31:08 GMT
< server: nginx/1.14.0 (Ubuntu)
< x-frame-options: sameorigin
< x-request-id: Z+Blgq8rS/J6XdG3
< x-cache: Miss from cloudfront
< via: 1.1 980d2a1c9c4f90ad69118c6357f92882.cloudfront.net (CloudFront)
< x-amz-cf-pop: BOS50-C1
< x-amz-cf-id: CPTpFkQg9zvXkfwFfDm9QwBdC0QGz7I-L676xZnEcUilbsOnM-TsUA==
<
* Connection #0 to host feeds.acast.com left intact

When I follow the 302 redirect location, I do get the actual file.
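
For anyone who wants to see what "following the 302" looks like from the client side, here is a minimal Python sketch. It is not greg's code, and it makes no claim about how greg downloads files. It starts a throwaway local server that mimics the response above (the paths /episode.mp3 and /asset.mp3, the handler class, and the helper names are all made up for this example) and shows that the standard library's urllib.request.urlopen follows the Location header by default.

```python
import http.server
import threading
import urllib.request


class RedirectingHandler(http.server.BaseHTTPRequestHandler):
    """Toy server: /episode.mp3 answers 302, /asset.mp3 serves the bytes."""

    def do_GET(self):
        if self.path == "/episode.mp3":
            # Same shape as the Acast response: empty body plus a Location header.
            self.send_response(302)
            self.send_header("Location", "/asset.mp3")
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path == "/asset.mp3":
            body = b"fake audio bytes"
            self.send_response(200)
            self.send_header("Content-Type", "audio/mpeg")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_following_redirects(url):
    """Return (final_url, body); urlopen follows 302 responses by default."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.geturl(), response.read()


def demo():
    """Run the toy server on a free local port and fetch the redirecting URL."""
    server = http.server.HTTPServer(("127.0.0.1", 0), RedirectingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base = "http://127.0.0.1:%d" % server.server_port
        return base, fetch_following_redirects(base + "/episode.mp3")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    base, (final_url, body) = demo()
    print("requested:", base + "/episode.mp3")
    print("ended up at:", final_url)
    print("bytes received:", len(body))
```

The value returned by geturl() is the post-redirect URL, which makes it a quick way to check where an enclosure link really ends up.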

Huh. So... while this is all technically true, a little more digging reveals that this is a great use case for the download handler option in greg.conf:

[The Bugle]
downloadhandler = wget {link} -P {directory}
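
This works because wget follows HTTP redirects by default. The issue doesn't show how greg turns that template into a command, so the following is only a hypothetical Python sketch of the general idea: fill the {link} and {directory} placeholders, then split the result into an argument list. The function name build_handler_command and the directory path are invented for illustration and are not taken from greg's source.

```python
import shlex


def build_handler_command(template, **placeholders):
    """Fill {name} placeholders, then split the result into an argv list.

    Hypothetical helper for illustration only; not greg's implementation.
    """
    # Quote each value so spaces or shell metacharacters stay in one argument.
    quoted = {name: shlex.quote(value) for name, value in placeholders.items()}
    return shlex.split(template.format(**quoted))


command = build_handler_command(
    "wget {link} -P {directory}",
    link="https://feeds.acast.com/public/streams/5e7b777ba085cbe7192b0607/episodes/5f341b083120c8645db8bc9b.mp3",
    directory="/tmp/podcasts/The Bugle",  # made-up download directory
)
print(command)
```

The resulting list could be passed to subprocess.run without a shell, which avoids problems with feed titles that contain spaces.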