Bad redirection (301) problem
yPhil-gh opened this issue · 1 comments
Hi, I'm having a problem with a feed URL that is not redirecting properly.
URL of the feed: https://www.h24info.ma/feed
FeedParser version: 2.2.10
Node version: 14.15.3
NPM version: 7.6.0
There is clearly a problem with the URL itself, as it makes wget loop forever:
# wget https://www.h24info.ma/feed
--2021-03-20 12:19:18-- https://www.h24info.ma/feed
Resolving www.h24info.ma (www.h24info.ma)... 104.21.39.120, 172.67.145.67, 2606:4700:3031::ac43:9143, ...
Connecting to www.h24info.ma (www.h24info.ma)|104.21.39.120|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.h24info.ma/feed/ [following]
--2021-03-20 12:19:18-- https://www.h24info.ma/feed/
Reusing existing connection to www.h24info.ma:443.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/rss+xml]
Saving to: ‘feed.2’
feed.2 [ <=> ] 12.09K --.-KB/s in 0.02s
2021-03-20 12:19:24 (518 KB/s) - Read error at byte 12383 (Success).Retrying.
--2021-03-20 12:19:25-- (try: 2) https://www.h24info.ma/feed/
Connecting to www.h24info.ma (www.h24info.ma)|104.21.39.120|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/rss+xml]
Saving to: ‘feed.2’
feed.2 [ <=> ] 12.09K --.-KB/s in 0.02s
2021-03-20 12:19:31 (501 KB/s) - Read error at byte 12383 (Success).Retrying.
--2021-03-20 12:19:33-- (try: 3) https://www.h24info.ma/feed/
Connecting to www.h24info.ma (www.h24info.ma)|104.21.39.120|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/rss+xml]
Saving to: ‘feed.2’
(...) repeated infinitely
Here is the (entire) curl response:
# curl -I https://www.h24info.ma/feed
HTTP/2 301
date: Sat, 20 Mar 2021 11:20:08 GMT
content-type: application/rss+xml; charset=UTF-8
set-cookie: __cfduid=ddff178f84ae3564a19a506e6770817e81616239208; expires=Mon, 19-Apr-21 11:20:08 GMT; path=/; domain=.h24info.ma; HttpOnly; SameSite=Lax
vary: Accept-Encoding,Cookie,User-Agent
x-redirect-by: WordPress
last-modified: Sat, 20 Mar 2021 10:48:51 GMT
location: https://www.h24info.ma/feed/
cache-control: max-age=2592000
expires: Mon, 19 Apr 2021 11:20:08 GMT
cf-cache-status: DYNAMIC
cf-request-id: 08f0f63ea90000ff644a9e1000000001
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report?s=ff1ZDtKFUu4zqFEwajoLi4qyf3zhwxqMolxSZGzB1PKoGRPveffiANXBvbAaRi2gaO4aucB0RYP31Yen78VCCe4FeAmvVuuRiPwbwLxHNQ%3D%3D"}],"max_age":604800,"group":"cf-nel"}
nel: {"max_age":604800,"report_to":"cf-nel"}
server: cloudflare
cf-ray: 632e8caaafa5ff64-MAD
alt-svc: h3-27=":443"; ma=86400, h3-28=":443"; ma=86400, h3-29=":443"; ma=86400
My code:
const fetch = require('node-fetch');
const FeedParser = require('feedparser');

// getParams(), maybeTranslate() and done() are assumed to be defined elsewhere (not shown).
function getFeed (feedUrl, callback) {
  // Get a response stream; node-fetch reads request headers from the `headers` option
  fetch(feedUrl, {
    headers: {
      'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36',
      'accept': 'text/html,application/xhtml+xml'
    }
  }).then(function (res) {
    console.error('node-fetch status: %s', res.status);
    // Setup feedparser stream
    var feedparser = new FeedParser();
    var feedItems = [];
    feedparser.on('error', function() {
      console.error('## RRfeedUrl: %s (%s)', feedUrl, res.status);
      return callback('Unknown error');
    });
    feedparser.on('end', done);
    feedparser.on('readable', function() {
      try {
        var item;
        // Drain every item that is currently buffered, not just the first one
        while ((item = this.read()) !== null) feedItems.push(item);
      }
      catch (err) {
        console.error('## ERR (%s)', err.message);
      }
    }).on('end', function () {
      var meta = this.meta;
      return callback(null, feedItems, meta.title, meta.link);
    });
    if (res.status != 200) {
      return callback(res.status);
    }
    var charset = getParams(res.headers.get('content-type') || '').charset;
    var responseStream = res.body;
    responseStream = maybeTranslate(responseStream, charset);
    responseStream.pipe(feedparser);
  }).catch((err) => {
    console.error('## ERR (%s)', err.message);
    return callback(err);
  });
}
Closer inspection reveals a pending promise. How can I resolve it and move on?
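One way to "move on" when nothing ever calls back is to put a deadline on the callback itself. The sketch below is only an illustration: guardCallback is a hypothetical helper, not part of node-fetch or FeedParser, and the 15000 ms value in the usage comment is an arbitrary assumption. It does not abort the underlying request; it only guarantees that the caller hears back exactly once.

```javascript
// Hypothetical helper (not part of node-fetch or FeedParser): wraps a
// Node-style callback so that it runs at most once, and runs with a timeout
// error if nothing has called it within `ms` milliseconds.
function guardCallback(callback, ms) {
  let settled = false;
  const timer = setTimeout(function () {
    if (settled) return;
    settled = true;
    callback(new Error('No result after ' + ms + ' ms'));
  }, ms);
  return function guarded(...args) {
    if (settled) return; // ignore late or duplicate calls
    settled = true;
    clearTimeout(timer);
    callback(...args);
  };
}

// Usage sketch inside getFeed (the timeout value is an assumption):
// function getFeed (feedUrl, callback) {
//   callback = guardCallback(callback, 15000);
//   ...
// }
```

Because the wrapper also swallows duplicate calls, it would keep an 'error' handler and an 'end' handler from both reporting back for the same feed.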
I tried using redirect: 'manual' in the fetch call, but it makes a lot of otherwise fine feed URLs fail. I'm really not sure whether I have to track and handle those redirect problems at the fetch level or at the FeedParser level, since it follows redirects... Is there a way to detect that the redirect fails?
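With redirect: 'manual', a 3xx response is handed back as-is, so a check like res.status != 200 rejects every feed that redirects, which may be why the otherwise fine URLs started failing. A minimal sketch of following redirects by hand with a hop limit is shown below. Both function names are hypothetical, and fetchImpl is assumed to behave like node-fetch (a status property and headers.get()). Note that the wget log above shows the 301 being followed to a 200, so this would only confirm that the redirect itself went through.

```javascript
// Statuses treated as redirects in this sketch.
const REDIRECT_STATUSES = [301, 302, 303, 307, 308];

// Hypothetical helper: given the status and Location header of a response
// fetched with redirect: 'manual', return the absolute URL to request next,
// or null when the response is not a redirect.
function nextRedirectUrl(currentUrl, status, location) {
  if (!REDIRECT_STATUSES.includes(status)) return null;
  if (!location) {
    throw new Error('Redirect (' + status + ') without a Location header');
  }
  // Location may be relative, so resolve it against the current URL.
  return new URL(location, currentUrl).href;
}

// Hypothetical helper: follow at most `maxHops` redirects, then give up.
// `fetchImpl` is assumed to be node-fetch or something with the same shape.
async function fetchFollowing(url, fetchImpl, maxHops) {
  for (let hop = 0; hop <= maxHops; hop++) {
    const res = await fetchImpl(url, { redirect: 'manual' });
    const next = nextRedirectUrl(url, res.status, res.headers.get('location'));
    if (next === null) return res; // final, non-redirect response
    url = next;
  }
  throw new Error('Too many redirects');
}
```

A rejected promise from fetchFollowing then means the redirect chain is broken or too long, while a resolved one can go through the usual status check.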
Sorry, but this is not an issue with Feedparser. At first glance, it just looks like the remote server is incorrectly configured. You'll have to decide how you want to manage that.
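One possible way to manage it on the caller's side: the wget log above fails while reading the body ("Read error at byte 12383"), after the redirect has already succeeded, and a plain .pipe() does not forward errors from the source stream to the destination. The sketch below uses Node's built-in stream.pipeline so that a failure on either stream reaches a single callback. pipeWithErrors is a hypothetical name, and whether node-fetch surfaces a truncated body as a stream error for this particular server is an assumption that would need checking.

```javascript
const { pipeline } = require('stream');

// Hypothetical helper: pipe `source` into `parser` and report the outcome
// through one callback. stream.pipeline destroys both streams on failure
// and calls back with the error; on success it calls back once the parser
// has consumed all input, and `err` is then normalised to null.
function pipeWithErrors(source, parser, callback) {
  pipeline(source, parser, function (err) {
    callback(err || null);
  });
  return parser;
}

// Usage sketch inside getFeed, replacing responseStream.pipe(feedparser):
// pipeWithErrors(responseStream, feedparser, function (err) {
//   if (err) callback(err);
// });
```

The items would still be collected through the parser's 'readable' and 'end' handlers; the callback here only adds a place where a broken download gets reported.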