(votes, committee_meetings): senate.gov and clerk.house.gov not redirecting to https
ryparker opened this issue · 0 comments
Problem
When running the votes or committee_meetings tasks, the download requests to senate.gov and clerk.house.gov are not redirected to https, so they time out or fail.
Cause
Requests to http://senate.gov or http://clerk.house.gov respond with a 301 redirecting to the https:// version, and it looks like the request library (scrapelib) is not configured to follow redirects.
PR here: #285
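To make the suspected cause concrete, here is a minimal sketch of what "following redirects" means at the HTTP level. It uses only the Python standard library (`http.client` does not follow redirects on its own), not scrapelib, and it is not the change made in the PR. The function name `fetch_following_redirects` and its parameters are hypothetical, chosen for illustration.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Status codes treated as redirects in this sketch.
REDIRECT_STATUSES = (301, 302, 303, 307, 308)


def fetch_following_redirects(url, max_redirects=5, timeout=10):
    """Hypothetical helper: GET `url`, following Location headers by hand.

    Returns (final_url, status, body). Raises RuntimeError if the chain
    is longer than `max_redirects`.
    """
    for _ in range(max_redirects + 1):
        parts = urlsplit(url)
        # Pick the connection class from the scheme so that an
        # http -> https redirect switches to TLS on the next hop.
        conn_cls = (
            http.client.HTTPSConnection
            if parts.scheme == "https"
            else http.client.HTTPConnection
        )
        conn = conn_cls(parts.netloc, timeout=timeout)
        try:
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            conn.request("GET", path)
            resp = conn.getresponse()
            status = resp.status
            location = resp.getheader("Location")
            body = resp.read()
        finally:
            conn.close()

        if status in REDIRECT_STATUSES and location:
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, location)
            continue
        return url, status, body

    raise RuntimeError(f"more than {max_redirects} redirects")
```

A client that stops at the first response would only ever see the 301 with its empty or placeholder body. Calling something like `fetch_following_redirects("http://clerk.house.gov")` would instead continue to whatever URL the `Location` header names.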
Reproduce
Run the votes or committee_meetings tasks, or verify the redirect directly:
$ curl -i http://senate.gov
HTTP/1.1 301 Moved Permanently
Server: AkamaiGHost
Content-Length: 0
Location: http://www.senate.gov/
Date: Thu, 19 May 2022 03:14:17 GMT
Connection: keep-alive
$ curl -i http://clerk.house.gov
HTTP/1.1 301 Moved Permanently
Content-Type: text/html; charset=UTF-8
Location: https://clerk.house.gov/
Vary: Accept-Encoding, Cookie
X-Xss-Protection: 1;mode=block
Strict-Transport-Security: max-age=0;
X-Frame-Options: DENY
X-Content-Type-Options: nosniff
X-Permitted-Cross-Domain-Policies: none
Referrer-Policy: no-referrer
Content-Security-Policy: …
Date: Thu, 19 May 2022 03:15:26 GMT
Content-Length: 147
<head><title>Document Moved</title></head>
<body><h1>Object Moved</h1>This document may be found <a HREF="https://clerk.house.gov/">here</a></body>