Incorrect handling of HTTP redirect responses
Malesio opened this issue · 2 comments
The feed validator class fails to validate RSS files because the W3C website has dropped support for plain HTTP requests:
POSTing any data to http://validator.w3.org/feed/check.cgi results in a 301 Moved Permanently
(even when sending a valid RSS feed file):
➜ ~ cat sample_feed.rss
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
<channel>
<title>W3Schools Home Page</title>
<link>https://www.w3schools.com</link>
<description>Free web building tutorials</description>
<item>
<title>RSS Tutorial</title>
<link>https://www.w3schools.com/xml/xml_rss.asp</link>
<description>New RSS tutorial on W3Schools</description>
</item>
<item>
<title>XML Tutorial</title>
<link>https://www.w3schools.com/xml</link>
<description>New XML tutorial on W3Schools</description>
</item>
</channel>
</rss>
➜ ~ gem list w3c
*** LOCAL GEMS ***
w3c_validators (1.3.6)
➜ ~ irb
irb(main):001:0> require 'w3c_validators'
=> true
irb(main):002:0> v = W3CValidators::FeedValidator.new
=> #<W3CValidators::FeedValidator:0x000055c69221b598 @validator_uri=#<URI::HTTP http://validator.w3.org/feed/check.cgi>, @options={:proxy_host=>nil, :proxy_po...
irb(main):003:0> v.validate_file("sample_feed.rss")
Traceback (most recent call last):
11: from /usr/bin/irb:23:in `<main>'
10: from /usr/bin/irb:23:in `load'
9: from /usr/lib/ruby/gems/2.7.0/gems/irb-1.2.6/exe/irb:11:in `<top (required)>'
8: from (irb):3
7: from /var/lib/gems/2.7.0/gems/w3c_validators-1.3.6/lib/w3c_validators/feed_validator.rb:48:in `validate_file'
6: from /var/lib/gems/2.7.0/gems/w3c_validators-1.3.6/lib/w3c_validators/feed_validator.rb:33:in `validate_text'
5: from /var/lib/gems/2.7.0/gems/w3c_validators-1.3.6/lib/w3c_validators/feed_validator.rb:59:in `validate'
4: from /var/lib/gems/2.7.0/gems/w3c_validators-1.3.6/lib/w3c_validators/validator.rb:110:in `send_request'
3: from /var/lib/gems/2.7.0/gems/w3c_validators-1.3.6/lib/w3c_validators/validator.rb:113:in `send_request'
2: from /usr/lib/ruby/2.7.0/net/http/response.rb:133:in `value'
1: from /usr/lib/ruby/2.7.0/net/http/response.rb:124:in `error!'
Net::HTTPRetriableError (301 "Moved Permanently")
Upon further inspection, it became clear that the method send_request is at fault:
options[:url], which contains the new URI, is never used by the following call to send_request: it only uses @validator_uri, which still contains the old URI subject to redirection. A simple fix would be to replace this line with something like @validator_uri = URI.parse(response['location']) to update the instance variable.
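The suggested fix can be illustrated with a minimal, network-free sketch. It is not the gem's actual code: FakeResponse and redirect_target are hypothetical names introduced here, and the HTTPS Location value is an assumption about what the server returns. The point is only that the URI used for the retry must come from the Location header.

```ruby
require 'uri'

# Hypothetical stand-in for an HTTP response: a status code plus headers.
FakeResponse = Struct.new(:code, :headers) do
  def [](name)
    headers[name]
  end
end

# Hypothetical helper: returns the URI to retry against, or nil when the
# response is not a redirect carrying a Location header.
def redirect_target(current_uri, response)
  return nil unless response.code.to_s.start_with?('3') && response['location']

  # URI.join also copes with a relative Location value.
  URI.join(current_uri.to_s, response['location'])
end

validator_uri = URI.parse('http://validator.w3.org/feed/check.cgi')
# Assumed Location header; the real value is whatever the server sends.
response = FakeResponse.new('301', 'location' => 'https://validator.w3.org/feed/check.cgi')

target = redirect_target(validator_uri, response)
validator_uri = target if target # update the URI before retrying the request
puts validator_uri
```

Resolving the header with URI.join instead of URI.parse is a design choice of this sketch: it tolerates servers that send a relative Location.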
I also noticed a variable that is probably poorly named:
semantically, not following_redirect should become just following_redirect to actually follow redirects. The default boolean value used in the method signature would also need to become true to default to following redirects, if that is the desired behaviour.
@Malesio thanks for reporting this bug and for your analysis.
Indeed, the issue is caused by the feed validator now being accessible only through HTTPS, not HTTP.
I will publish a new version with the fix.
Regarding your last comment: following_redirect is just there to indicate whether the current call results from a redirect (to avoid a kind of infinite loop).
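To illustrate the role of such a flag, here is a small sketch under stated assumptions. It is not the gem's send_request: send_request_sketch, ALWAYS_REDIRECT, and the hash-based responses are hypothetical, and the transport is a lambda rather than a real HTTP call. It shows how a flag marking "this call already follows a redirect" caps the recursion at one hop even if the server keeps redirecting.

```ruby
# Hypothetical transport: always answers with a redirect back to the same
# URL, standing in for a server that would otherwise cause an endless loop.
ALWAYS_REDIRECT = ->(url) { { code: 301, location: url } }

# Sketch of a guarded request. following_redirect records whether this call
# was itself triggered by a redirect, so at most one hop is followed.
# calls collects every URL requested, for inspection.
def send_request_sketch(url, transport, following_redirect = false, calls = [])
  calls << url
  response = transport.call(url)
  if response[:code] == 301 && !following_redirect
    # First redirect: retry once against the new location, with the flag set.
    return send_request_sketch(response[:location], transport, true, calls)
  end
  [response, calls]
end

_response, calls = send_request_sketch('http://example.org/', ALWAYS_REDIRECT)
puts calls.length # 2: the original request plus exactly one followed redirect
```

With this reading, the default of false means "not yet inside a redirect", not "do not follow redirects".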
Version 1.3.7 published on Rubygems.org