Reader::findFeedLinks() failing with protocol-relative URLs
I've noticed a small bug in the Zend\Feed\Reader\Reader::findFeedLinks method:
When a website provides its feed URL as a protocol-relative URL, the final URI returned by the Reader\FeedSet::absolutiseUri method is in an invalid format.
Example:
The Engadget website's feed URL is //www.engadget.com/rss.xml (it starts with // rather than http://). When the Reader module prepares it for use, the URL returned is http://www.engadget.com/www.engadget.com/rss.xml, and any attempt to read data from that URL fails.
<?php
$feedLinks = Zend\Feed\Reader\Reader::findFeedLinks('http://www.engadget.com');
echo $feedLinks->rss;
// Output: http://www.engadget.com/www.engadget.com/rss.xml
For now, I solved the problem by checking whether the link starts with // in Reader\FeedSet::absolutiseUri and prepending the protocol if it is missing:
src/Reader/FeedSet.php:65
...
    protected function absolutiseUri($link, $uri = null)
    {
        /*
         * Fix: check if $link is a protocol-relative URL and prepend protocol to it
         */
        if (substr($link, 0, 2) == '//') {
            $link = 'http:' . $link;
        }
        $linkUri = Uri::factory($link);
        if (!$linkUri->isAbsolute() or !$linkUri->isValid()) {
            if ($uri !== null) {
                $uri = Uri::factory($uri);
...
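One possible refinement of the fix above, as a minimal sketch only: instead of always prepending http:, the scheme could be taken from the page URI when one is available, so that a feed discovered on an https page keeps https. The function name resolveProtocolRelative and its parameters are hypothetical and not part of Zend\Feed; the sketch relies only on PHP's built-in parse_url and assumes PHP 7.1 or later for the nullable type hint.

```php
<?php
// Hypothetical standalone helper, not part of Zend\Feed: gives a
// protocol-relative link ("//host/path") an explicit scheme.
function resolveProtocolRelative(string $link, ?string $baseUri = null): string
{
    // Only links that start with "//" are protocol-relative; leave others alone.
    if (strpos($link, '//') !== 0) {
        return $link;
    }

    // Reuse the scheme of the base URI when it has one.
    $scheme = $baseUri !== null ? parse_url($baseUri, PHP_URL_SCHEME) : null;

    // parse_url returns null or false when no scheme can be extracted;
    // fall back to http in that case (an assumption of this sketch).
    if (!is_string($scheme) || $scheme === '') {
        $scheme = 'http';
    }

    return $scheme . ':' . $link;
}

echo resolveProtocolRelative('//www.engadget.com/rss.xml', 'https://www.engadget.com'), "\n";
// Output: https://www.engadget.com/rss.xml
```

With no base URI, the helper behaves like the fix proposed above and returns http://www.engadget.com/rss.xml for the Engadget link.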
Can you confirm this bug? Once confirmed, I can open a PR with this fix.
Thanks!