CI check - check external links for validity
nikophil opened this issue · 3 comments
This issue refers to point 2 of issue #10.
Ability to check external links for validity (2xx status code). Sphinx has this ability.
@weaverryan Don't you think this will slow the process down a lot? BTW, shouldn't the core parser be the one to do this task?
I've run some tests:
Without link status code check, execution time = 37s
With link status code check, execution time = 643s
Another thing, here is the list of all URLs with a bad status code:
Some of them are a bit confusing, because I don't get the same result when I visit the website in my browser (e.g. https://flex.symfony.com or http://redis.io/).
Some others are not meant to work (e.g. http://localhost:8000/product or http://localhost:8000/lucky/number).
Here is the code I'm using to test that:
if ($this->isExternalUrl($url)) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HEADER, true);
    // CURLOPT_NOBODY makes cURL send a HEAD request instead of a GET
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // give up after 2 seconds in total
    curl_setopt($ch, CURLOPT_TIMEOUT, 2);
    curl_exec($ch);
    $httpCode = (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    curl_close($ch);

    // report anything outside the 2xx range
    if ($httpCode < 200 || $httpCode >= 300) {
        dump("$url : $httpCode");
    }
}
@weaverryan Don't you think this will slow the process down a lot? BTW, shouldn't the core parser be the one to do this task?
100% absolutely :). I should have given more background about how this is handled with Sphinx - they have a separate "command" for it - linkcheck. So, we run it "every now and then" to check our links - but it's definitely not part of the main process.
Another thing, here is the list of all URLs with a bad status code:
Some of them are a bit confusing, because I don't get the same result when I visit the website in my browser (e.g. https://flex.symfony.com or http://redis.io/).
flex.symfony.com is a 405 - that makes me wonder if your cURL code is making something other than a GET request. I think it's ok to bring in Guzzle or some other library to do this - it would only be a dev-dependency, as we would only need it in CI or when we want to run this locally.
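For what it's worth, CURLOPT_NOBODY switches cURL to a HEAD request, which some servers reject with a 405. Below is a minimal sketch, not the project's actual code, of one way to handle that with plain cURL: try HEAD first and retry with GET when the server answers 405 or 501. The function names (fetchStatus, checkUrl, isSuccess), the 5-second timeout, and the choice of retry codes are all assumptions made for illustration.

```php
<?php

/**
 * Send one request with cURL and return the last HTTP status code received
 * (0 when no response arrived at all, e.g. DNS failure or early timeout).
 * Hypothetical helper, not part of the parser.
 */
function fetchStatus(string $url, bool $headOnly): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY => $headOnly,     // true sends HEAD, false sends GET
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_RETURNTRANSFER => true,  // do not echo the response
        CURLOPT_TIMEOUT => 5,            // assumed value, in seconds
    ]);
    curl_exec($ch);

    return (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
}

/**
 * Try HEAD first, then fall back to GET if the server refuses HEAD.
 * $fetch can be swapped out so the logic is testable without network access.
 */
function checkUrl(string $url, ?callable $fetch = null): int
{
    $fetch = $fetch ?? 'fetchStatus';

    $code = $fetch($url, true);
    if ($code === 405 || $code === 501) {
        $code = $fetch($url, false);
    }

    return $code;
}

/** Same rule as in the snippet above: only 2xx counts as valid. */
function isSuccess(int $code): bool
{
    return $code >= 200 && $code < 300;
}
```

Usage would look like `if (!isSuccess(checkUrl($url))) { ... }`. The GET fallback downloads the whole body, so it is slower than HEAD and is only used when needed.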
Some others are not meant to work (e.g. http://localhost:8000/product or http://localhost:8000/lucky/number).
Yea, I wonder how something like http://symfony.com/schema/dic/services is being seen as a link? I'm guessing this is just embedded in some XML attribute? How are we finding the links?
I'm not sure, but I think the parser automatically converts URLs into links, perhaps with a regex...
Example: http://%env(HOST)%/project
The only occurrence of this is in a .. configuration-block::, and if we check the generated HTML, there is an <a> inside the code block.
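Whatever the parser does, the checker could skip URLs that are obviously examples before making any request. A rough sketch, assuming a hypothetical helper named shouldCheckUrl and skip rules guessed from the URLs mentioned in this thread (local development hosts and hosts containing a %...% placeholder):

```php
<?php

/**
 * Decide whether a URL found in the docs is worth an HTTP check.
 * Hypothetical helper; the skip rules are assumptions, not project policy.
 */
function shouldCheckUrl(string $url): bool
{
    // Only absolute http(s) URLs are candidates; capture the authority part.
    if (!preg_match('~^https?://([^/?#]+)~i', $url, $matches)) {
        return false;
    }

    // Drop an optional ":port" suffix to get the bare host.
    $host = strtolower(preg_replace('~:\d+$~', '', $matches[1]));

    // A "%" in the host means an unresolved placeholder such as %env(HOST)%.
    if (strpos($host, '%') !== false) {
        return false;
    }

    // Local development addresses are examples, not real links.
    return !in_array($host, ['localhost', '127.0.0.1'], true);
}
```

This does not catch XML namespace URLs such as http://symfony.com/schema/dic/services, since those look like ordinary links. Excluding them would probably require not turning URLs inside code blocks into links in the first place.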
Closing this one in favor of https://github.com/weaverryan/docs-builder/issues/8