weaverryan/symfony-docs

CI check - check external link for validity

nikophil opened this issue · 3 comments

This issue refers to item 2 of issue #10.

Ability to check external links for validity (2xx status code). Sphinx has this ability.

@weaverryan Don't you think this will slow the process down a lot? BTW, shouldn't the core parser be the one doing this task?

I ran some tests:
Without the link status code check, execution time = 37s
With the link status code check, execution time = 643s

Another thing: below is the list of all URLs that returned a bad status code.
Some of them are a bit confusing, because I don't get the same result when I visit the website in my browser (e.g. https://flex.symfony.com or http://redis.io/).
Some others are not meant to work at all (e.g. http://localhost:8000/product or http://localhost:8000/lucky/number).

Here is the code I'm using to test this:

if ($this->isExternalUrl($url)) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HEADER, true);
    // CURLOPT_NOBODY makes cURL send a HEAD request instead of a GET.
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Give up after 2 seconds.
    curl_setopt($ch, CURLOPT_TIMEOUT, 2);
    curl_exec($ch);
    // The code is 0 when no HTTP response was received.
    $httpCode = (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    curl_close($ch);

    // Report anything that is not a 2xx.
    if ($httpCode < 200 || $httpCode >= 300) {
        dump("$url : $httpCode");
    }
}
URL, followed by status code:
https://flex.symfony.com 405
https://api.symfony.com/4.0/Symfony/Component/Console/Output/ConsoleSectionOutput.html 404
http://localhost:8000/product 0
http://localhost:8000/product/1 0
http://redis.io/ 404
https://tools.ietf.org/html/rfc7234 403
http://www.mnot.net/cache_docs/ 301
https://tools.ietf.org/html/rfc7234 403
https://tools.ietf.org/html/rfc7232 403
https://tools.ietf.org/html/draft-ietf-httpbis-p2-semantics-20#section-2.3.4 403
https://api.symfony.com/4.0/Symfony/Component/Messenger/Handler/MessageHandlerInterface.html 404
https://api.symfony.com/4.0/Symfony/Component/Messenger/Handler/MessageSubscriberInterface.html 404
https://api.symfony.com/4.0/Symfony/Component/DependencyInjection/ParameterBag/ContainerBagInterface.html 404
https://tools.ietf.org/html/rfc7540#section-8.2 403
https://tools.ietf.org/html/rfc5322 403
http://docs.doctrine-project.org/projects/doctrine-common/en/latest/reference/caching.html 404
http://stumptownsyndicate.org/code-of-conduct/reporting-guidelines/ 500
http://sphinx-doc.org/markup/ 404
http://tools.ietf.org/html/rfc2606#section-3 403
https://tools.ietf.org/wg/httpbis/ 403
https://drupal.org/ 403
http://rack.rubyforge.org/ 0
http://localhost:8000/lucky/number 0
https://flex.symfony.com 405
http://localhost:8000/lucky/number 0
http://localhost:8000/lucky/number/100 0
https://tools.ietf.org/html/rfc3339 403
https://tools.ietf.org/html/rfc3339 403
http://docs.doctrine-project.org/en/latest/reference/unitofwork-associations.html 404
https://api.symfony.com/4.0/Symfony/Component/HttpFoundation/File/Exception/CannotWriteFileException.html 404
https://api.symfony.com/4.0/Symfony/Component/HttpFoundation/File/Exception/ExtensionFileException.html 404
https://api.symfony.com/4.0/Symfony/Component/HttpFoundation/File/Exception/FormSizeFileException.html 404
https://api.symfony.com/4.0/Symfony/Component/HttpFoundation/File/Exception/IniSizeFileException.html 404
https://api.symfony.com/4.0/Symfony/Component/HttpFoundation/File/Exception/NoFileException.html 404
https://api.symfony.com/4.0/Symfony/Component/HttpFoundation/File/Exception/NoTmpDirFileException.html 404
https://api.symfony.com/4.0/Symfony/Component/HttpFoundation/File/Exception/PartialFileException.html 404
http://192.168.1.1:8080 0
moment/moment#2373 0
http://tools.ietf.org/html/rfc2616#section-13.10 403
http://tools.ietf.org/html/rfc2616#section-13.2 403
https://tools.ietf.org/html/rfc7234#section-4.2.1 403
https://tools.ietf.org/html/rfc2616#section-13.2 403
https://api.symfony.com/4.0/Symfony/Component/Messenger/Middleware/LoggingMiddleware.html 404
https://api.symfony.com/4.0/Symfony/Component/Messenger/Asynchronous/Middleware/SendMessageMiddleware.html 404
https://api.symfony.com/4.0/Symfony/Component/Messenger/Middleware/HandleMessageMiddleware.html 404
https://api.symfony.com/4.0/Symfony/Component/Messenger/Middleware/MiddlewareInterface.html 404
https://api.symfony.com/4.0/Symfony/Component/Messenger/Envelope.html 404
https://api.symfony.com/4.0/Symfony/Component/Messenger/Transport/Serialization/SerializerConfiguration.html 404
https://api.symfony.com/4.0/Symfony/Component/Messenger/Middleware/Configuration/ValidationConfiguration.html 404
https://api.symfony.com/4.0/Symfony/Component/Messenger/Asynchronous/Transport/ReceivedMessage.html 404
https://api.symfony.com/4.0/Symfony/Component/Messenger/EnvelopeAwareInterface.html 404
https://api.symfony.com/4.0/Symfony/Component/Messenger/EnvelopeAwareInterface.html 404
https://api.symfony.com/4.0/Symfony/Component/Messenger/Transport/Serialization/Serializer.html 404
https://api.symfony.com/4.0/Symfony/Component/Messenger/Transport/SenderInterface.html 404
https://api.symfony.com/4.0/Symfony/Component/Messenger/Asynchronous/Transport/ReceivedMessage.html 404
https://api.symfony.com/4.0/Symfony/Component/Messenger/Asynchronous/Middleware/SendMessageMiddleware.html 404
http://localhost:8000/lucky/number/100 0
http://localhost:8000/lucky/number 0
https://flex.symfony.com 405
http://localhost:8000/lucky/number 0
https://tools.ietf.org/html/rfc3339#section-5.8 403
https://tools.ietf.org/html/rfc7807 403
https://tools.ietf.org/html/rfc4180 403
https://api.symfony.com/4.0/Symfony/Component/Serializer/Mapping/ClassDiscriminatorResolverInterface.html 404
https://api.symfony.com/4.0/Symfony/Component/Serializer/Mapping/ClassDiscriminatorFromClassMetadata.html 404
https://api.symfony.com/4.0/Symfony/Component/Serializer/Normalizer/CacheableSupportsMethodInterface.html 404
https://api.symfony.com/4.0/Symfony/Component/Serializer/Normalizer/CacheableSupportsMethodInterface.html#method_hasCacheableSupportsMethod 404
https://api.symfony.com/4.0/Symfony/Component/Serializer/Normalizers/NormalizerInterface.html 404
https://api.symfony.com/4.0/Symfony/Component/Serializer/Normalizers/DenormalizerInterface.html 404
http://tools.ietf.org/html/rfc4122 403
http://php.net/manual/en/book.image.php 0
https://tools.ietf.org/html/rfc5988 403
https://tools.ietf.org/html/rfc7540#section-8.2 403
https://api.symfony.com/4.0/Symfony/Component/Workflow/Exception/TransitionException.html 404
https://api.symfony.com/4.0/Symfony/Component/HttpFoundation/Session/Storage/Handler/MigratingSessionHandler.html 404
https://api.symfony.com/4.0/Symfony/Component/HttpFoundation/Session/Storage/Handler/RedisSessionHandler.html 404
https://api.symfony.com/4.0/Symfony/Component/HttpFoundation/Session/Storage/Handler/MigratingSessionHandler.html 404
https://api.symfony.com/4.0/Symfony/Component/VarDumper/Dumper/ServerDumper.html 404
https://api.symfony.com/4.0/Symfony/Component/VarDumper/Dumper/ServerDumper.html 404
https://api.symfony.com/4.0/Symfony/Component/Cache/ChainAdapter.html#method_prune 404
https://redis.io/ 404
https://redis.io/ 404
http://php.net/manual/en/class.pdo.php 0
https://physics.nist.gov/cuu/Units/binary.html 0
https://api.symfony.com/4.0/Symfony/Component/HttpFoundation/HeaderUtils.html 404
https://api.symfony.com/4.0/Symfony/Bridge/PsrHttpMessage/HttpMessageFactoryInterface.html 404
https://api.symfony.com/4.0/Symfony/Bridge/PsrHttpMessage/HttpFoundationFactoryInterface.html 404
https://secure.php.net/manual/en/function.normalizer-is-normalized.php 404
http://symfony.com/schema/dic/services 404
http://symfony.com/schema/dic/symfony 404
http://symfony.com/schema/dic/services 404
http://symfony.com/schema/dic/symfony 404
http://symfony.com/schema/dic/services 404
http://symfony.com/schema/dic/services 404
http://symfony.com/schema/dic/symfony 404
http://%env(HOST)%/project 0
http://symfony.com/schema/dic/services 404
http://%env(HOST)%/project 0
http://%env(HOST)%/project 0
http://symfony.com/schema/dic/services 404
http://symfony.com/schema/dic/symfony 404
http://datatracker.ietf.org/wg/httpbis/ 302
http://localhost:8000/random/10 0
http://localhost:8000/random/10 0
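
Many entries in the list above are duplicates, local development URLs, or placeholders that can never resolve. As a rough sketch (the function names `isCheckableUrl` and `filterCheckableUrls` are hypothetical and not part of the parser), the candidates could be filtered and de-duplicated before any HTTP request is made. The exact skip rules are assumptions and would need to be tuned for the real docs:

```php
<?php

/**
 * Decides whether a URL found in the docs is worth an HTTP check.
 * Hypothetical helper; the skip rules below are assumptions.
 */
function isCheckableUrl(string $url): bool
{
    // Only absolute http(s) URLs can be checked (skips "moment/moment#2373").
    if (!preg_match('#^https?://#i', $url)) {
        return false;
    }

    // Container placeholders such as http://%env(HOST)%/project are not real hosts.
    if (strpos($url, '%env(') !== false) {
        return false;
    }

    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return false;
    }

    // Local development servers are examples, not real links.
    if (strtolower($host) === 'localhost') {
        return false;
    }

    // Skip literal IPs in private or reserved ranges (e.g. 192.168.1.1).
    if (filter_var($host, FILTER_VALIDATE_IP) !== false
        && filter_var($host, FILTER_VALIDATE_IP, FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE) === false
    ) {
        return false;
    }

    return true;
}

/**
 * Removes duplicates and non-checkable URLs, so each link is requested once.
 */
function filterCheckableUrls(array $urls): array
{
    return array_values(array_filter(array_unique($urls), 'isCheckableUrl'));
}
```

Checking each distinct URL only once should also reduce the run time, since the same link appears several times in the list.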

> @weaverryan Don't you think this will slow the process down a lot? BTW, shouldn't the core parser be the one doing this task?

100% absolutely :). I should have given more background about how this is handled with Sphinx: they have a separate "command" for it, linkcheck. So we run it "every now and then" to check our links, but it is definitely not part of the main process.

> Another thing: below is the list of all URLs that returned a bad status code.
> Some of them are a bit confusing, because I don't get the same result when I visit the website in my browser (e.g. https://flex.symfony.com or http://redis.io/).

flex.symfony.com is a 405, which makes me wonder if your cURL code is making something other than a GET request. I think it's OK to bring in Guzzle or some other library to do this. It would only be a dev dependency, as we would only need it in CI or when we want to run this locally.
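
For reference, the snippet earlier in this thread sets `CURLOPT_NOBODY`, which makes cURL send a HEAD request, and that may be why some servers answer with a 405. A minimal sketch of one way around it, using plain cURL rather than Guzzle: try HEAD first and retry with GET when the server rejects HEAD. All function names here are hypothetical, and treating 501 like 405 is an assumption:

```php
<?php

/**
 * Performs one request and returns the HTTP status code
 * (0 if no response was received). Hypothetical helper.
 */
function fetchStatusCode(string $url, string $method): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY => $method === 'HEAD', // true => HEAD, false => GET
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT => 10, // assumed value; the original snippet used 2
    ]);
    curl_exec($ch);
    $code = (int) curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    curl_close($ch);

    return $code;
}

/**
 * A HEAD request answered with "method not allowed" or "not implemented"
 * says nothing about the link itself, so it is worth retrying with GET.
 */
function shouldRetryWithGet(int $headStatus): bool
{
    return in_array($headStatus, [405, 501], true);
}

/**
 * Checks a link with HEAD first and falls back to GET when needed.
 * $fetch can be swapped out (for example in tests); it defaults to cURL.
 */
function checkLink(string $url, ?callable $fetch = null): int
{
    $fetch = $fetch ?? 'fetchStatusCode';

    $code = $fetch($url, 'HEAD');
    if (shouldRetryWithGet($code)) {
        $code = $fetch($url, 'GET');
    }

    return $code;
}
```

HEAD keeps most checks cheap because no body is transferred, and the GET fallback only costs extra for the servers that refuse HEAD.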

> Some others are not meant to work at all (e.g. http://localhost:8000/product or http://localhost:8000/lucky/number).

Yeah, I wonder how something like http://symfony.com/schema/dic/services is being seen as a link. I'm guessing this is just embedded in some XML attribute? How are we finding the links?

I'm not sure, but I think the parser automatically converts URLs into links, perhaps with a regex...

Example: http://%env(HOST)%/project
The only occurrence of this is in a .. configuration-block:: directive, and if we check the generated HTML, there is an <a> inside the code block.
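
If the link checker works on the generated HTML, one option is to ignore any `<a>` that sits inside a code block. The sketch below assumes code blocks are rendered as `<pre>` elements, which may not match the parser's real output, and `externalLinksOutsideCode` is a hypothetical name:

```php
<?php

/**
 * Collects absolute http(s) links from generated HTML, skipping
 * anchors nested inside <pre> (assumed to wrap code blocks).
 */
function externalLinksOutsideCode(string $html): array
{
    // Silence libxml warnings about HTML5 tags or fragments.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query(
        '//a[(starts-with(@href, "http://") or starts-with(@href, "https://"))'
        . ' and not(ancestor::pre)]'
    );

    $urls = [];
    foreach ($nodes as $anchor) {
        $urls[] = $anchor->getAttribute('href');
    }

    return $urls;
}
```

With this approach, a URL such as http://%env(HOST)%/project that only appears in a configuration example would never reach the HTTP check.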