minkphp/MinkGoutteDriver

GoutteDriver can't find any elements

kwisatz opened this issue · 5 comments

Hi, I'm seeing a very strange behavior that I can't quite understand. Goutte (1.0.9) seems to be unable to find any elements on a page. I tried with the Selenium driver, just to make sure it was not related to my markup:

  @javascript
  Scenario: Navigating to the webapp page                                 # features/01-navigating.feature:61
    Given I am on the homepage                                            # FeatureContext::iAmOnHomepage()
    When I follow "Web Applications"                                      # FeatureContext::clickLink()
    And the "li.subpage.active" element should contain "Web Applications" # FeatureContext::assertElementContains()

  Scenario: Navigating to the webapp page                                  # features/01-navigating.feature:66
    Given I am on the homepage                                             # FeatureContext::iAmOnHomepage()
    When I follow "Web Applications"                                       # FeatureContext::clickLink()
    Then the "li.subpage.active" element should contain "Web Applications" # FeatureContext::assertElementContains()
      Element matching css "li.subpage.active" not found.

      +--[ HTTP/1.1 200 | http://1024.local/services/web-application-development | GoutteDriver ]
      |
      |  <body>
      |  
      |          <!-- Header -->
      |          <header>
      |              <div class="container">
      |                  <div class="header row">
      |                      <div class="logo span4">
      |                          <h2><a href="/"><strong>Logo</strong><span class="light">.domain</span></a></h2>
      |                      </div>
      |                      <div class="span8">
      |                          <nav id="primary">
      |                              <ul class="nav nav-list pull-right">
      |                                                                  <li class="section-1">
      |                                      <a href="#ubiquitous">
      |                                                                                  <i class="icon-globe"></i>
      |                                                                          Ubiquitous</a>
      |                                  </li>
      |                                                                  <li class="section-2">
      |                                      <a href="#useability">
      |                            ...
      |
    And the response status code should be 200                             # FeatureContext::assertResponseStatus()

2 scenarios (1 passed, 1 failed)
7 steps (5 passed, 1 skipped, 1 failed)
0m3.604s

One can see that Selenium finds the element just fine, but Goutte doesn't. I've tried with other elements (div.container, for example), but that doesn't work either. Goutte reports that it can't find the element, even though the element is listed in the debug output.

Any help greatly appreciated.

Ok, I'm making some progress in understanding why this is happening, and it turns out, another issue of mine is actually related:
When using FeatureContext::assertPageContainsText(), I get the following exception:

 exception 'InvalidArgumentException' with message 'The current node list is empty.' in behat/vendor/symfony/dom-crawler/Symfony/Component/DomCrawler/Crawler.php:483
      Stack trace:
      #0 behat/vendor/behat/mink-browserkit-driver/src/Behat/Mink/Driver/BrowserKitDriver.php(348): Symfony\Component\DomCrawler\Crawler->text()
      #1 behat/vendor/behat/mink/src/Behat/Mink/Element/Element.php(101): Behat\Mink\Driver\BrowserKitDriver->getText('//html')
      #2 behat/vendor/behat/mink/src/Behat/Mink/WebAssert.php(170): Behat\Mink\Element\Element->getText()
      #3 behat/vendor/behat/mink-extension/src/Behat/MinkExtension/Context/MinkContext.php(248): Behat\Mink\WebAssert->pageTextContains('Web Application...')
      #4 [internal function]: Behat\MinkExtension\Context\MinkContext->assertPageContainsText('Web Application...')
      #5 behat/vendor/behat/behat/src/Behat/Behat/Definition/Annotation/Definition.php(155): call_user_func_array(Array, Array)
      #6 behat/vendor/behat/behat/src/Behat/Behat/Tester/StepTester.php(157): Behat\Behat\Definition\Annotation\Definition->run(Object(FeatureContext))
      #7 behat/vendor/behat/behat/src/Behat/Behat/Tester/StepTester.php(126): Behat\Behat\Tester\StepTester->executeStepDefinition(Object(Behat\Gherkin\Node\ExampleStepNode), Object(Behat\Behat\Definition\Annotation\Then))
      #8 behat/vendor/behat/behat/src/Behat/Behat/Tester/StepTester.php(95): Behat\Behat\Tester\StepTester->executeStep(Object(Behat\Gherkin\Node\ExampleStepNode))
      #9 behat/vendor/behat/gherkin/src/Behat/Gherkin/Node/AbstractNode.php(42): Behat\Behat\Tester\StepTester->visit(Object(Behat\Gherkin\Node\ExampleStepNode))
      #10 behat/vendor/behat/behat/src/Behat/Behat/Tester/ScenarioTester.php(148): Behat\Gherkin\Node\AbstractNode->accept(Object(Behat\Behat\Tester\StepTester))
      #11 behat/vendor/behat/behat/src/Behat/Behat/Tester/OutlineTester.php(98): Behat\Behat\Tester\ScenarioTester->visitStep(Object(Behat\Gherkin\Node\StepNode), Object(Behat\Gherkin\Node\OutlineNode), Object(FeatureContext), Array, false)
      #12 behat/vendor/behat/behat/src/Behat/Behat/Tester/OutlineTester.php(56): Behat\Behat\Tester\OutlineTester->visitOutlineExample(Object(Behat\Gherkin\Node\OutlineNode), 0, Array)
      #13 behat/vendor/behat/gherkin/src/Behat/Gherkin/Node/AbstractNode.php(42): Behat\Behat\Tester\OutlineTester->visit(Object(Behat\Gherkin\Node\OutlineNode))
      #14 behat/vendor/behat/behat/src/Behat/Behat/Tester/FeatureTester.php(88): Behat\Gherkin\Node\AbstractNode->accept(Object(Behat\Behat\Tester\OutlineTester))
      #15 behat/vendor/behat/gherkin/src/Behat/Gherkin/Node/AbstractNode.php(42): Behat\Behat\Tester\FeatureTester->visit(Object(Behat\Gherkin\Node\FeatureNode))
      #16 behat/vendor/behat/behat/src/Behat/Behat/Console/Command/BehatCommand.php(150): Behat\Gherkin\Node\AbstractNode->accept(Object(Behat\Behat\Tester\FeatureTester))
      #17 behat/vendor/behat/behat/src/Behat/Behat/Console/Command/BehatCommand.php(128): Behat\Behat\Console\Command\BehatCommand->runFeatures(Object(Behat\Gherkin\Gherkin))
      #18 behat/vendor/symfony/console/Symfony/Component/Console/Command/Command.php(244): Behat\Behat\Console\Command\BehatCommand->execute(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
      #19 behat/vendor/symfony/console/Symfony/Component/Console/Application.php(899): Symfony\Component\Console\Command\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
      #20 behat/vendor/symfony/console/Symfony/Component/Console/Application.php(191): Symfony\Component\Console\Application->doRunCommand(Object(Behat\Behat\Console\Command\BehatCommand), Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
      #21 behat/vendor/behat/behat/src/Behat/Behat/Console/BehatApplication.php(68): Symfony\Component\Console\Application->doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
      #22 behat/vendor/symfony/console/Symfony/Component/Console/Application.php(121): Behat\Behat\Console\BehatApplication->doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
      #23 behat/vendor/behat/behat/bin/behat(32): Symfony\Component\Console\Application->run()
      #24 {main}

However, when I manually dump the page content, like so:

    /**
     * @Then /^I dump the contents$/
     */
    public function iDumpTheContents()
    {
        print_r($this->getSession()->getPage()->getContent());
        //or print_r($this->getSession()->getDriver()->getContent());
        // The following produces "The current node list is empty."
        //print_r($this->getSession()->getDriver()->getText('//html'));
    }

I can see that the correct page is indeed being loaded. Also, repeating the same scenario, but using the Selenium2 driver this time, it can find the texts just fine.

Also, when staying on the homepage instead of navigating away from it, it can find texts too.
The difference is that on the homepage the Crawler object is populated, while on other pages it is not. So I'm guessing there's an issue parsing the HTML somewhere:

    [crawler:protected] => Symfony\Component\DomCrawler\Crawler Object
        (
            [uri:protected] => http://1024.local/
            [storage:SplObjectStorage:private] => Array
                (
                    [000000000041e4fe000000001f92df01] => Array
                        (
                            [obj] => DOMElement Object
                                (
                                    [tagName] => html
                                    [schemaTypeInfo] => 
                                    [nodeName] => html
                                    [nodeValue] => ...

versus:

    [crawler:protected] => Symfony\Component\DomCrawler\Crawler Object
        (
            [uri:protected] => http://1024.local/services/web-application-development
            [storage:SplObjectStorage:private] => Array
                (
                )

        )

Turns out this really is a bug, although I'm not sure exactly where.
The issue is that the static page generator I was using created HTML pages without file extensions.

E.g. services/web-application-development

A browser displays these files just fine; however, Goutte or the Symfony BrowserKit seems unable to handle them.

When using services/web-application-development.htm instead, it all works out fine.

stof commented

I suspect that your webserver is not sending the content type as text/html when serving these pages.

I checked, and indeed, it is not sending a Content-Type header at all.
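For anyone who wants to run the same kind of check from PHP, here is a minimal sketch. The helper name findContentType() and the sample header values are made up for illustration; it is not part of Mink, Goutte or BrowserKit. It only scans a list of raw header lines, in the shape PHP's get_headers() returns, for a Content-Type entry.

```php
<?php
// Hypothetical helper, not part of Mink, Goutte or BrowserKit.
// Returns the Content-Type value from a list of raw header lines,
// or null when no such header is present.
function findContentType(array $headerLines): ?string
{
    $name = 'Content-Type:';
    foreach ($headerLines as $line) {
        // Header names are case-insensitive, so compare with stripos().
        if (is_string($line) && stripos($line, $name) === 0) {
            return trim(substr($line, strlen($name)));
        }
    }
    return null;
}

// Sample header lines; the values are invented for this example.
$withType = ['HTTP/1.1 200 OK', 'Content-Type: text/html; charset=UTF-8'];
$withoutType = ['HTTP/1.1 200 OK', 'Content-Length: 1234'];

var_dump(findContentType($withType));    // the header value
var_dump(findContentType($withoutType)); // NULL, the situation described above
```

Against a live server, the list can come from get_headers($url), which returns false on failure, so check for that before passing the result on.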

Now, my "work-around", which is probably the correct way anyway, is to have the generator create /index.htm files. That way I won't have to change any links.

stof commented

Closing as it is not a bug in the GoutteDriver. BrowserKit creates a crawler only for HTML content, so the webserver config would need to be fixed.
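The thread does not say which webserver was in use, so the following is only a sketch of what such a fix could look like, assuming the generated site is served locally by PHP's built-in web server. The file name router.php and the helper isExtensionlessPage() are hypothetical. It also assumes every extensionless path is a generated HTML page, and it is meant for a local test setup only.

```php
<?php
// router.php: hypothetical router script for PHP's built-in web server.
// Start it from the site's document root with:
//   php -S localhost:8000 router.php

// Assumption: every extensionless path is a generated HTML page.
function isExtensionlessPage(string $path): bool
{
    return $path !== '/' && pathinfo($path, PATHINFO_EXTENSION) === '';
}

// Only act when running under the built-in server, where $_SERVER is populated.
if (PHP_SAPI === 'cli-server') {
    $path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);

    if (is_string($path) && isExtensionlessPage($path)) {
        $file = $_SERVER['DOCUMENT_ROOT'] . $path;
        if (is_file($file)) {
            // Declare the content type explicitly so clients treat the page as HTML.
            header('Content-Type: text/html');
            readfile($file);
            return true;
        }
    }

    // Everything else is served by the built-in server as usual.
    return false;
}
```

With another webserver, the equivalent change belongs in that server's own configuration. Generating /index.htm files, as described above, avoids the need for it altogether.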