stevepryde/thirtyfour

caps.set_headless not working with firefox/geckodriver/standalone grid

bcpeinhardt opened this issue · 13 comments

TLDR: This is an issue with how Selenium Grid handles desiredCapabilities and will soon be removed from this repo. It is still here only because I'm not sure where, if anywhere, to put it yet. Hopefully I can remove it soon and contribute some helpful documentation around running headless Firefox on Selenium Grid via the thirtyfour or fantoccini crates.

It seems like caps.set_headless()?; is not working properly for Selenium Grid 4 and Firefox. To reproduce the issue, run Selenium Grid 4 in standalone mode, then run this slightly tweaked version of the example code. You'll also need Firefox installed and geckodriver (which, conveniently enough, you can install with cargo install geckodriver) in your PATH. Note that the port was changed to match the default for Selenium Grid.

use thirtyfour::prelude::*;

#[tokio::main]
async fn main() -> WebDriverResult<()> {
    let mut caps = DesiredCapabilities::firefox();
    caps.set_headless()?;
    let driver = WebDriver::new("http://localhost:4444", caps).await?;

    // Navigate to https://wikipedia.org.
    driver.goto("https://wikipedia.org").await?;
    let elem_form = driver.find(By::Id("search-form")).await?;

    // Find element from element.
    let elem_text = elem_form.find(By::Id("searchInput")).await?;

    // Type in the search terms.
    elem_text.send_keys("selenium").await?;

    // Click the search button.
    let elem_button = elem_form.find(By::Css("button[type='submit']")).await?;
    elem_button.click().await?;

    // Look for header to implicitly wait for the page to load.
    driver.find(By::ClassName("firstHeading")).await?;
    assert_eq!(driver.title().await?, "Selenium - Wikipedia");

    // Always explicitly close the browser.
    driver.quit().await?;

    Ok(())
}

Oh, and for the record, this is specifically an issue when running against Selenium Grid; headless works fine for an independently running geckodriver instance.

Update: running standalone grid with --reject-unsupported-caps true does not cause any change

Update: The exact same behavior seems to occur with fantoccini. The following code

use fantoccini::{Client, Locator, ClientBuilder};

// let's set up the sequence of steps we want the browser to take
#[tokio::main]
async fn main() -> Result<(), fantoccini::error::CmdError> {
    let c = ClientBuilder::native()
        .capabilities(serde_json::from_str(r#"{"moz:firefoxOptions": {"args": ["--headless"]}}"#)?)
        .connect("http://localhost:4444")
        .await.unwrap();

    // first, go to the Wikipedia page for Foobar
    c.goto("https://en.wikipedia.org/wiki/Foobar").await?;
    let url = c.current_url().await?;
    assert_eq!(url.as_ref(), "https://en.wikipedia.org/wiki/Foobar");

    // click "Foo (disambiguation)"
    c.find(Locator::Css(".mw-disambig")).await?.click().await?;

    // click "Foo Lake"
    c.find(Locator::LinkText("Foo Lake")).await?.click().await?;

    let url = c.current_url().await?;
    assert_eq!(url.as_ref(), "https://en.wikipedia.org/wiki/Foo_Lake");

    c.close().await
}

produces a visible browser when run against a selenium grid standalone but runs headless against a single geckodriver.

Update: found a rather gruff response from a Selenium maintainer suggesting that we should be setting aside DesiredCapabilities and using FirefoxOptions: SeleniumHQ/selenium#9472. Currently going down the rabbit hole of figuring out how this relates to the webdriver crate.

Update: Selenium grid distributor doesn't seem to care about our firefox options? Here's the log

17:43:04.136 INFO [LocalNode.newSession] - Session created by the Node. Id: 8d24332b-e98e-435c-825a-e0bf8a865981, 
Caps: Capabilities {
    acceptInsecureCerts: false, 
    br.31.0, 
    moz:headless: false, 
    moz:platformVersion: 10.0, 
    moz:processID: 18992, 
    moz:profile: C:\Users\BEN~1.PEI\AppData\..., 
    moz:shutdownTimeout: 60000, 
    moz:useNonSpecCompliantPointerOrigin: false, 
    moz:webdriverClick: true, 
    moz:windowless: false, 
    pageLoadStrategy: normal, 
    platformName: WINDOWS, 
    proxy: Proxy(), 
    setWindowRect: true, 
    strictFileInteractability: false, 
    timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, 
    unhandledPromptBehavior: dismiss and notify
}
17:43:04.148 INFO [LocalDistributor.newSession] - Session created by the Distributor. Id: 8d24332b-e98e-435c-825a-e0bf8a865981
 Caps: Capabilities {
    acceptInsecureCerts: false, 
    browserName: firefox, 
    browserVersion: 105.0.3, 
    goog:chromeOptions: {w3c: true}, 
    moz:accessibilityChecks: false, 
    moz:buildID: 20221007134813, 
    moz:firefoxOptions: {args: [-headless]},          <------ Look here
    moz:geckodriverVersion: 0.31.0, 
    moz:headless: false,                                       <------ Look here
    moz:platformVersion: 10.0, 
    moz:processID: 18992, 
    moz:profile: C:\Users\BEN~1.PEI\AppData\..., 
    moz:shutdownTimeout: 60000, 
    moz:useNonSpecCompliantPointerOrigin: false, 
    moz:webdriverClick: true, 
    moz:windowless: false, 
    pageLoadStrategy: normal, 
    platformName: WINDOWS, 
    proxy: Proxy(), 
    se:cdp: ws://192.168.1.6:4444/sessi..., 
    setWindowRect: true, 
    strictFileInteractability: false, 
    timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, 
    unhandledPromptBehavior: dismiss and notify
}

We've seen a number of recent issues around selenium specifically. I'd like to set up a selenium test via github actions at some point, just to make sure the basics work. Most of the difference seems to be around the capabilities format and options.

I do think we need to properly formalize the ChromeOptions and FirefoxOptions structs to match other selenium libraries. It's a bit of work but should make all of this much nicer.

I'm not sure whether this should be added here or in fantoccini. It might be better if fantoccini supports just the WebDriver spec while thirtyfour aims at full selenium functionality on top. But at the very least fantoccini shouldn't prevent any selenium features from working.

I was reading through the Java version of the FirefoxOptions class and the AbstractDriverOptions class the other day when looking into this, and it seems to be just a wrapper/extension of Capabilities (it literally extends a class called MutableCapabilities, but it also has its own Map<String, Object> for storing options). I need to keep looking and figure out where the runner class that actually uses the options lives.

I think those options are just a nice way to serialize the specific capabilities. Eventually it has to be supplied via the capabilities struct somehow. It's just the format we need to figure out.

We could split out the ChromeOptions and FirefoxOptions from the main capabilities struct if we want to. Not sure what benefit that offers other than aligning better with other selenium libraries. It might be more familiar for people who know selenium and want to switch to rust? Not sure.

Originally I patterned a lot of thirtyfour on the python selenium library, although I've since renamed the methods to align closer with fantoccini. It makes sense to get inspiration from other selenium libraries though.

I remember you telling me it was based on the python bindings. I agree, especially if implementing an Options class magically makes firefox headless work on Selenium. I can do a quick port of the python version https://github.com/SeleniumHQ/selenium/blob/84fad5e827c672ee08ba34f83cfebc24446d9693/py/selenium/webdriver/firefox/options.py and see how things go.

Regarding a github action for testing against selenium, I can look at that too. Running a standalone grid in docker on github actions shouldn't be an issue. I guess it'd just be a matter of adding a selenium_test! macro to go with local_tester! and tester!. And I bet there's a way to ignore an integration test module by default in the Cargo.toml. Then just execute the specific module against the grid as a separate github action.
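
One way the "off by default, run separately in CI" idea could look is an opt-in switch read from the environment. This is only a sketch of that idea, not how thirtyfour's test macros work: the function names, the accepted flag values, and the variable names mentioned in the note below are all hypothetical.

```rust
/// Default address of a Selenium Grid standalone instance (the port used
/// earlier in this thread).
const GRID_URL: &str = "http://localhost:4444";

/// Pick the server URL for a test run. `override_url` stands in for the
/// value of a hypothetical environment variable; empty or missing values
/// fall back to the local grid.
fn server_url(override_url: Option<String>) -> String {
    override_url
        .filter(|url| !url.trim().is_empty())
        .unwrap_or_else(|| GRID_URL.to_string())
}

/// Decide whether the Selenium-only tests should run. They are skipped
/// unless the (hypothetical) flag is set to "1" or "true".
fn selenium_tests_enabled(flag: Option<String>) -> bool {
    matches!(flag.as_deref(), Some("1") | Some("true"))
}
```

A test helper could call these with something like std::env::var("RUN_SELENIUM_TESTS").ok() and std::env::var("SELENIUM_URL").ok() (both names made up here), and the github action would set them only for the grid job.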

Yeah I used to have it running against selenium previously in github actions before I switched to the fantoccini tests. That would be awesome if you want to get that working. We'd want selenium 4.x at least, not sure about 3.x.

I've looked into this a bit more and I think there's no need to radically refactor the way capabilities work in thirtyfour. We're already pretty close to the effective capabilities that get passed through via other frameworks. We just don't split out the capabilities into a separate options type (which seems pointless to me and creates an extra step for the user).

The issue with firefox headless seems to be that we are passing --headless and the python lib passes -headless (single dash). Let's give that a try first. That might be all that's causing this.
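
For reference, the two spellings differ only in the string placed in the args array under moz:firefoxOptions. The sketch below builds that JSON by hand with the standard library so both variants can be compared; the function names are made up for illustration, and the flag is not escaped, so it assumes the flag contains no quotes or backslashes.

```rust
/// Normalise a headless flag to the single-dash spelling ("-headless").
fn single_dash(flag: &str) -> String {
    format!("-{}", flag.trim_start_matches('-'))
}

/// Build a minimal capabilities JSON document that passes `flag` to Firefox
/// via moz:firefoxOptions. The flag is inserted verbatim (no escaping).
fn firefox_headless_caps(flag: &str) -> String {
    format!(r#"{{"moz:firefoxOptions":{{"args":["{flag}"]}}}}"#)
}
```

Calling firefox_headless_caps(&single_dash("--headless")) yields the single-dash payload, which could then be parsed with serde_json::from_str as in the fantoccini snippet above.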

Fwiw the python lib sets firefox options under the moz:firefoxOptions object just like we do. I'm looking at ways to tidy things up a little but we're already close to where we need to be.

I've just published 0.32.0-rc.2 which includes some fixes for firefox capabilities. Let me know if this fixes this issue.

I had actually noticed the difference between --headless and -headless earlier and tested whether it would be a quick fix, but a visible browser still launches on a Selenium standalone instance. I should have noted that in the issue. The problem seems to be in how Selenium passes the capabilities to geckodriver. Apparently there's no way for them to simply pass a startup string, so the arguments aren't being passed. There are repeated bug reports about this for multiple language bindings that have been largely ignored. The only way I have successfully gotten the tests to execute headless in Firefox on a Selenium grid is to set the MOZ_HEADLESS environment variable when running the tests, and wouldn't you know it, so does the Selenium team: https://github.com/SeleniumHQ/selenium/search?q=MOZ_HEADLESS.
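
A rough sketch of the environment-variable workaround, assuming the grid is started from the same machine as the tests: the variable has to end up in the environment of the Firefox process, so here it is set on the grid process that Firefox inherits from. The function name and the jar path are placeholders, and the command is only prepared, not spawned.

```rust
use std::process::Command;

/// Prepare (but do not start) a Selenium Grid standalone process with
/// MOZ_HEADLESS set, so that child processes inherit the variable.
/// `jar_path` is a placeholder for wherever the Selenium server jar lives.
fn headless_grid_command(jar_path: &str) -> Command {
    let mut cmd = Command::new("java");
    cmd.arg("-jar").arg(jar_path).arg("standalone");
    // Firefox starts headless when this variable is present in its environment.
    cmd.env("MOZ_HEADLESS", "1");
    cmd
}
```

Calling .spawn() on the returned Command would start the grid; nothing in the client capabilities needs to change for this approach.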

That being said, I agree with the larger point about not needing to provide Options structs, especially considering the discussion here #128 (comment). I actually got to a working FirefoxOptions struct on my fork and it was... the same xD I did make some useful edits to the test runner macros so that by default tests run against both a standalone selenium grid and a normal webdriver, and you can choose to run against one or the other for certain tests. I will follow up with a pull request including those changes, the documentation for how to run the selenium grid when running tests, and probably a github action for running the tests as well.

Setting an environment variable won't work if selenium is running in docker or on another machine, so that's not really a viable option.

Given that it works when used directly with geckodriver, this seems like a selenium bug so we should be cautious about "fixing" it on our end. Happy to provide a workaround if we find one though. In the meantime you could look at running selenium grid in docker, which provides xvfb to make it effectively "headless" without using the browser's own headless mode. And you can still view the browser via vnc. I find this a much nicer solution because there are actually subtle differences in browser headless modes, which just boggles my mind. For example printing (chrome) or animations that play when switching to a tab.

As for the remaining points:

  1. It would be good to have the test runner macros also run against selenium (perhaps optionally)
  2. A github action for this would be awesome
  3. Documentation is always welcome (eventually we'll need our own website for docs that don't belong in docs.rs)

The problem persists...

When I invoke the following (taken from the documentation)...

    use thirtyfour::prelude::*;
    
    #[tokio::main]
    async fn main() -> WebDriverResult<()> {
        let caps = DesiredCapabilities::firefox();
        let driver = WebDriver::new("https://wikipedia.org", caps).await?;
        Ok(())
    }

I am receiving the following error:

Error: NewSessionError(NotW3C(String("<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">\n<html><head>\n<title>301 Moved Permanently</title>\n</head><body>\n<h1>Moved Permanently</h1>\n<p>The document has moved <a href=\"https://www.wikipedia.org/session\">here</a>.</p>\n</body></html>\n")))

I am invoking from a Docker devcontainer Ubuntu environment with FireFox installed (Mozilla Firefox 102.11.0esr).

This is a fantoccini::error::NewSessionError::NotW3C occurring in src/session.rs.

Currently, line 565 seems suspicious:

Ok(v) | Err(error::CmdError::NotW3C(v)) => Err(error::NewSessionError::NotW3C(v)),

Investigation continues...

Per reference selenium.py.selenium.webdriver.remote.webdriver.py:189, line 563 seems wrong...

            if let Some(session_id) = v.remove("sessionId") {
                if session_id.is_string() {
                    return Ok(());
                }
                v.insert("sessionId".to_string(), session_id);
            }
            Err(error::NewSessionError::NotW3C(Json::Object(v)))

@gdennie I believe the python code is trying to still offer compatibility with older versions of selenium. Thirtyfour only supports the W3C spec, which was introduced in 2018.

As for your error, the code WebDriver::new("https://wikipedia.org", caps).await?; is not correct. This is trying to use wikipedia as a WebDriver server, which of course it is not. You need to download a webdriver like chromedriver or geckodriver, run it locally, and then point WebDriver::new() at that server address. If you're running locally it would be http://localhost:9515 (for chromedriver) or http://localhost:4444 (for geckodriver). The site you want to test is then opened with driver.goto() after the session has been created.
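
To make the distinction concrete, here is a small standard-library sketch that maps each driver to the default local address quoted above and does a rough check that a URL points at a local driver server rather than at the site under test. The enum and function names are made up for illustration and are not part of thirtyfour.

```rust
/// The two local drivers mentioned above.
#[derive(Debug, Clone, Copy)]
enum LocalDriver {
    Chromedriver,
    Geckodriver,
}

/// Default address each driver listens on when run locally.
fn default_server_url(driver: LocalDriver) -> &'static str {
    match driver {
        LocalDriver::Chromedriver => "http://localhost:9515",
        LocalDriver::Geckodriver => "http://localhost:4444",
    }
}

/// Rough sanity check: does the URL look like a driver running on this
/// machine? A remote grid would legitimately fail this check.
fn looks_like_local_server(url: &str) -> bool {
    url.starts_with("http://localhost:") || url.starts_with("http://127.0.0.1:")
}
```

The value from default_server_url is what goes into WebDriver::new(); a URL like https://wikipedia.org belongs in driver.goto() instead.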