fhoeben/hsac-fitnesse-fixtures

Could not start a new session running FitNesse in container

timovd opened this issue · 14 comments

Since we're using HSAC Fixtures 5.2.35, we're not able to run FitNesse scripts in our containers anymore. Running scripts locally works fine.

__EXCEPTION__:ABORT_SLIM_TEST:nl.hsac.fitnesse.fixture.slim.StopTestException: Could not start a new session. Possible causes are invalid address of the remote server or browser start-up failure. Host info: host: '****', ip: '*.*.*.*' at nl.hsac.fitnesse.fixture.util.selenium.driverfactory.DriverManager.getSeleniumHelper(DriverManager.java:52)

This is still reproducible with the latest version (5.2.37).
I tried using Selenium-Java 4.8.1 instead of 4.8.3 and the error still occurs.
Are there specific changes in ChromeDriver (>= 112) or the Chrome browser that require changes on our side?

I have not run into the issue myself, but I saw that the latest release CI run on GitLab ran with chrome 108.0.5359.124

Are you using the images that I create? I guess not because I just saw that the push to docker hub failed (without failing the build :-( )

Can you share an image, or Dockerfile so that I can reproduce the problem?

I believe I no longer use a remote docker in my containers, but a local chrome as that gives access to the Chrome developer tools. Are you using a Selenium Hub, or are you also using a local chrome, in which case the error is just misleading?

It turns out I read the logs incorrectly. The images were pushed after all. So I believe there are images in docker hub (using chrome 108) with my latest fixtures that work:

I just checked with my separate repo creating docker images. It seems to launch and connect to chrome using driver 113 just fine. So it could be something in your own container (or indeed the Selenium grid, if you use it, do you have logging from that?)

(I used the branch that is arm64/Mac native)

2023-05-10 17:25:31.239 test Started 'SampleTests.SlimTests.BrowserTest.SuiteSetUp' (1 / 4)
Starting ChromeDriver 113.0.5672.63 (0e1a4471d5ae5bf128b1bd8f4d627c8cbd55f70c-refs/branch-heads/5672@{#912}) on port 12517
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
[1683739531.508][SEVERE]: bind() failed: Cannot assign requested address (99)
May 10, 2023 5:25:32 PM org.openqa.selenium.devtools.CdpVersionFinder findNearestMatch
WARNING: Unable to find an exact match for CDP version 113, so returning the closest version found: 110
2023-05-10 17:25:32.189 test Finished 'SampleTests.SlimTests.BrowserTest.SuiteSetUp'

2023-05-10 17:25:32.190 test Started 'SampleTests.SlimTests.BrowserTest.AriaLabelledTest' (2 / 4)
2023-05-10 17:25:33.165 test Finished 'SampleTests.SlimTests.BrowserTest.AriaLabelledTest'

2023-05-10 17:25:33.165 test Started 'SampleTests.SlimTests.BrowserTest.BypassHeuristicTest' (3 / 4)
2023-05-10 17:25:33.763 test Finished 'SampleTests.SlimTests.BrowserTest.BypassHeuristicTest'

2023-05-10 17:25:33.766 test Started 'SampleTests.SlimTests.BrowserTest.SuiteTearDown' (4 / 4)
2023-05-10 17:25:33.780 test Finished 'SampleTests.SlimTests.BrowserTest.SuiteTearDown'

2023-05-10 17:25:33.969 testSuite Finished 'nl.hsac.fitnesse.HsacFitNesseSuiteStarter'

I have the same issue.
I have done many upgrades.
I have a build pipeline and have only ever upgraded the FitNesse, HSAC and Selenium dependencies, and that always worked.
But now it works locally while in Kubernetes it does not: the driver gives the message above.

(PS: with Chromium driver 113 and the latest Alpine edge.)

It works now: the option --remote-allow-origins=* seems to be needed in the java_opts when FitNesse is started.

@robvanderboom that should not be necessary when using Selenium 4.9.0. What version of HSAC, Java and Selenium are you using?

Hmm, the issue is different.
I see that this option is needed for the setup that was already working (the latest versions from 2022) combined with the latest Alpine and Chrome version (113).
So the option seems to be needed only because I updated Chrome/Alpine.

But with this latest Chrome/Alpine version the issue still exists. I tried many things but don't know where it is coming from.
The issue exists in the Docker environment only,
with all Java versions and the following dependency versions:
fitnesse: 20230503 (first: 20221219)
hsac: 5.2.39 (first: 5.2.29)
selenium: 4.9.1 (first: 4.7.2)

So with the versions marked "first" it works, both with the older Alpine/Chromium driver and with the latest Alpine Chromium (113) version.

But just changing these 3 dependencies to the newer versions makes the error occur. The strange thing is that the stack trace ends at getSeleniumHelper, as described above, and no driver seems to be started in the pod.
(I checked with ps aux; it never even tries to start a driver process.)
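
The same "is a driver process running at all?" check can also be done from inside the JVM, which may help when `ps` is not available in a slim image. This is only a sketch: the class and method names are hypothetical, and the executable names it looks for (`chromedriver`, `msedgedriver`) are assumptions about how the driver binary is named in the container.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.stream.Collectors;

/** Hypothetical diagnostic helper; not part of the HSAC fixtures. */
final class DriverProcessCheck {

    /** Assumption: driver executables have one of these names in their path. */
    static boolean looksLikeDriver(String command) {
        String lower = command.toLowerCase(Locale.ROOT);
        return lower.contains("chromedriver") || lower.contains("msedgedriver");
    }

    /** Commands of all processes visible to this JVM that look like a WebDriver binary. */
    static List<String> runningDrivers() {
        return ProcessHandle.allProcesses()
                .map(p -> p.info().command())   // Optional<String>, empty if not readable
                .flatMap(Optional::stream)
                .filter(DriverProcessCheck::looksLikeDriver)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> drivers = runningDrivers();
        if (drivers.isEmpty()) {
            System.out.println("No driver process visible from this JVM.");
        } else {
            drivers.forEach(d -> System.out.println("Driver process: " + d));
        }
    }
}
```

An empty result while a test is trying to open a browser would match the observation above that no driver process is ever started. `ProcessHandle` requires Java 9 or newer.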

@robvanderboom Can you share your Dockerfile so I can see whether I can reproduce the problem?

Hi, currently experiencing kind of the same issue after upgrading to 5.2.36 (or 5.2.39).

Method:
|connect to driver at | <*> |with json capabilities|!-{browserName: 'chrome',chrome_binary:'<*>','goog:chromeOptions': {args: [ "--headless=new"],prefs: {"download.default_directory": "<*>","profile.default_content_settings.popups": 0,"profile.password_manager_enabled": "false"} }}-!|

Doesn't even make a call to remote grid (can be seen in hub and node logs).
Prints error:
Caused by: org.openqa.selenium.SessionNotCreatedException: Could not start a new session. Possible causes are invalid address of the remote server or browser start-up failure. Host info: host: '<*>', ip: '<*>' Build info: version: '4.9.0', revision: 'd7057100a6' System info: os.name: 'Windows 10', os.arch: 'amd64', os.version: '10.0', java.version: '1.8.0_231' Driver info: org.openqa.selenium.remote.RemoteWebDriver Command: [null, newSession {capabilities=[Capabilities {browserName: chrome, chrome_binary: <*>, goog:chromeOptions: {args: [--headless=new], prefs: {download.default_directory:<*>, profile.default_content_settings.popups: 0.0, profile.password_manager_enabled: false}}, unhandledPromptBehavior: ignore}]}] Capabilities {browserName: chrome, chrome_binary: /usr/bin/google-chrome, goog:chromeOptions: {args: [--headless=new], prefs: {download.default_directory: <*>, profile.default_content_settings.popups: 0.0, profile.password_manager_enabled: false}}, unhandledPromptBehavior: ignore}

Might come from "Command: [null"...

The only thing required to reproduce the issue is upgrading to 5.2.36 or 5.2.39.
P.S. It works with the .35 and .34 versions.

Other settings were not changed during debugging:
Selenium server: 4.10 (it also reproduces when running grid v3.141.59, so the issue should be in the HSAC fixtures)
Chromedriver: 114

@Taxanas did you try with Java >= 11? Currently you're using "java.version: '1.8.0_231'".
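
To see quickly which Java major version a FitNesse process actually runs on, the `java.version` string from the stack trace can be parsed as in the sketch below. The class and method names are hypothetical, and whether these fixture versions really need Java 11 is not confirmed in this thread; the threshold only mirrors the question above.

```java
/** Hypothetical helper that maps a java.version string to its major version. */
final class JavaVersionCheck {

    /** Handles both the legacy "1.8.0_231" scheme and the newer "11.0.19" scheme. */
    static int majorVersion(String javaVersion) {
        String[] parts = javaVersion.split("[._\\-+]");
        int first = Integer.parseInt(parts[0]);
        // Before Java 9 the version string started with "1.", e.g. 1.8.0_231 for Java 8.
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        int major = majorVersion(version);
        System.out.println("java.version=" + version + " (major " + major + ")");
        if (major < 11) {
            // Threshold taken from the question in the thread, not from documentation.
            System.out.println("Running on Java older than 11; consider retesting on 11 or newer.");
        }
    }
}
```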

@fhoeben, the latest stable configuration before the mentioned error occurs is:
Fitnesse 20230503
Hsac.fixtures.version 5.2.35
Selenium HUB 4.10.0
Selenium Node-Edge 112.0 (assuming issue is identical with the same Chrome driver version)

Raising the Node-Edge version to 113.0 or above results in the same error as mentioned, as does upping the HSAC fixtures version to anything above .35.
I read about a workaround involving the parameter ENV FITNESSE_OPTS --remote-allow-origins=*. I'm not sure whether that could be related (to allow/restore the connection to the grid from hsac-fitnesse?) or how/where to set these values (Dockerfile or compose.yml?).
Error message after updating HSAC fixtures

@jayhome I'm unsure of your exact setup (which machines/containers you have running), but your configuration seems strange to me although I have to admit I haven't run with an actual Selenium grid in a while.
In the configuration in the wiki (in the screenshot) I believe I see you are requesting Edge version 111.0 and in your description you indicate you are using Edge 112 or 113.

I would actually not expect browser startup to succeed when version 111 is requested and only 112 or up are available. Does the behaviour change when you omit the version property from the requested capabilities, or change the version value to the value offered by the Selenium node?

This issue originally was about docker images running chrome locally no longer functioning, at least I thought it was. Your problem, and the one from @Taxanas, seem different since you are running a Selenium grid and are unable to connect to it.

@fhoeben, thank you for looking into what seems to be a Selenium hub connection failure that started with specific browser and hsac-fixtures versions. Because the conditions (running FitNesse in Docker) and the errors are the same for me (EXCEPTION:ABORT_SLIM_TEST:nl.hsac.fitnesse.fixture.slim.StopTestException: Could not start a new session.) and ("Command: [null"...), as described by @timovd and @Taxanas, I thought sharing my config details and findings might help identify the root cause.
The previous screenshot is old, from when I was still on Edge 111 with the version number matching the available Edge node. Edge 111 works fine with hsac 5.2.35. Edge versions 112 and below are all okay with .35. The incompatibility started with either going above Edge 112 or raising the hsac-fixtures version to anything above 5.2.35.

I ran a check to rule out that providing the version number as part of the requested capabilities affects the outcome, only to find that it DOES affect the behavior. After removing the webdriver version tag from the JSON capabilities, the problem no longer occurs. New configuration: Edge 115 with hsac-fixtures 5.2.40. 💯 (well, not version 100, but you get the point).
run OK
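
For anyone who builds the requested capabilities programmatically, the fix described above (no longer pinning a browser version) could look roughly like this. It is only a sketch: the class name is hypothetical, the sample capability values are made up for illustration, and it assumes the pinned version sits under a top-level `version` or `browserVersion` key.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper; not part of the HSAC fixtures or Selenium. */
final class CapabilityCleaner {

    // Assumption: the pinned version is stored under one of these top-level keys.
    private static final Set<String> VERSION_KEYS = Set.of("version", "browserVersion");

    /** Returns a copy of the requested capabilities without any version pin. */
    static Map<String, Object> withoutVersion(Map<String, Object> requested) {
        Map<String, Object> cleaned = new LinkedHashMap<>(requested);
        cleaned.keySet().removeAll(VERSION_KEYS);
        return cleaned;
    }

    public static void main(String[] args) {
        // Sample values for illustration only.
        Map<String, Object> requested = new LinkedHashMap<>();
        requested.put("browserName", "MicrosoftEdge");
        requested.put("version", "111.0");
        System.out.println(withoutVersion(requested));
    }
}
```

Leaving the version out lets the grid pick whichever browser version the node offers, so a node upgrade no longer invalidates the request.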