alcounit/selenosis

Browser pods still deleted when overriding coredns config in Azure Kubernetes Service

brendon-r opened this issue · 14 comments

Since you can't edit the Corefile directly in AKS, you need to override the default CoreDNS config using a ConfigMap. I'm using Kubernetes version 1.18.14 and the following ConfigMap:

apiVersion: v1
data:
  selenosis.server: |
    selenosis.svc.cluster.local:53 {
        errors
        kubernetes cluster.local {
          namespaces selenosis
        }
    }
kind: ConfigMap
metadata:
  name: coredns-custom
  namespace: kube-system

I can observe a small number of tests executing properly, but at some point they all grind to a halt and the selenosis logs are full of:
[image: selenosis log output]

Hi @beerai,
Can you share the full logs of the browser pod for both containers (seleniferous and browser)?

@beerai I don't see an issue in the logs that you provided

{"level":"info","msg":"proxy session","request":"GET /wd/hub/session/vnc-chrome-90-0-a1c80735-673b-4ceb-829e-84e63989fbfc/screenshot","request_by":"selenosis-5b47ccf859-n2tnq","request_id":"6dd18ecb-b12c-44d8-9441-5abb73e79dd1","time":"2021-04-30T06:06:00Z"}
{"level":"warning","msg":"session vnc-chrome-90-0-a1c80735-673b-4ceb-829e-84e63989fbfc delete request","request":"DELETE /wd/hub/session/vnc-chrome-90-0-a1c80735-673b-4ceb-829e-84e63989fbfc","request_by":"selenosis-5b47ccf859-ftvpg","request_id":"62d113b8-854f-4c0c-bff5-0f909fb7e8c2","time":"2021-04-30T06:06:01Z"}
{"level":"info","msg":"proxy session","request":"DELETE /wd/hub/session/vnc-chrome-90-0-a1c80735-673b-4ceb-829e-84e63989fbfc","request_by":"selenosis-5b47ccf859-ftvpg","request_id":"62d113b8-854f-4c0c-bff5-0f909fb7e8c2","time":"2021-04-30T06:06:01Z"}
{"level":"info","msg":"stopping seleniferous: session deleted","time":"2021-04-30T06:06:01Z"}
{"level":"info","msg":"deleting pod vnc-chrome-90-0-a1c80735-673b-4ceb-829e-84e63989fbfc","time":"2021-04-30T06:06:02Z"}

seleniferous received a DELETE request from selenosis (selenosis-5b47ccf859-ftvpg), and the termination process completed.
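
To check this across a larger seleniferous log, the DELETE entries can be pulled out with a short script. This is a minimal sketch, assuming JSON log lines with the `request`, `request_by` and `time` fields shown above; the function name `delete_requests` is made up for illustration and is not part of selenosis or seleniferous.

```python
import json
from typing import Iterable, Iterator, Tuple


def delete_requests(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (time, request_by, request) for every DELETE request in the log.

    Assumption: each entry is a JSON object, optionally preceded by a
    prefix such as the pod and container name added by a log tailer.
    """
    for line in lines:
        start = line.find("{")
        if start == -1:
            continue  # no JSON payload on this line
        try:
            entry = json.loads(line[start:])
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        request = entry.get("request", "")
        if request.startswith("DELETE "):
            yield entry.get("time", ""), entry.get("request_by", ""), request
```

Passing an open file object (for example `open("seleniferous.log")`) streams the log line by line, so each browser pod's DELETE can be matched to the selenosis pod that sent it.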

Can you please monitor pods for unexpected termination as you described earlier and share logs?

@alcounit, it seems like a few tests execute normally but then stop, or at least slow down significantly. I run 5 tests in parallel and I see this initially, but after a few tests execute, new sessions stop being created. This is the log from one of the selenosis pods.

@beerai the selenosis log doesn't show why the browser pod was terminated. I need the seleniferous logs, please share them.

@alcounit I'm starting to suspect that the browser pods aren't being terminated abnormally; I think they may simply not be started after a few tests run. Is there a way I could confirm whether that's the case?

@beerai let's try to reproduce the issue. We need to collect logs from all applications; I use the stern tool for such situations.
Run it for selenosis and seleniferous at the same time:

./stern -n selenosis --selector='type=browser' > seleniferous.log
./stern -n selenosis --selector='app=selenosis' > selenosis.log

Then run your tests.

Unfortunately, the selenosis log is 2.6GB. I've compressed it and uploaded it here.

The seleniferous log isn't too big, but once again it isn't showing anything strange.
seleniferous.log

@beerai thanks for the logs, I see a lot of requests like this:

selenosis-5c66c4bb98-n562c selenosis time="2021-04-30T07:38:09Z" level=info msg="proxying session" request="GET /wd/hub/session/alert/accept" request_id=8b19ab05-d22a-430d-9982-66adaca52037 session_id=alert

The request itself doesn't contain a Selenium sessionId.
This looks like an issue with your tests.
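
To see how widespread such requests are in a multi-gigabyte log, the `session_id` field can be tallied while streaming the file. This is a minimal sketch, assuming the logfmt-style lines shown above and assuming that a valid session ID always ends in a UUID (as in `vnc-chrome-90-0-a1c80735-673b-4ceb-829e-84e63989fbfc`); the helper name `suspicious_session_ids` is made up for illustration.

```python
import re
from collections import Counter
from typing import Iterable

# Assumption: a valid session ID ends in a lowercase UUID.
UUID_TAIL = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)
# Matches session_id=value as well as session_id="value".
SESSION_FIELD = re.compile(r'session_id="?([^"\s]+)"?')


def suspicious_session_ids(lines: Iterable[str]) -> Counter:
    """Count session_id values that do not end in a UUID."""
    counts: Counter = Counter()
    for line in lines:
        match = SESSION_FIELD.search(line)
        if match and not UUID_TAIL.search(match.group(1)):
            counts[match.group(1)] += 1
    return counts
```

Passing an open file object, for example `suspicious_session_ids(open("selenosis.log"))`, reads one line at a time, so the whole log never has to fit in memory. Calling `.most_common(10)` on the result lists the most frequent bogus IDs.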

@alcounit what's the significance of the request itself not containing a selenium sessionId? Does that mean the session has been cleaned up by selenosis but the tests are still sending requests?

@beerai I was able to reproduce the issue by dropping the sessionId during test execution.

Regarding your question: before creating the pod (browser + seleniferous), selenosis generates a UUID and concatenates it with the browser version. This unique identifier is then used as the hostname of the browser pod (e.g. vnc-chrome-90-0-ef9ca12b-7547-414c-9dd1-55d730cb4052).
Seleniferous knows the hostname of the pod, and when it proxies a session request to the browser, it replaces the original session ID given by the browser with the hostname of the pod.
When selenosis receives a request that doesn't contain a sessionId, it fails to proxy the request.
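
The idea can be illustrated roughly as follows. This is only a sketch in Python, not the actual selenosis code: the exact ID format is an assumption inferred from the example hostname above, and `make_session_id` and `extract_session_id` are hypothetical names.

```python
import re
import uuid
from typing import Optional

# Assumption: IDs look like "<browser>-<version>-<uuid>", e.g. vnc-chrome-90-0-<uuid>.
SESSION_ID = re.compile(
    r".+-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)
# First path segment after /wd/hub/session/ is where the session ID is expected.
SESSION_PATH = re.compile(r"^/wd/hub/session/([^/]+)")


def make_session_id(browser: str, version: str) -> str:
    """Build a hostname-safe ID from browser, version and a fresh UUID."""
    prefix = f"{browser}-{version}".replace(".", "-")
    return f"{prefix}-{uuid.uuid4()}"


def extract_session_id(path: str) -> Optional[str]:
    """Return the session ID from a request path, or None if it is missing.

    The new-session request (POST /wd/hub/session) carries no ID and is
    not covered by this sketch.
    """
    match = SESSION_PATH.match(path)
    if match and SESSION_ID.fullmatch(match.group(1)):
        return match.group(1)
    return None
```

With this check, `/wd/hub/session/alert/accept` yields `None` because `alert` sits where the session ID should be, so there is no pod hostname to route the request to.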

@alcounit thanks, I'll close this issue and investigate further.

I suspect my tests are getting caught in a loop looking for an alert and selenosis is killing the session while the test continues to loop.

@beerai I've added sessionId validation for incoming requests. Check the new version: https://github.com/alcounit/selenosis-deploy

@alcounit thanks, will do. Love the resource requests too!