openstf/setup-examples

Device gets stuck after a few seconds


Hi, we've followed all the steps in the setup example, except for adding

[X-Fleet]
MachineMetadata="role=appside" or devside or nginx

to match the IP addresses selected in /srv/nginx/nginx.conf. If the addresses need to be specified statically, how can the units be distributed to random nodes in the cluster?

Everything seems to run OK (users can log in, etc.), but after a while the remote screen of the device disappears, adb goes blank, and after leaving the device it can't be unassigned (the "Stop Using" button does nothing). It stays that way for all devices until the whole system is restarted.

The host OS is Ubuntu 15.10. All services seem to be running:

UNIT                                   MACHINE                   ACTIVE  SUB
adbd.service                           4e693e0e.../172.17.8.103  active  running
adbd.service                           db8ebd68.../172.17.8.101  active  running
adbd.service                           ff94d61c.../172.17.8.102  active  running
nginx.service                          db8ebd68.../172.17.8.101  active  running
rethinkdb-proxy-28015.service          4e693e0e.../172.17.8.103  active  running
rethinkdb-proxy-28015.service          db8ebd68.../172.17.8.101  active  running
rethinkdb-proxy-28015.service          ff94d61c.../172.17.8.102  active  running
stf-app@3100.service                   ff94d61c.../172.17.8.102  active  running
stf-auth@3200.service                  4e693e0e.../172.17.8.103  active  running
stf-log-rethinkdb.service              4e693e0e.../172.17.8.103  active  running
stf-processor@1.service                4e693e0e.../172.17.8.103  active  running
stf-processor@2.service                ff94d61c.../172.17.8.102  active  running
stf-processor@3.service                db8ebd68.../172.17.8.101  active  running
stf-provider@1.service                 4e693e0e.../172.17.8.103  active  running
stf-provider@2.service                 ff94d61c.../172.17.8.102  active  running
stf-provider@3.service                 db8ebd68.../172.17.8.101  active  running
stf-reaper.service                     ff94d61c.../172.17.8.102  active  running
stf-storage-plugin-apk@3300.service    db8ebd68.../172.17.8.101  active  running
stf-storage-plugin-image@3400.service  4e693e0e.../172.17.8.103  active  running
stf-storage-temp@3500.service          ff94d61c.../172.17.8.102  active  running
stf-triproxy-app.service               ff94d61c.../172.17.8.102  active  running
stf-triproxy-dev.service               4e693e0e.../172.17.8.103  active  running
stf-websocket@3600.service             db8ebd68.../172.17.8.101  active  running

We're using exactly the same hardware as recommended in https://github.com/openstf/stf. We've also tried to get some information from the fleet journal of the services, but without any luck. Is there any service that could cause the described behaviour?

Thanks a lot.

Without proper STF logs, it is difficult to pinpoint the problem. These are the possible causes:

  1. VirtualBox USB filtering
    This example uses VirtualBox USB filters so that the VM can access the Android devices. Since this is just a sample setup (not for production), I never checked how reliable those filters are or how long they keep working.

    If this is the problem, you can debug it by running the commands below on the CoreOS VM where the USB filters are added.

    vagrant ssh core-01
    docker run --rm -ti --net container:adbd sorccu/adb adb devices

    If the above command lists your devices, then the USB filters are okay.

  2. Socket connection timeout problem
    STF uses ZeroMQ (zmq) for communication between its components. Depending on the host OS and its default TCP_KEEPALIVE value, connections may sometimes time out. I never had this problem with OS X as the host, but since your host is Ubuntu, this may be the reason. See openstf/stf#100 for more information.
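
Both checks above can be scripted so they are easier to repeat while reproducing the freeze. The following is only a sketch: the function names `count_ready_devices`, `watch_devices`, and `show_keepalive` are made up for illustration, the 10-second polling interval is arbitrary, and it assumes the `adbd` container and `sorccu/adb` image from the command in item 1. The `-ti` flags are dropped because the output is piped. The keepalive part only prints the Linux kernel's `net.ipv4.tcp_keepalive_*` settings; whether those are the values that matter for your setup is an assumption, so compare them with the discussion in openstf/stf#100.

```shell
#!/bin/sh

# Count devices that adb reports in the "device" (ready) state.
# Reads `adb devices` output on stdin. Lines ending in "offline" or
# "unauthorized" and the "List of devices attached" header do not match.
count_ready_devices() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c -E '[[:space:]]device[[:space:]]*$' || true
}

# Hypothetical polling loop: print the ready-device count every 10 seconds.
# Assumes a running container named "adbd", as in the command in item 1.
watch_devices() {
    while true; do
        count=$(docker run --rm --net container:adbd sorccu/adb adb devices \
            | count_ready_devices)
        echo "$(date '+%H:%M:%S') ready devices: $count"
        sleep 10
    done
}

# Print the kernel TCP keepalive settings (Linux only).
show_keepalive() {
    for name in tcp_keepalive_time tcp_keepalive_intvl tcp_keepalive_probes; do
        printf '%s = %s\n' "$name" "$(cat "/proc/sys/net/ipv4/$name")"
    done
}
```

Run `watch_devices` inside the VM (after `vagrant ssh core-01`) and `show_keepalive` on whichever machine runs the STF processes. If the count drops to 0 around the time the screen disappears, the USB filters are the likely culprit.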

Lastly, this tutorial was only meant to give users an idea of an STF setup. In production, you should not use virtual machines, as USB connections to a VM are not very reliable. If you already have a Linux machine, I suggest using systemd and doing the setup as described in the Deployment doc.

If you still want to debug this problem, Chrome DevTools will be your best friend. Check what is happening with the WebSocket connections when the device screen is gray. Also, if you want to see the STF logs, you can do so with the following command:

docker logs CONTAINER_ID
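
To avoid looking up each CONTAINER_ID by hand, something like the following could print the tail of every matching container's log. It is a sketch: `dump_logs` is a made-up helper, and the default name pattern `stf` assumes the container names contain that string, which you should confirm with `docker ps` first.

```shell
#!/bin/sh

# Print the last N log lines of every running container whose name
# contains the given pattern.
#   $1: name pattern (default "stf", an assumption about container names)
#   $2: number of lines per container (default 50)
dump_logs() {
    pattern=${1:-stf}
    lines=${2:-50}
    docker ps --filter "name=$pattern" --format '{{.Names}}' |
    while read -r name; do
        echo "=== $name ==="
        # Redirect stdin so the loop's input is left untouched.
        docker logs --tail "$lines" "$name" 2>&1 </dev/null
    done
}
```

For example, `dump_logs stf 100` on each VM shows the last 100 lines from every matching container, which makes it easier to spot the component that went quiet when the screen turned gray.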

Closing this as there has been no response. Feel free to reopen it if the problem still exists.