Device gets stuck after a few seconds
Hi, we've followed all the steps in the setup example, except for adding
[X-Fleet]
MachineMetadata="role=appside" (or devside, or nginx)
to match the IP addresses selected in /srv/nginx/nginx.conf. If the role has to be specified statically, how can the units be distributed to random nodes in the cluster?
Everything seems to run fine (users can log in, etc.), but after a while the device's remote screen disappears and adb goes blank. After leaving the device, it can't be unassigned (the "Stop Using" button does nothing), and it stays that way for all devices until the whole system is restarted.
The host OS is Ubuntu 15.10. All services seem to be running:
UNIT MACHINE ACTIVE SUB
adbd.service 4e693e0e.../172.17.8.103 active running
adbd.service db8ebd68.../172.17.8.101 active running
adbd.service ff94d61c.../172.17.8.102 active running
nginx.service db8ebd68.../172.17.8.101 active running
rethinkdb-proxy-28015.service 4e693e0e.../172.17.8.103 active running
rethinkdb-proxy-28015.service db8ebd68.../172.17.8.101 active running
rethinkdb-proxy-28015.service ff94d61c.../172.17.8.102 active running
stf-app@3100.service ff94d61c.../172.17.8.102 active running
stf-auth@3200.service 4e693e0e.../172.17.8.103 active running
stf-log-rethinkdb.service 4e693e0e.../172.17.8.103 active running
stf-processor@1.service 4e693e0e.../172.17.8.103 active running
stf-processor@2.service ff94d61c.../172.17.8.102 active running
stf-processor@3.service db8ebd68.../172.17.8.101 active running
stf-provider@1.service 4e693e0e.../172.17.8.103 active running
stf-provider@2.service ff94d61c.../172.17.8.102 active running
stf-provider@3.service db8ebd68.../172.17.8.101 active running
stf-reaper.service ff94d61c.../172.17.8.102 active running
stf-storage-plugin-apk@3300.service db8ebd68.../172.17.8.101 active running
stf-storage-plugin-image@3400.service 4e693e0e.../172.17.8.103 active running
stf-storage-temp@3500.service ff94d61c.../172.17.8.102 active running
stf-triproxy-app.service ff94d61c.../172.17.8.102 active running
stf-triproxy-dev.service 4e693e0e.../172.17.8.103 active running
stf-websocket@3600.service db8ebd68.../172.17.8.101 active running
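With this many units, one that is not running is easy to miss by eye. As a minimal sketch, the listing can be filtered with awk. The function name `not_running` is made up for illustration, and the filter assumes the four-column UNIT/MACHINE/ACTIVE/SUB layout shown above:

```shell
# Hypothetical helper: reads `fleetctl list-units` output on stdin and
# prints every unit whose ACTIVE/SUB columns are not "active running".
# Assumes the four-column layout shown above (UNIT MACHINE ACTIVE SUB).
not_running() {
  awk '$1 != "UNIT" && NF >= 4 && !($3 == "active" && $4 == "running") {
         print $1, $3, $4
       }'
}

# Usage on a cluster node (prints nothing if every unit is healthy):
#   fleetctl list-units | not_running
```

For the listing above this prints nothing, which matches the observation that every unit reports active/running even though devices get stuck.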
We're using exactly the same hardware as recommended in https://github.com/openstf/stf. We've also tried to get some information from the fleet journals of the services, but without any luck. Is there any service that could cause the described behaviour?
Thanks a lot.
Without proper STF logs, it is difficult to pinpoint the problem. These are some possible causes:
- VirtualBox USB filtering
This example uses VirtualBox USB filters so that the VM can access the Android devices. Since this is just a sample setup (not for production), I never checked how reliable those filters are or how long they keep working. If this is the problem, you can debug it by running the commands below on the CoreOS VM where the USB filters are added:
vagrant ssh core-01
docker run --rm -ti --net container:adbd sorccu/adb adb devices
If the above command lists your adb devices, then the USB filters are okay.
- Socket connection timeout problem
STF uses zmq for communication between its components. Depending on the host OS and its default TCP_KEEPALIVE value, connections may sometimes time out. I never had this problem with OS X as the host, but since your host is Ubuntu, this may be the reason. See openstf/stf#100 for more information.
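Both checks can be scripted roughly as follows. This is only a sketch: the function names are invented for illustration, `count_ready_devices` assumes the usual `adb devices` layout (a header line followed by one serial/state pair per line), and the sysctl keys are the Linux kernel's own keepalive settings, which may or may not be what is timing out here.

```shell
# Hypothetical helper: counts devices that adb reports in the "device"
# (ready) state. Reads `adb devices` output on stdin; the header line and
# devices in other states (offline, unauthorized) are not counted.
count_ready_devices() {
  awk '$2 == "device" { n++ } END { print n + 0 }'
}

# Hypothetical helper: prints the Linux kernel's TCP keepalive settings
# (idle time, probe interval, probe count) of the machine it runs on.
show_keepalive() {
  sysctl net.ipv4.tcp_keepalive_time \
         net.ipv4.tcp_keepalive_intvl \
         net.ipv4.tcp_keepalive_probes
}

# Usage inside the CoreOS VM (after `vagrant ssh core-01`). The -ti flags
# are dropped here so the piped output carries no terminal line endings:
#   docker run --rm --net container:adbd sorccu/adb adb devices | count_ready_devices
#   show_keepalive
```

A count of 0 while devices are plugged in would point at the USB filters. On Linux the keepalive idle time defaults to 7200 seconds, so a long-idle connection can be dropped by something in between well before the first probe is sent.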
Lastly, I would like to say that this tutorial was only meant to give users an idea of the STF setup. In production, you should not use virtual machines, as USB connections to VMs are not very reliable. If you already have a Linux machine, I suggest using systemd and doing the setup as described in the Deployment doc.
In case you still want to debug this problem, Chrome DevTools will be your best friend. Check what is happening with the WebSocket connections when the device screen is gray. Also, if you want to see the STF logs, you can do so with the following command:
docker logs CONTAINER_ID
Closing this as there has been no response. Feel free to reopen it if the problem still exists.