Failure to start/recover after full disk
holiman opened this issue · 3 comments
holiman commented
Posting this here so I don't forget it.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 223, in _raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.6/dist-packages/requests/models.py", line 939, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: http+docker://localunixsocket/v1.35/containers/create?name=geth

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "utilities/fuzzerweb.py", line 59, in <module>
    main()
  File "utilities/fuzzerweb.py", line 50, in main
    fuzzer.startDaemons()
  File "/datadrive/evmlab/utilities/fuzzer.py", line 297, in startDaemons
    procinfo = startDaemon(client_name, cmd)
  File "/datadrive/evmlab/utilities/fuzzer.py", line 262, in startDaemon
    cfg.logfilesPath():{ 'bind':'/logs/', 'mode':"rw"},
  File "/usr/local/lib/python3.6/dist-packages/docker/models/containers.py", line 745, in run
    detach=detach, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/docker/models/containers.py", line 803, in create
    resp = self.client.api.create_container(**create_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/docker/api/container.py", line 403, in create_container
    return self.create_container_from_config(config, name)
  File "/usr/local/lib/python3.6/dist-packages/docker/api/container.py", line 414, in create_container_from_config
    return self._result(res, True)
  File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 229, in _result
    self._raise_for_status(response)
  File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 225, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/usr/local/lib/python3.6/dist-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
docker.errors.APIError: 500 Server Error: Internal Server Error ("mkdir /var/lib/docker/aufs/mnt/e7166fe1252e1c448d1d2fd6cc0118ff893d543b6cde1af4bc135e8f8521c6c1-init: no space left on device")
root@fuzz02:/datadrive# df
Filesystem     1K-blocks    Used Available Use% Mounted on
udev            16457764       0  16457764   0% /dev
tmpfs            3293960  297268   2996692  10% /run
/dev/xvda1       8065444 8049060         0 100% /
tmpfs           16469784       0  16469784   0% /dev/shm
tmpfs               5120       0      5120   0% /run/lock
tmpfs           16469784       0  16469784   0% /sys/fs/cgroup
/dev/loop0         89088   89088         0 100% /snap/core/5145
/dev/loop1         12928   12928         0 100% /snap/amazon-ssm-agent/295
/dev/xvdh      309506048  505944 293255080   1% /datadrive
/dev/loop2         90112   90112         0 100% /snap/core/5328
/dev/loop3         13056   13056         0 100% /snap/amazon-ssm-agent/495
tmpfs            3293956       0   3293956   0% /run/user/1000
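
The df output shows the root filesystem (which holds /var/lib/docker) at 100% while /datadrive is nearly empty. One way to fail early with a clearer message might be a pre-flight free-space check before starting the daemons. This is only a sketch: the helper names and the 1 GiB default threshold are hypothetical and not part of evmlab.

```python
import shutil


def has_free_space(path, min_free_bytes):
    """Return True if the filesystem holding `path` has at least `min_free_bytes` available."""
    # On Unix, `free` is the space available to unprivileged users (df's "Available").
    return shutil.disk_usage(path).free >= min_free_bytes


def low_disk_paths(paths, min_free_bytes=1024 ** 3):
    """Return the paths whose filesystems are below the threshold (hypothetical default: 1 GiB)."""
    return [p for p in paths if not has_free_space(p, min_free_bytes)]
```

Calling something like low_disk_paths(["/var/lib/docker", "/datadrive"]) before startDaemons() would have flagged the root filesystem here, assuming docker's data directory lives at its default location.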
tintinweb commented
By the way, aborting the script can leave the docker container running (see the 409 error in https://github.com/ethereum/evmlab/wiki/utils-fuzzer). I could add code to make the script recover automatically by stopping the running container. The question is whether it should do that by default or only when a command-line switch is given.
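
A rough sketch of the opt-in variant, assuming the docker SDK's client.containers.run / get / remove calls. The function name and the --force-restart switch are hypothetical, and the status check uses duck typing so the sketch does not need docker imported; with the real SDK one would likely catch docker.errors.APIError directly.

```python
import argparse


def start_container(client, image, name, recover=False, **run_kwargs):
    """Start a named container; on a 409 name conflict, optionally remove the stale one and retry."""
    try:
        return client.containers.run(image, name=name, detach=True, **run_kwargs)
    except Exception as exc:
        # docker.errors.APIError exposes the HTTP status as `status_code`.
        # Anything that is not a 409, or a 409 without recovery enabled, is re-raised.
        if getattr(exc, "status_code", None) != 409 or not recover:
            raise
    # Name conflict and recovery requested: force-remove the leftover container, then retry once.
    client.containers.get(name).remove(force=True)
    return client.containers.run(image, name=name, detach=True, **run_kwargs)


def parse_args(argv=None):
    """Parse the (hypothetical) command-line switch that enables recovery."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--force-restart", action="store_true",
                        help="remove a leftover container with the same name before starting")
    return parser.parse_args(argv)
```

With this shape the default behaviour is unchanged, and passing recover=parse_args().force_restart turns the cleanup on only when asked for.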
holiman commented
Yeah, it often leaves the docker container running. That doesn't seem to be a problem, though, because the next run restarts it, and there haven't been any problems on prod.
holiman commented
Fixed by cleaning up after each run.