Substra/substra-backend

Docker container name conflict

Closed this issue · 2 comments

While executing tuples that output fairly large models (1 GB), I ran into the following error:

ERROR 2020-02-05 15:30:50,114 substrapp.tasks.tasks 15 140710911403840 [00-01-0004-969dae2]
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 261, in _raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.6/dist-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 409 Client Error: Conflict for url: http+docker://localhost/v1.35/containers/create?name=compositeTraintuple_12031be8_train

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/celery/app/trace.py", line 650, in __protected_call__
    return self.run(*args, **kwargs)
  File "/usr/src/app/substrapp/tasks/tasks.py", line 530, in compute_task
    max_retries=int(getattr(settings, 'CELERY_TASK_MAX_RETRIES')))
  File "/usr/local/lib/python3.6/dist-packages/celery/app/task.py", line 704, in retry
    raise_with_context(exc)
  File "/usr/src/app/substrapp/tasks/tasks.py", line 524, in compute_task
    res = do_task(subtuple, tuple_type)
  File "/usr/src/app/substrapp/tasks/tasks.py", line 590, in do_task
    org_name
  File "/usr/src/app/substrapp/tasks/tasks.py", line 743, in _do_task
    environment=environment
  File "/usr/src/app/substrapp/tasks/utils.py", line 252, in compute_docker
    client.containers.run(**task_args)
  File "/usr/local/lib/python3.6/dist-packages/docker/models/containers.py", line 803, in run
    detach=detach, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/docker/models/containers.py", line 861, in create
    resp = self.client.api.create_container(**create_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/docker/api/container.py", line 430, in create_container
    return self.create_container_from_config(config, name)
  File "/usr/local/lib/python3.6/dist-packages/docker/api/container.py", line 441, in create_container_from_config
    return self._result(res, True)
  File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 267, in _result
    self._raise_for_status(response)
  File "/usr/local/lib/python3.6/dist-packages/docker/api/client.py", line 263, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/usr/local/lib/python3.6/dist-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
docker.errors.APIError: 409 Client Error: Conflict ("Conflict. The container name "/compositeTraintuple_12031be8_train" is already in use by container "3d5cb300703d3dcec4321ee83612d7c0cee2a83743faf070dfa54e969461f8e2". You have to remove (or rename) that container to be able to reuse that name.")

@jmorel Did you manage to reproduce it =) ?

No, never. And like #144, it happened on a machine whose disks were full (in one case because of the models, in the other because of the ledger).
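
The daemon's message in the traceback asks for the old container to be removed (or renamed) before its name is reused. Below is a minimal sketch of that workaround using the Docker SDK for Python. It is not the fix adopted in substra-backend: `run_with_clean_name` is a hypothetical helper, and it assumes that any container left under the same name is stale and safe to delete, which may not hold if a previous attempt is still running. It also does nothing about the full disks, which may well be the real cause.

```python
try:
    from docker.errors import NotFound
except ImportError:
    # Fallback so the sketch can be read and exercised without the SDK installed.
    class NotFound(Exception):
        pass


def run_with_clean_name(client, name, **task_args):
    """Remove a leftover container called `name`, then start a new one.

    `client` is expected to behave like docker.DockerClient (hypothetical usage:
    client = docker.from_env()). `task_args` are forwarded to containers.run().
    """
    try:
        stale = client.containers.get(name)
    except NotFound:
        stale = None  # nothing holds the name, nothing to clean up

    if stale is not None:
        # Assumption: the old container is a leftover from a failed attempt.
        stale.remove(force=True)

    return client.containers.run(name=name, **task_args)
```

A retry of the task could then call `run_with_clean_name(client, task_args.pop('name'), **task_args)` in place of `client.containers.run(**task_args)`. Forcing removal is a blunt choice; renaming the new container, or only removing containers that have already exited, would be more conservative alternatives.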