insitro/redun

Docker executor: No such container

Closed this issue · 3 comments

To reproduce: I cloned the latest state of the redun repository and installed it in editable mode via pip install -e . (note the trailing dot).
Then I executed the following:

cd examples/docker
cp ../05_aws_batch/data.tsv .
cd docker
make setup
make build
cd ..

I also added a docker executor to the .redun/redun.ini file in the docker example folder:

# redun configuration.

[backend]
db_uri = sqlite:///redun.db

[executors.default]
type = local
max_workers = 20

[executors.docker]
type = docker
image = redun_example
scratch = scratch

Upon running redun run workflow.py main, I encountered the following error:

[redun] Executor[docker]: submit redun job b4be464d-aa88-499b-9946-4404a3481bc8 as Docker container 89c03051ce785094202c556bfc9f619e2c005eaf9f16a8355bdbff684e955cc4:
[redun]   container_id = 89c03051ce785094202c556bfc9f619e2c005eaf9f16a8355bdbff684e955cc4
[redun]   scratch_path = /Users/ricomeinl/Desktop/retro/redun/examples/docker/.redun/scratch/jobs/9b358c2bab1d7db8c92811b1c7ef53fac23209fe
[redun] 
[redun] *** Workflow error
[redun] 
[redun] | JOB STATUS 2022/05/28 15:56:51
[redun] | TASK                                         PENDING RUNNING  FAILED  CACHED    DONE   TOTAL
[redun] | 
[redun] | ALL                                                1       5       0       0       0       6
[redun] | redun.examples.docker.count_colors_by_script       0       1       0       0       0       1
[redun] | redun.examples.docker.main                         0       1       0       0       0       1
[redun] | redun.examples.docker.task_on_docker               0       1       0       0       0       1
[redun] | redun.postprocess_script                           1       0       0       0       0       1
[redun] | redun.script                                       0       1       0       0       0       1
[redun] | redun.script_task                                  0       1       0       0       0       1
[redun] 
[redun] Execution duration: 2.11 seconds
Error: No such container: 5e83c5b505f4fbd4a53850ff44b556b409337675d4a165a88ad89614568e8125
[redun] *** Execution failed. Traceback (most recent task last):
[redun]   File "/Users/ricomeinl/Desktop/retro/redun/redun/executors/docker.py", line 347, in _monitor
[redun]     for job in jobs:
[redun]   File "/Users/ricomeinl/Desktop/retro/redun/redun/executors/docker.py", line 237, in iter_job_status
[redun]     logs = subprocess.check_output(["docker", "logs", job_id]).decode("utf8")
[redun]   File "/Users/ricomeinl/.pyenv/versions/3.8.13/lib/python3.8/subprocess.py", line 415, in check_output
[redun]     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
[redun]   File "/Users/ricomeinl/.pyenv/versions/3.8.13/lib/python3.8/subprocess.py", line 516, in run
[redun]     raise CalledProcessError(retcode, process.args,
[redun] CalledProcessError: Command '['docker', 'logs', '5e83c5b505f4fbd4a53850ff44b556b409337675d4a165a88ad89614568e8125']' returned non-zero exit status 1.

Thanks for reporting. I think there is a small regression in how local Docker containers were cleaned up. I have a proposed fix in #36.
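
For illustration only, here is a minimal sketch of the failure mode shown in the traceback: if a container has already been removed, docker logs exits with a non-zero status and subprocess.check_output raises CalledProcessError. This is not the change in #36. The helper name get_container_logs and its run parameter are hypothetical, introduced here so the behavior can be exercised without a Docker daemon.

```python
import subprocess
from typing import Callable, Optional, Sequence


def get_container_logs(
    container_id: str,
    run: Callable[[Sequence[str]], bytes] = subprocess.check_output,
) -> Optional[str]:
    """Return the container's logs, or None if `docker logs` fails.

    Hypothetical helper, not part of redun. `run` defaults to
    subprocess.check_output and can be swapped out for testing.
    """
    try:
        return run(["docker", "logs", container_id]).decode("utf8")
    except subprocess.CalledProcessError:
        # Non-zero exit status, e.g. "No such container" when the
        # container was removed before its logs were read.
        return None
```

A monitoring loop could then treat a None result as "container no longer exists" instead of letting the exception abort the whole workflow. Whether that is the right behavior depends on why the container disappeared, so this is only one possible way to handle it.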

@ricomnl When you get a chance, can you confirm if the latest main branch solves this issue? Thanks again for reporting.

Yes, that did indeed solve it! Sorry for the delay on my end. Thanks for the prompt response on this!!