fgrehm/vagrant-lxc

Issue #405 (lxc-stop failing) seems to be back

Closed this issue · 3 comments

Looks like the old issue with lxc-stop failing after a graceful shutdown is back:

==> box: Forcing shutdown of container...
 INFO driver: Shutting down container...
 INFO subprocess: Starting process: ["/usr/bin/sudo", "/usr/bin/env", "lxc-attach", "--name", "box", "--", "/bin/true"]
 INFO subprocess: Vagrant not running in installer, restoring original environment...
DEBUG subprocess: Selecting on IO
DEBUG subprocess: Waiting for process to exit. Remaining to timeout: 32000
DEBUG subprocess: Exit status: 0
 INFO subprocess: Starting process: ["/usr/bin/sudo", "/usr/bin/env", "lxc-attach", "--name", "box", "--", "/sbin/halt"]
 INFO subprocess: Vagrant not running in installer, restoring original environment...
DEBUG subprocess: Selecting on IO
DEBUG subprocess: Waiting for process to exit. Remaining to timeout: 31999
DEBUG subprocess: Exit status: 0
 INFO subprocess: Starting process: ["/usr/bin/sudo", "/usr/bin/env", "lxc-stop", "--name", "box"]
 INFO subprocess: Vagrant not running in installer, restoring original environment...
DEBUG subprocess: Selecting on IO
DEBUG subprocess: stderr: box is not running
DEBUG subprocess: Waiting for process to exit. Remaining to timeout: 31999
DEBUG subprocess: Exit status: 1
ERROR warden: Error occurred: There was an error executing ["sudo", "/usr/bin/env", "lxc-stop", "--name", "box"]

Containers are hosted on Ubuntu 16.04.1 LTS, lxc version is 2.05. vagrant-lxc is latest master (1.2.1). The root cause seems to be that lxc-stop exits with code 1 instead of the 2 that the code expects (and that the man entry for lxc-stop specifies). Strictly speaking, this is a bug in lxc-stop, since it seems to be returning the wrong exit code.

It might be wiser to just use lxc-wait to check the status of the container after the graceful shutdown, and then call lxc-stop only if necessary. Or ignore the return of lxc-stop and go into an error state only if lxc-wait shows that the container is not stopped.
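To make the idea concrete, here is a rough sketch in Python (not vagrant-lxc's actual Ruby code). It assumes that lxc-wait exits 0 when the container reaches the requested state and non-zero when the timeout expires. The names `run`, `is_stopped`, and `force_stop` and the 5-second timeout are made up for illustration.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes an argv list and returns the process exit status.
# Hypothetical abstraction so the logic can be exercised without LXC installed.
Runner = Callable[[Sequence[str]], int]


def run(argv: Sequence[str]) -> int:
    """Default runner: execute argv and return its exit status."""
    return subprocess.run(list(argv)).returncode


def is_stopped(name: str, runner: Runner = run, timeout_s: int = 5) -> bool:
    """Ask lxc-wait whether the container reaches STOPPED within timeout_s.

    Assumption: lxc-wait exits 0 once the state is reached and
    non-zero if the timeout expires first.
    """
    argv = ["sudo", "lxc-wait", "--name", name,
            "--state", "STOPPED", "--timeout", str(timeout_s)]
    return runner(argv) == 0


def force_stop(name: str, runner: Runner = run) -> None:
    """Stop a container, treating lxc-stop's exit status as advisory only."""
    if is_stopped(name, runner):
        return  # graceful shutdown already finished, so skip lxc-stop
    # Exit status deliberately ignored: the container state is what matters.
    runner(["sudo", "lxc-stop", "--name", name])
    if not is_stopped(name, runner):
        raise RuntimeError(f"container {name!r} is still running after lxc-stop")
```

With this shape, the "box is not running" case from the log above never becomes an error, because the decision is based on the container state rather than on which exit code lxc-stop happens to return.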

You may just want to wait until the lxc team fixes lxc-stop. However, that means that vagrant-lxc will be broken for lxc 2.05. My current workaround in my own fork will be to just ignore the output of lxc-stop for the moment.

Just a follow-up: I submitted an issue to the lxc team, and they've fixed it in master. So we should be good on the next release.

Same issue here and for me, a simple workaround was running vagrant destroy twice. First time triggers the bug, but second time progresses.

1st run gave me the box not running error (as per original post)

2nd run

...
==> eg: Destroying VM and associated drives...
 INFO subprocess: Starting process: ["/usr/bin/sudo", "/usr/local/bin/vagrant-lxc-wrapper", "lxc-destroy", "--name", "eg_eg_1481362958533_47439"]
 INFO subprocess: Vagrant not running in installer, restoring original environment...
DEBUG subprocess: Selecting on IO
DEBUG subprocess: stdout: Destroyed container eg_eg_1481362958533_47439
...

I confirmed it was indeed stopped and destroyed with lxc-ls

So until the upstream fix lands in a new release for 16.04 LTS, double destroy...
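If anyone wants to script the double destroy, something like this Python sketch should do it. It assumes `vagrant destroy --force` exits non-zero when it hits the lxc-stop error; `destroy_with_retry` and its parameters are hypothetical names, not part of Vagrant.

```python
import subprocess
from typing import Callable, Sequence


def run(argv: Sequence[str]) -> int:
    """Execute argv and return its exit status."""
    return subprocess.run(list(argv)).returncode


def destroy_with_retry(runner: Callable[[Sequence[str]], int] = run,
                       attempts: int = 2) -> int:
    """Run `vagrant destroy --force` until it succeeds, at most `attempts` times.

    Returns the number of attempts used. Assumes the failed first run
    leaves the container stopped, so the next run can get past lxc-stop.
    """
    argv = ["vagrant", "destroy", "--force"]
    for attempt in range(1, attempts + 1):
        if runner(argv) == 0:
            return attempt
    raise RuntimeError(f"vagrant destroy still failing after {attempts} attempts")
```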

Hey, sorry for the silence here but this project is looking for maintainers 😅

As per #499, I've added the ignored label and will close this issue. Thanks for the interest in the project and LMK if you want to step up and take ownership of this project on that other issue 👋