CAP_AUDIT_READ runc error
Closed this issue · 16 comments
When using concourse which uses cf guardian which uses runc (holy cow), I get this cryptic error:
runc start: exit status 1: unknown capability "CAP_AUDIT_READ"
I'm running on the standard AWS EC2 Ubuntu 14.04 LTS HVM AMI (what a word-full).
It seems as though this kernel does not have CAP_AUDIT_READ. It DOES have the following:
ubuntu@ip-172-31-18-217:~$ man capabilities | grep CAP_AUDIT -a2
CAP_AUDIT_CONTROL (since Linux 2.6.11)
Enable and disable kernel auditing; change auditing filter rules; retrieve auditing
status and filtering rules.
CAP_AUDIT_WRITE (since Linux 2.6.11)
Write records to kernel auditing log.
This effectively makes concourse on AWS dead in the water for me, since guardian only supports Ubuntu 14.04 LTS and this bug/whatever prevents me from running it on the Ubuntu 14.04 LTS AMI.
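For anyone wanting to check the kernel itself rather than the man page: Linux reports the highest capability number it knows in /proc/sys/kernel/cap_last_cap (available since Linux 3.2), and CAP_AUDIT_READ is capability number 37 in linux/capability.h (added in Linux 3.16). Below is a minimal sketch of that check. The function names are made up for illustration and are not part of runc or guardian.

```python
from pathlib import Path

# Capability number of CAP_AUDIT_READ in <linux/capability.h>.
CAP_AUDIT_READ = 37


def read_last_cap(path="/proc/sys/kernel/cap_last_cap"):
    """Return the highest capability number the running kernel knows.

    Returns None when the file is absent (non-Linux host or a very old kernel).
    """
    try:
        return int(Path(path).read_text().strip())
    except FileNotFoundError:
        return None


def kernel_has_capability(cap_number, last_cap):
    """A capability exists on this kernel if its number is <= cap_last_cap."""
    return last_cap is not None and cap_number <= last_cap


if __name__ == "__main__":
    last_cap = read_last_cap()
    print("cap_last_cap:", last_cap)
    print("CAP_AUDIT_READ known to kernel:",
          kernel_has_capability(CAP_AUDIT_READ, last_cap))
```

If the reported value is below 37, the running kernel predates CAP_AUDIT_READ, which would be consistent with the error above.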
Hi there!
We use Pivotal Tracker to provide visibility into what our team is working on. A story for this issue has been automatically created.
The current status is as follows:
- #140298567 CAP_AUDIT_READ runc error
- #140295681 CAP_AUDIT_READ runc error
- #140237601 CAP_AUDIT_READ runc error
- #127659303 CAP_AUDIT_READ runc error
This comment, as well as the labels on the issue, will be automatically updated as the status in Tracker changes.
I got the same issue on Ubuntu 14.04.4 LTS.
{"timestamp":"1466976739.388674736","source":"guardian","message":"guardian.destroy.state.started","log_level":1,"data":{"handle":"fe51309e-4f56-4f00-71c1-841d200ec360","session":"27.1"}}
{"timestamp":"1466976739.392828226","source":"guardian","message":"guardian.destroy.state.finished","log_level":1,"data":{"handle":"fe51309e-4f56-4f00-71c1-841d200ec360","session":"27.1"}}
{"timestamp":"1466976739.392859221","source":"guardian","message":"guardian.destroy.state-failed-skipping-kill","log_level":1,"data":{"error":"runc state: runc start: exit status 1: open /run/runc/fe51309e-4f56-4f00-71c1-841d200ec360/state.json: no such file or directory","handle":"fe51309e-4f56-4f00-71c1-841d200ec360","session":"27"}}
{"timestamp":"1466976739.392911911","source":"guardian","message":"guardian.destroy.destroy.started","log_level":1,"data":{"handle":"fe51309e-4f56-4f00-71c1-841d200ec360","session":"27.2"}}
{"timestamp":"1466976739.393044949","source":"guardian","message":"guardian.destroy.destroy.finished","log_level":1,"data":{"handle":"fe51309e-4f56-4f00-71c1-841d200ec360","session":"27.2"}}
{"timestamp":"1466976739.393073320","source":"guardian","message":"guardian.destroy.finished","log_level":1,"data":{"handle":"fe51309e-4f56-4f00-71c1-841d200ec360","session":"27"}}
{"timestamp":"1466976739.435929537","source":"guardian","message":"guardian.create.create-failed-cleaningup.cleanedup","log_level":1,"data":{"cause":"runc start: exit status 1: unknown capability \"CAP_AUDIT_READ\"","handle":"fe51309e-4f56-4f00-71c1-841d200ec360","session":"26.3"}}
{"timestamp":"1466976739.435967207","source":"guardian","message":"guardian.api.garden-server.create.failed","log_level":2,"data":{"error":"runc start: exit status 1: unknown capability \"CAP_AUDIT_READ\"","request":{"Handle":"","GraceTime":0,"RootFSPath":"raw:///opt/concourse/worker/volumes/live/fc8b2323-05e6-40f0-5948-4c1cde301412/volume","BindMounts":null,"Network":"","Privileged":true,"Limits":{"bandwidth_limits":{},"cpu_limits":{},"disk_limits":{},"memory_limits":{}}},"session":"2.1.9"}}
This happens when I start a worker manually from scratch. :\
Good discussion here:
opencontainers/runc#924
It's just a clean installation of Ubuntu 14.04.4 LTS with Concourse running in Docker Compose:
Up!
It seems your kernel version is lower than 3.19. You can check it with:
$ uname -r
3.13.0-88-generic
The example above shows a 3.13 kernel, while Concourse (and guardian) requires a 3.19+ kernel. That means you have to update the kernel.
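If you want to script that check across several workers, here is a rough sketch that parses the `uname -r` string and compares it against the 3.19 minimum mentioned in this thread. The helper names are hypothetical, and it only looks at the major and minor numbers, so it says nothing about distro backports.

```python
import platform
import re

MIN_KERNEL = (3, 19)  # minimum stated in this thread


def parse_kernel_release(release):
    """Extract (major, minor) from a string such as '3.13.0-88-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(release, minimum=MIN_KERNEL):
    """Compare as integer tuples so that (4, 4) >= (3, 19) holds."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    release = platform.release()  # on Linux, the same string as `uname -r`
    verdict = "OK" if meets_minimum(release) else "too old"
    print(f"{release}: {verdict}")
```

Comparing tuples of integers rather than strings matters here: a plain string comparison would order "3.2" after "3.19".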
For Ubuntu 14.04.4 LTS:
sudo apt-get install --install-recommends linux-generic-lts-wily
Then I rebooted the server and now everything is working. You can read the full discussion here. Have a good update! :)
Let's close it?
This breaks compatibility with RHEL6, RHEL7, and their derivatives (Oracle Linux, Scientific Linux, CentOS), because they use older kernels that don't have this CAP. RHEL6 is 2.6.32 based, RHEL7 is 3.10 based.
Yes, it would be great to add a list of OSes that already ship compatible kernels to the documentation, and to recommend that everyone on older versions update their kernel.
Hi @jadekler & @DenisIzmaylov - glad you have a workaround for now! We do have a story to support this linked in the second comment so I'll keep this issue open but just to avoid getting anyone's hopes up, concourse relies on another guardian feature that requires 3.19+ (specifically the StreamIn feature which requires the execveat system call) so at least until we find a way to not need that, 3.19 will stay a hard requirement.
Note, RHEL6, RHEL7 and similar will not get an updated kernel in the course of their lifetime, so just running "yum/apt-get update" isn't an option. I just tried Debian 8 too, which is also the current stable and it's got a 3.16 kernel.
Ack that 3.19 is a requirement, but it seems Guardian (and hence Concourse etc) will only run on Ubuntu out of the major Linux Distros.
Could a workaround for StreamIn/execveat be found? If the call was converted to an execve + cd, could the requirement for 3.19 come down?
Hi @jamesread - ack that it'd be nice to support a broader range of OSes that are on older kernels. If someone PRed a version of StreamIn that didn't require execveat and was secure, we'd totally look at that. However, my recollection is that we need execveat (as opposed to execve) to be secure against symlink traversals in the container potentially leading to container breakout, so I'd hesitate to accept a change unless we were confident of the security implications.
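To make the symlink concern concrete, here is a small illustration of the general problem. It is not guardian's StreamIn code and it does not use execveat. It only shows how a path built by joining a container rootfs with a name can follow a symlink planted inside the container, and how an fd-relative open with O_NOFOLLOW refuses it. All names (`rootfs`, `innocent`, `host-secret`) are invented for the demo, and O_NOFOLLOW only protects the final path component, so this sketch handles single-component names only.

```python
import os
import tempfile


def naive_read(root, name):
    """Join and open: silently follows a symlink planted inside root."""
    with open(os.path.join(root, name)) as f:
        return f.read()


def guarded_read(root, name):
    """Open name relative to a directory fd, refusing a symlink at that name."""
    root_fd = os.open(root, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # Raises OSError (ELOOP on Linux) if `name` is a symlink.
        fd = os.open(name, os.O_RDONLY | os.O_NOFOLLOW, dir_fd=root_fd)
    finally:
        os.close(root_fd)
    with os.fdopen(fd) as f:
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "rootfs")  # stands in for a container rootfs
        os.mkdir(root)
        secret = os.path.join(tmp, "host-secret")  # lives outside the rootfs
        with open(secret, "w") as f:
            f.write("host data")
        # The "container" plants a symlink pointing out of its rootfs.
        os.symlink(secret, os.path.join(root, "innocent"))

        leaked = naive_read(root, "innocent")
        try:
            guarded_read(root, "innocent")
            blocked = False
        except OSError:
            blocked = True
        return leaked, blocked


if __name__ == "__main__":
    print(demo())
```

The naive version reads the file outside the rootfs, while the fd-relative version raises an error instead. A replacement for execveat would presumably need an equivalent guarantee for the exec step, which is the part that would need a security review.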
For what it's worth, I've been running on stock 16.04 with only intermittent issues. Depending on how frequently you turn over workers in a spot fleet or similar, it might be just as good as dealing with curating a pseudo-stemcell for 14.04.
closing - we have no current plans to support older kernels, though as above we're happy to accept PRs that add support
@jamesread I've had luck with Oracle Linux 7 running Docker then running the Concourse-CI Containers provided. I have yet to try in a distributed setup rather than a single server, and I was able to get those working on my local machine (MacOS 10.12.6)
Maybe running the binary on its own isn't the right answer in this case.