Updating linux resource fields may clobber itself

Question

dgibson opened this issue 4 years ago · 2 comments

Description of problem

This bug was found by code inspection. I haven't reproduced it, but I'm fairly sure it could be reproduced with a careful choice of configuration and some bad luck.

The problem arises if all of these conditions occur:

Two devices (say /dev/a and /dev/b) are configured for the container in the OCI specification
Both devices are of types which are processed by updateSpecDeviceList in the agent
/dev/a has an entry (selected by type/major/minor) in the resources::devices section of the OCI container configuration
By coincidence, /dev/a in the VM has the same major/minor numbers as /dev/b does on the host

Expected result

The permissions configured for /dev/a will be correctly applied in the guest container for /dev/a.

Actual result

The permissions configured for /dev/a will be incorrectly applied to /dev/b in the container.

More details

The problem occurs because we match resource entries by major/minor (from the host), updating them for the guest as we go.

So, when we call updateSpecDeviceList for /dev/a, we find the entry for /dev/a in the resources and update it to the guest major/minor for /dev/a. But those happen to be the same values as the host major/minor for /dev/b, so when we call updateSpecDeviceList for /dev/b, we incorrectly assume that the already-updated entry for /dev/a is actually the entry for /dev/b, and update it a second time.
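
The sequence above can be reproduced with a minimal Go sketch. It is not the agent's code: the types devNum and resourceEntry, the function updateByHostNum, and the specific device numbers are all hypothetical, chosen only to model "match by host major/minor, rewrite to guest major/minor in place".

```go
package main

import "fmt"

// devNum is a hypothetical (major, minor) pair.
type devNum struct{ Major, Minor int64 }

// resourceEntry is a simplified stand-in for one entry in the
// resources devices list of the OCI configuration.
type resourceEntry struct {
	Type   string
	Num    devNum
	Access string
}

// updateByHostNum models the problematic matching: every entry whose
// number equals the host number is rewritten in place to the guest number.
func updateByHostNum(res []resourceEntry, host, guest devNum) {
	for i := range res {
		if res[i].Num == host {
			res[i].Num = guest
		}
	}
}

// clobberDemo processes two devices in turn. The numbers are assumptions:
//   /dev/a: host 8:0,  guest 8:16
//   /dev/b: host 8:16, guest 8:32
// Only /dev/a has a resource entry.
func clobberDemo() []resourceEntry {
	res := []resourceEntry{{Type: "b", Num: devNum{8, 0}, Access: "rw"}}

	// First pass: the entry for /dev/a is correctly moved to 8:16.
	updateByHostNum(res, devNum{8, 0}, devNum{8, 16})

	// Second pass: 8:16 is also the host number of /dev/b, so the
	// already-updated entry matches again and is rewritten to 8:32.
	updateByHostNum(res, devNum{8, 16}, devNum{8, 32})

	return res
}

func main() {
	fmt.Println(clobberDemo())
}
```

After both passes, the single entry carries the guest number of /dev/b, so the access rule written for /dev/a ends up attached to /dev/b.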

Answer 1 · 2020-09-10T07:41:05.000Z

Looking at the code, I agree this is problematic. The Linux structure has a single Devices list, and it's straight from the OCI spec. So there is no natural place to put a "guest device" in there.

Fortunately, Major and Minor are int64, which is far larger than any major/minor number on Linux. So maybe we could use one of the high bits to indicate "guest device"? The Unix.Major and Unix.Minor functions filter out these bits, so we can use them somewhat freely if we are careful.
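
As a rough illustration of the high-bit idea only (it is not taken from the agent, and not necessarily what was eventually merged), the sketch below tags rewritten entries so a later pass cannot match them again. The choice of bit 62, the entry type, and the functions rewrite and finalize are assumptions made up for this example.

```go
package main

import "fmt"

// guestFlag is a hypothetical marker bit, far above any real major number.
const guestFlag int64 = 1 << 62

// entry is a simplified stand-in for a resources device entry.
type entry struct {
	Major, Minor int64
	Access       string
}

// isGuest reports whether the entry has already been rewritten.
func isGuest(e entry) bool { return e.Major&guestFlag != 0 }

// rewrite updates only entries that still carry host numbers, and tags
// each updated entry so that later calls skip it.
func rewrite(res []entry, hostMaj, hostMin, guestMaj, guestMin int64) {
	for i := range res {
		if isGuest(res[i]) {
			continue
		}
		if res[i].Major == hostMaj && res[i].Minor == hostMin {
			res[i].Major = guestMaj | guestFlag
			res[i].Minor = guestMin
		}
	}
}

// finalize strips the marker once every device has been processed.
func finalize(res []entry) {
	for i := range res {
		res[i].Major &^= guestFlag
	}
}

// taggedDemo reuses the colliding numbers from the bug description:
// /dev/a is host 8:0 and guest 8:16, /dev/b is host 8:16 and guest 8:32.
func taggedDemo() []entry {
	res := []entry{{Major: 8, Minor: 0, Access: "rw"}}
	rewrite(res, 8, 0, 8, 16)  // entry for /dev/a becomes guest 8:16
	rewrite(res, 8, 16, 8, 32) // no match: the entry is tagged as guest
	finalize(res)
	return res
}

func main() {
	fmt.Println(taggedDemo())
}
```

The cost of this design is that every consumer of the list must either run after finalize or mask the flag itself, which is the "if we are careful" part of the suggestion.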

Answer 2 · 2020-09-10T07:54:22.000Z

I've already implemented a very different approach in #836.