Race condition when reading symlinks can expose unrelated parts of the file system in reverse mode
slackner opened this issue · 17 comments
Assume user A has access to a machine, and root has mounted an encrypted version of the user home directory with -allow_other
. Further assume that the user knows the master key. By quickly exchanging directories with symlinks in the original data, it is possible trick GoCryptFS into exposing unwanted parts of the file system.
Steps to reproduce:
- Create a regular reverse mount point
- Create a directory "test" in the original directory
- Access the corresponding encrypted directory in the mount point (
ls <encrypted dir>
) - Quickly delete the directory in the original data, and instead create a symlink pointing somewhere else, for example to
/root
- Access the encrypted directory again, it will now blindly follow the symlink and reveal the content of the target directory
I used a small script to reproduce this issue, manual execution is probably too slow. Even without -allow_other
this could be a security issue and could, for example, backup unwanted data instead of just the symbolic link.
I have not prepared a patch for this yet, but apparently there is a O_NOFOLLOW
constant which could be used. This requires to handle symlinks in the cipherDir manually, before calling open()
.
I wonder how apache or nginx handle this
I believe we can always use O_NOFOLLOW. The kernel resolves symlinks before we geht the path. In other words, we should never get a "benign" open on a symlink.
I believe we can always use O_NOFOLLOW.
I'm not sure if it is currently allowed, but please keep in mind that the cipherDir could also contain symlinks. We probably want to resolve them immediately after program startup if this is not done yet.
Not sure what you mean. Symlinks look like random crap in reverse mode, and because the kernel resolves resolves them, you can never follow them:
$ gocryptfs -reverse a b
$ tree
.
├── a
│ ├── bar -> foo
│ └── foo
└── b
├── gocryptfs.conf
├── gocryptfs.diriv
├── HUcUNp8dgFr_3JyES4rEVg
│ └── gocryptfs.diriv
└── NUq2Y99YTs1eebDg8H3jrA -> hv8qARpFAnWa5AzBuy9sIOvlU2ZVapmbZ5iVxdvX7HZW3G8
$ cd b/NUq2Y99YTs1eebDg8H3jrA
bash: cd: b/NUq2Y99YTs1eebDg8H3jrA: No such file or directory
$ ls b/NUq2Y99YTs1eebDg8H3jrA/
ls: cannot access 'b/NUq2Y99YTs1eebDg8H3jrA/': No such file or directory
FUSE log:
$ gocryptfs -reverse -fusedebug -fg -nosyslog a b
Dispatch 18: LOOKUP, NodeId: 1. names: [NUq2Y99YTs1eebDg8H3jrA] 23 bytes
Serialize 18: LOOKUP code: OK value: {NodeId: 3 Generation=0 EntryValid=1.000 AttrValid=1.000 Attr={M0120777 SZ=47 L=1 1026:1026 B0*4096 i0:55224 A 1511423765.300966390 M 1511423209.330202823 C 1511423209.330202823}}
Dispatch 19: READLINK, NodeId: 3.
Serialize 19: READLINK code: OK value: "hv8qARpFAnWa5AzBuy9sIOvlU2ZVapmbZ5iVxdvX7HZW3G8"
Dispatch 20: LOOKUP, NodeId: 1. names: [hv8qARpFAnWa5AzBuy9sIOvlU2ZVapmbZ5iVxdvX7HZW3G8] 48 bytes
Serialize 20: LOOKUP code: OK value: {NodeId: 0 Generation=0 EntryValid=1.000 AttrValid=0.000 Attr={M00 SZ=0 L=0 0:0 B0*0 i0:0 A 0.000000000 M 0.000000000 C 0.000000000}}
What the kernel does is (indexed by transaction number as above):
- LOOKUP: Does
NUq2Y99YTs1eebDg8H3jrA
exist? -> Yes it does, and it is a symlink (somewhere in the mode bitsM0120777
) - READLINK: Read the link target ->
hv8qARpFAnWa5AzBuy9sIOvlU2ZVapmbZ5iVxdvX7HZW3G8
- LOOKUP: Does
hv8qARpFAnWa5AzBuy9sIOvlU2ZVapmbZ5iVxdvX7HZW3G8
exist? -> No (NodeId: 0)
-> Return ENOENT to userspace
Sorry if it wasn't clear what I mean. To be a bit more precise, I am concerned about the following code in fusefrontend_reverse/rpath.go:
filepath.Join(rfs.args.Cipherdir, relPath)
The relPath
component should never contain a symlink (except if there is a race-condition, and its fine to fail then). The problem is if Cipherdir itself contains a symlink. If this is the case, then open()
will always fail after adding the O_NOFOLLOW
flag. This could be the case if users pass a path starting with /var/run
, which is just a symlink on most distros. The only cleanup which is already done is in main.go:
args.cipherdir, _ = filepath.Abs(flagSet.Arg(0))
However, when looking at the actual implementation (unixAbs
), there are no attempts to resolve any symlinks:
https://golang.org/src/path/filepath/path.go
I hope it is a bit more clear now. Some other remarks:
-
This might affect multiple places, including the directory enumeration and the longname handling.
-
O_NOFOLLOW
might not be supported everywhere. We probably need a feature check or should add a fallback when the error isEINVAL
O_NOFOLLOW only affects the last component of the path:
O_NOFOLLOW
[...]
Symbolic links in earlier components of the pathname will still be followed.
Okay, that means that my concern should not be an issue. However, it also means that O_NOFOLLOW
is not sufficient to fix the bug, except we manually split the path and open all components separately - or do I miss anything?
https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg04347.html discusses a similar problem:
The core problem is that the "local" backend relies on path-based syscalls
to access the underlying filesystem. All path-based syscalls are vulnerable
to this issue, even open(O_NOFOLLOW) or syscalls that explicitly don't
dereference symlinks, since the kernel only checks the rightmost element of
the path. Depending on the privilege level of the QEMU process, a guest can
end up opening, renaming, changing ACLs, unlinking... files on the host
filesystem.
The right way to address this is to use "at" variants of all syscalls in
the "local" backend code. This requires to open directories without
traversing any symlink in the intermediate path elements. There was a
tentative to introduce an O_BENEATH flag for openat() that would address
this:
https://patchwork.kernel.org/patch/7007181/
Unfortunately this never got merged. I shall contact the author and try
to revive this kernel patchset.
In the meantime, an alternative is to walk through all path elements
manually with openat(O_NOFOLLOW). This is likely to degrade performances,
but I don't see any better way to get the vulnerability fixed in 2.9.
I'll try to come up with some numbers later.
Yes, O_NOFOLLOW is not sufficient. I will check what apache and nginx are doing - they face the same problem.
I've just checked the Nginx source:
It looks like they also split the path, and open each component separately with NGX_FILE_NOFOLLOW = O_NOFOLLOW
That should have been all of them. Can you check if your script now fails to trigger a race?
Thanks for working on this! The original issue is definitely fixed. Not sure if it is a problem, but findLongnameParent
still seems to use an absolute path for enumerating the directory content. Is that something we also have to fix or is it harmless anyway?
BTW: It looks like there are some opportunities to use the newly introduced openBackingDir
helper, for example, in the Readlink
function. Do you plan to change that or do you want me to send some patches?
About openBackingDir - yes the plan was to get rid of the duplicated code later, patches welcome!
I'll take a look about at longname handling, pretty sure there's room for improvement.
Besides findLongnameParent
, there is also a Lstat
call in virtualfile.go:GetAttr
. Not really critical, but also not very hard to fix... ;)