rfjakob/gocryptfs

Race condition when reading symlinks can expose unrelated parts of the file system in reverse mode

slackner opened this issue · 17 comments

Assume user A has access to a machine, and root has mounted an encrypted version of the user home directory with -allow_other. Further assume that the user knows the master key. By quickly exchanging directories with symlinks in the original data, it is possible trick GoCryptFS into exposing unwanted parts of the file system.

Steps to reproduce:

  1. Create a regular reverse mount point
  2. Create a directory "test" in the original directory
  3. Access the corresponding encrypted directory in the mount point (ls <encrypted dir>)
  4. Quickly delete the directory in the original data, and instead create a symlink pointing somewhere else, for example to /root
  5. Access the encrypted directory again, it will now blindly follow the symlink and reveal the content of the target directory

I used a small script to reproduce this issue, manual execution is probably too slow. Even without -allow_other this could be a security issue and could, for example, backup unwanted data instead of just the symbolic link.

I have not prepared a patch for this yet, but apparently there is a O_NOFOLLOW constant which could be used. This requires to handle symlinks in the cipherDir manually, before calling open().

I wonder how apache or nginx handle this

I believe we can always use O_NOFOLLOW. The kernel resolves symlinks before we geht the path. In other words, we should never get a "benign" open on a symlink.

I believe we can always use O_NOFOLLOW.

I'm not sure if it is currently allowed, but please keep in mind that the cipherDir could also contain symlinks. We probably want to resolve them immediately after program startup if this is not done yet.

Not sure what you mean. Symlinks look like random crap in reverse mode, and because the kernel resolves resolves them, you can never follow them:

$ gocryptfs -reverse a b

$ tree
.
├── a
│   ├── bar -> foo
│   └── foo
└── b
    ├── gocryptfs.conf
    ├── gocryptfs.diriv
    ├── HUcUNp8dgFr_3JyES4rEVg
    │   └── gocryptfs.diriv
    └── NUq2Y99YTs1eebDg8H3jrA -> hv8qARpFAnWa5AzBuy9sIOvlU2ZVapmbZ5iVxdvX7HZW3G8


$ cd b/NUq2Y99YTs1eebDg8H3jrA 
bash: cd: b/NUq2Y99YTs1eebDg8H3jrA: No such file or directory
$ ls b/NUq2Y99YTs1eebDg8H3jrA/
ls: cannot access 'b/NUq2Y99YTs1eebDg8H3jrA/': No such file or directory

FUSE log:

$ gocryptfs -reverse -fusedebug -fg -nosyslog a b
Dispatch 18: LOOKUP, NodeId: 1. names: [NUq2Y99YTs1eebDg8H3jrA] 23 bytes
Serialize 18: LOOKUP code: OK value: {NodeId: 3 Generation=0 EntryValid=1.000 AttrValid=1.000 Attr={M0120777 SZ=47 L=1 1026:1026 B0*4096 i0:55224 A 1511423765.300966390 M 1511423209.330202823 C 1511423209.330202823}}
Dispatch 19: READLINK, NodeId: 3. 
Serialize 19: READLINK code: OK value:  "hv8qARpFAnWa5AzBuy9sIOvlU2ZVapmbZ5iVxdvX7HZW3G8"
Dispatch 20: LOOKUP, NodeId: 1. names: [hv8qARpFAnWa5AzBuy9sIOvlU2ZVapmbZ5iVxdvX7HZW3G8] 48 bytes
Serialize 20: LOOKUP code: OK value: {NodeId: 0 Generation=0 EntryValid=1.000 AttrValid=0.000 Attr={M00 SZ=0 L=0 0:0 B0*0 i0:0 A 0.000000000 M 0.000000000 C 0.000000000}}

What the kernel does is (indexed by transaction number as above):

  1. LOOKUP: Does NUq2Y99YTs1eebDg8H3jrA exist? -> Yes it does, and it is a symlink (somewhere in the mode bits M0120777)
  2. READLINK: Read the link target -> hv8qARpFAnWa5AzBuy9sIOvlU2ZVapmbZ5iVxdvX7HZW3G8
  3. LOOKUP: Does hv8qARpFAnWa5AzBuy9sIOvlU2ZVapmbZ5iVxdvX7HZW3G8 exist? -> No (NodeId: 0)

-> Return ENOENT to userspace

Sorry if it wasn't clear what I mean. To be a bit more precise, I am concerned about the following code in fusefrontend_reverse/rpath.go:

filepath.Join(rfs.args.Cipherdir, relPath)

The relPath component should never contain a symlink (except if there is a race-condition, and its fine to fail then). The problem is if Cipherdir itself contains a symlink. If this is the case, then open() will always fail after adding the O_NOFOLLOW flag. This could be the case if users pass a path starting with /var/run, which is just a symlink on most distros. The only cleanup which is already done is in main.go:

args.cipherdir, _ = filepath.Abs(flagSet.Arg(0))

However, when looking at the actual implementation (unixAbs), there are no attempts to resolve any symlinks:

https://golang.org/src/path/filepath/path.go

I hope it is a bit more clear now. Some other remarks:

  • This might affect multiple places, including the directory enumeration and the longname handling.

  • O_NOFOLLOW might not be supported everywhere. We probably need a feature check or should add a fallback when the error is EINVAL

O_NOFOLLOW only affects the last component of the path:

O_NOFOLLOW
[...]
Symbolic links in earlier components of the pathname will still be followed.

Okay, that means that my concern should not be an issue. However, it also means that O_NOFOLLOW is not sufficient to fix the bug, except we manually split the path and open all components separately - or do I miss anything?

https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg04347.html discusses a similar problem:


The core problem is that the "local" backend relies on path-based syscalls
to access the underlying filesystem. All path-based syscalls are vulnerable
to this issue, even open(O_NOFOLLOW) or syscalls that explicitly don't
dereference symlinks, since the kernel only checks the rightmost element of
the path. Depending on the privilege level of the QEMU process, a guest can
end up opening, renaming, changing ACLs, unlinking... files on the host
filesystem.

The right way to address this is to use "at" variants of all syscalls in
the "local" backend code. This requires to open directories without
traversing any symlink in the intermediate path elements. There was a
tentative to introduce an O_BENEATH flag for openat() that would address
this:

https://patchwork.kernel.org/patch/7007181/

Unfortunately this never got merged. I shall contact the author and try
to revive this kernel patchset.

In the meantime, an alternative is to walk through all path elements
manually with openat(O_NOFOLLOW). This is likely to degrade performances,
but I don't see any better way to get the vulnerability fixed in 2.9.
I'll try to come up with some numbers later.

Yes, O_NOFOLLOW is not sufficient. I will check what apache and nginx are doing - they face the same problem.

I've just checked the Nginx source:

https://github.com/nginx/nginx/blob/f8a9d528df92c7634088e575e5c3d63a1d4ab8ea/src/core/ngx_open_file_cache.c#L693

It looks like they also split the path, and open each component separately with NGX_FILE_NOFOLLOW = O_NOFOLLOW

That should have been all of them. Can you check if your script now fails to trigger a race?

Thanks for working on this! The original issue is definitely fixed. Not sure if it is a problem, but findLongnameParent still seems to use an absolute path for enumerating the directory content. Is that something we also have to fix or is it harmless anyway?

BTW: It looks like there are some opportunities to use the newly introduced openBackingDir helper, for example, in the Readlink function. Do you plan to change that or do you want me to send some patches?

About openBackingDir - yes the plan was to get rid of the duplicated code later, patches welcome!

I'll take a look about at longname handling, pretty sure there's room for improvement.

Besides findLongnameParent, there is also a Lstat call in virtualfile.go:GetAttr. Not really critical, but also not very hard to fix... ;)