os: safer file open functions
neild opened this issue ยท 119 comments
Please see the updated proposal in #67002 (comment)
Directory traversal vulnerabilities are a common class of vulnerability, in which an attacker tricks a program into opening a file that it did not intend. These attacks often take the form of providing a relative pathname such as "../../../etc/passwd", which results in access outside an intended location. CVE-2024-3400 is a recent, real-world example of directory traversal leading to an actively exploited remote code execution vulnerability.
A related, but less commonly exploited, class of vulnerability involves unintended symlink traversal, in which an attacker creates a symbolic link in the filesystem and manipulates the target into following it.
I propose adding several new functions to the os package to aid in safely opening files with untrusted filename components and defending against symlink traversal.
It is very common for programs to open a file in a known location using an untrusted filename. Programs can avoid directory traversal attacks by first validating the filename with a function like filepath.IsLocal. Defending against symlink traversal is harder.
I propose adding functions to open a file in a location:
package os
// OpenFileIn opens the named file in the named directory.
//
// If the file contains relative path components (..), no component may
// refer to a location outside the parent directory. The file may not be
// "", an absolute path, or (on Windows) a reserved device name such as "NUL".
// The file may refer to the directory itself (.).
//
// If any component of the named file references a symbolic link
// referencing a location out of the parent directory,
// OpenFileIn returns an error.
//
// OpenFileIn otherwise behaves like OpenFile.
func OpenFileIn(parent, name string, flag int, perm FileMode) (*File, error)
// CreateIn creates or truncates the named file in the named parent directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Create].
func CreateIn(parent, name string) (*File, error)
// Open opens the named file in the named parent directory for reading.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Open].
func OpenIn(parent, name string) (*File, error)The OpenFileIn, OpenIn, and CreateIn family of functions safely open a file within a given location, defending against directory traversal, symlinks to unexpected locations, and unexpected access to Windows device files.
All modern Unix systems that I know of provide an openat call, to open a file relative to an existing directory handle (FD). Windows provides an equivalent (NtCreateFile with ObjectAttributes including a RootDirectory). Of the supported Go ports, I believe only js and plan9 do not support openat or an equivalent.
I propose adding support for openat-like behavior to os.File:
package os
// OpenFile opens the named file in the directory associated with the file f.
//
// If the file contains relative path components (..), no component may
// refer to a location outside the parent directory. The file may not be
// "", an absolute path, or (on Windows) a reserved device name such as "NUL".
//
// If any component of the named file references a symbolic link
// referencing a location out of the parent directory,
// OpenFile returns an error.
func (f *File) OpenFile(name string, flag int, perm FileMode) (*File, error)
// Create creates or truncates the named file in
// the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) Create(name string) (*File, error)
// Open opens the named file in the directory associated with the file f for reading.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) OpenIn(name string) (*File, error)Like the top-level CreateIn, OpenIn, and OpenFileIn, the methods defend against accessing files outside the given directory. This is unlike the default behavior of openat, which permits absolute paths, relative paths outside the root, and symlink traversal outside the root. (It corresponds to Linux's openat2 with the RESOLVE_BENEATH flag.)
A property of openat is that it follows a file across renames: If you open a directory, rename the directory, and use openat on the still-open FD, access is relative to the directory's new location. We cannot support this behavior on platforms which don't have openat or an equivalent (plan9 and js). We could fall back to operating purely on filenames, such that f.OpenIn(x) is equivalent to os.OpenIn(f.Name(), x). However, this seems potentially hazardous. I propose, therefore, that File.CreateIn, File.OpenIn, and File.OpenFileIn return an errors.ErrUnsupported error on these platforms.
The above functions defend against symlink traversal that leads outside of the designated root directory. Some users may wish to defend against symlink traversal entirely. Many modern operating systems provide an easy way to disable symlink following: Linux has RESOLVE_NO_SYMLINKS, Darwin has O_NOFOLLOW_ANY, and some other platforms have equivalents.
I propose adding support for disabling symlink traversal to the os package:
const (
// O_NOFOLLOW_ANY, when included in the flags passed to [OpenFile], [OpenFileIn],
// or [File.OpenFile], disallows resolution of symbolic links anywhere in the
// named file.
//
// O_NOFOLLOW_ANY affects the handling of symbolic links in all components
// of the filename. (In contrast, the O_NOFOLLOW flag supported by many
// platforms only affects resolution of the last path component.)
//
// O_NOFOLLOW_ANY does not disallow symbolic links in the parent directory name
// parameter of [OpenFileIn].
//
// O_NOFOLLOW_ANY does not affect traversal of hard links, Windows junctions,
// or Plan 9 bind mounts.
//
// On platforms which support symbolic links but do not provide a way to
// disable symbolic link traversal (GOOS=js), open functions return an error
// if O_NOFOLLOW_ANY is provided.
O_NOFOLLOW_ANY int = (some value)
)O_NOFOLLOW_ANY may be passed to OpenFile, OpenFIleIn, or File.OpenFIle to disable symlink traversal in any component of the file name. For OpenFileIn, symlinks would still be permitted in the directory component.
On platforms which do not support the equivalent of O_NOFOLLOW_ANY/RESOLVE_NO_SYMLINKS natively, the os package will use successive openat calls with O_NOFOLLOW to emulate it. On platforms with no openat (plan9 and js), open operations will return an error when O_NOFOLLOW_ANY is specified.
@gabyhelp's overview of this issue: #67002 (comment)
is this essentially https://pkg.go.dev/github.com/google/safeopen with Beneath -> In ?
that also has a ReadFile / WriteFile variant which I'd use more then the create version.
The design of this proposal is influenced by github.com/google/safeopen, but differs in a few areas. (Sorry, I really should have mentioned safeopen as prior art.)
Of the three parts of this proposal:
os.OpenInis essentiallysafeopen.OpenBeneath.File.Openis a slightly more limited but safer version ofopenat, and has no equivalent insafeopen.O_NOFOLLOW_ANYhas no equivalent insafeopen.
ReadFileIn and WriteFileIn seem like a useful and logical extension of this proposal.
Yes, please. When I was working on safe file operations and it turned out to be hard to do correctly without OS support.
Without O_NOFOLLOW, you have to slowly check every segment for symlinks before traversing into it. For the naive implementation, how do you protect against TOCTOU bugs? At the moment that you check some path segment and verify that it's not a symlink (or a safe one) and then proceed to descend into it, some other process (or goroutine) could have asynchronously changed the target.
What, if any, changes would be made to "io/fs"? Ideally, there is a mirror of these APIs in that package.
If we wanted to extend this proposal to io.FS, I believe the one addition would be:
package fs
// An OpenFile is a directory file whose entries may be opened with the Open method.
type OpenFile interface {
File
// Open opens the named file in the directory.
//
// When Open returns an error, it should be of type *PathError
// with the Op field set to "openat", the Path field set to name,
// and the Err field describing the problem.
//
// Open should reject attempts to open names that do not
// satisfy ValidPath(name), returning a *PathError with Err set to
// ErrInvalid or ErrNotExist.
Open(name string) (File, error)
}A more interesting question is os.DirFS. Currently, DirFS has two documented limitations: It follows symlinks out of the directory tree, and if the FS root is a relative path then it will be affected by later Chdir calls.
I don't think we can change DirFS's symlink-following behavior: It's documented, and it's a behavior that a user could reasonably depend on.
The interaction between DirFS and Chdir seems less likely to be something a user would depend on, but it is documented. I'm not sure if we can change it at this point, but perhaps.
Perhaps we should add a version of DirFS that opens the directory root at creation time (retaining a handle to it even if the current working directory changes or the root is renamed), and refuses to follow symlinks out of the root. I'm not sure if that should be part of this proposal or a separate one.
Perhaps the new names should include an At suffix to make clear to casual readers of the API that these are not the usual open system calls. Either way, the three new methods should probably reference some shared section of documentation on the concept of the at-suffixed operations.
I presume you mean the new method names? The functions have an "In" suffix.
We could also include the In suffix on the methods; I waffled on whether it belongs there or not:
func (f *File) OpenFileIn(name string, flag int, perm FileMode) (*File, error)
func (f *File) CreateIn(name string) (*File, error)
func (f *File) OpenIn(name string) (*File, error)
I'm trying to avoid the suffix "At" to make it clear that none of these calls are precisely openat. openat(2) permits escaping from the root directory via absolute or relative paths, and doesn't do anything about symlink traversal. (Linux has openat2(2), which is quite configurable. The proposed *In functions are essentially openat2 with the RESOLVE_BENEATH flag.)
Fair enough. Should the fs.OpenFile.Open method also be named OpenIn?
Should the fs.OpenFile.Open method also be named OpenIn?
Probably, for consistency.
Are we missing RemoveIn?
We should probably have RemoveIn as well:
// RemoveIn removes the named file or (empty) directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Remove].
func RemoveIn(parent, name string) error
// Remove removes the named file or (empty) directory
// in the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) RemoveIn(name string) errorPerhaps also RemoveAllIn?
This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
โ rsc for the proposal review group
Maybe it would be better if the parent was an fs.FS? Seems more widely applicable, if somewhat more complex.
func RemoveIn(parent, name string) error
Note that Windows does not provide (AFAIK) an unlinkat counterpart. It will have to be emulated doing something like:
func RemoveIn(parent, name string) error
f, err := os.OpenIn(parent, name)
if err != nil {
return err
}
return syscall.SetFileInformationByHandle(f.Fd(), syscall.FileDispositionInfo, ...)
}As a user, when should I use os.Open vs os.OpenIn? Should I continue to default to os.Open, and only use OpenIn when I am actively avoiding a security issue, or should my default be OpenIn now?
Maybe it would be better if the parent was an fs.FS?
The os package file functions operate on the local filesystem. fs.FS is an abstraction over a filesystem; it sits atop the os package functions, not under them.
If we want to add support for OpenIn on fs.FS filesystems, we would want something like #67002 (comment). We could add that to this proposal if we want, but for now I'm keeping this proposal focused on the os package.
Note that Windows does not provide (AFAIK) an unlinkat counterpart.
I think that's fine. This proposal requires varying degrees of implementation depending on platform already. (Linux has the very nice openat2 with RESOLVE_BENEATH, platforms without an equivalent are going to require us to do more work to produce equivalent behavior.)
If it's not possible to emulate unlinkat on Windows, that might be a problem, but it sounds like it should be possible.
As a user, when should I use os.Open vs os.OpenIn?
You should use OpenIn when you want to open a file within a directory.
I don't know how to give comprehensive guidance on when to use one vs. the other; the two functions behave differently and you should use the one that suits your specific purposes. If you're writing a command-line tool that accepts an input filename from the user, you probably want to use os.Open. If you're writing a tool that decompresses an archive, you probably want to use os.OpenIn to ensure that the output doesn't escape from the destination directory.
The FS OpenFile looks good, yes. Somehow I skipped that comment, sorry.
A property of openat is that it follows a file across renames: If you open a directory, rename the directory, and use openat on the still-open FD, access is relative to the directory's new location. We cannot support this behavior on platforms which don't have openat or an equivalent (plan9 and js). We could fall back to operating purely on filenames, such that f.OpenIn(x) is equivalent to os.OpenIn(f.Name(), x). However, this seems potentially hazardous. I propose, therefore, that File.CreateIn, File.OpenIn, and File.OpenFileIn return an errors.ErrUnsupported error on these platforms.
I don't really understand this. What is a program supposed to do if File.Open returns ErrUnsupported? Either it can give up and report an error to the user, meaning that the program simply doesn't work on plan9 or js, Or it can implement the fallback manually,
f, err := parent.Open(filename)
if err == os.ErrUnsupported {
f, err = os.OpenIn(parent.Name(), filename)
//or even: os.Open(path.Join(parent.Name(), filename))
}
which is exactly the "hazardous" behaviour you say you're trying to avoid. If the underlying platform truly has no equivalent to openat, though, then there's no other reasonable fallback. Returning ErrUnsupported is just creating more work for developers for no tangible benefit.
I think this could be addressed perfectly well in the docs by saying that some platforms (linux, windows, etc) provide extra guarantees around renamed files, and that others (plan9 and js) do not.
If the underlying platform truly has no equivalent to openat, though, then there's no other reasonable fallback.
The question is whether this is a reasonable fallback or not.
In the case of os.OpenIn, I think it's reasonable to fall back to a less-secure implementation. Lacking openat, we can statically validate the untrusted filename component for unintended traversal (os.OpenIn(dir, "../escapes")), and we can test for symlinks on the path, but we remain vulnerable to TOCTOU attacks. TOCTOU symlink attacks, where an attacker creates a symlink on the path while we're in the process of validating it, are an edge case and I think it's okay for us to support os.OpenIn on platforms where we can't defend against them (plan9 and js).
In the case of os.File.OpenIn, however, there are valid operations that we simply can't support without openat or the equivalent. With openat, you can open a directory, rename or even delete it, and then continue to access files in that directory. There's no way to simulate this with operations on the directory's filename.
Perhaps it's okay to say that os.File.OpenIn behaves differently on plan9 and js, and that users who need the ability to follow a directory across renames/deletes are responsible for not trying to do so on those platforms. Returning an error is the more conservative choice.
I note also that if you don't need the openat behavior of following a directory across renames, you don't need to use os.File.OpenIn at all--you can just always use os.OpenIn.
I don't know how to give comprehensive guidance on when to use one vs. the other; the two functions behave differently and you should use the one that suits your specific purposes. If you're writing a command-line tool that accepts an input filename from the user, you probably want to use os.Open. If you're writing a tool that decompresses an archive, you probably want to use os.OpenIn to ensure that the output doesn't escape from the destination directory.
I would recommend adding guidance in the documentation of the not *In variants calling out that the *In variants exist and recommended for cases when directory escape is not desirable.
Updated proposal, with comments on various changes arising from above discussion and working on implementation.
The OpenFileIn, CreateIn, and OpenIn functions are unchanged from the original proposal:
package os
// OpenFileIn opens the named file in the named directory.
//
// If the file contains relative path components (..), no component may
// refer to a location outside the parent directory. The file may not be
// "", an absolute path, or (on Windows) a reserved device name such as "NUL".
// The file may refer to the directory itself (.).
//
// If any component of the named file references a symbolic link
// referencing a location out of the parent directory,
// OpenFileIn returns an error.
//
// OpenFileIn otherwise behaves like OpenFile.
func OpenFileIn(parent, name string, flag int, perm FileMode) (*File, error)
// CreateIn creates or truncates the named file in the named parent directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Create].
func CreateIn(parent, name string) (*File, error)
// Open opens the named file in the named parent directory for reading.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Open].
func OpenIn(parent, name string) (*File, error)The File methods now all have an In suffix: File.OpenFileIn, File.CreateIn, File.OpenIn. This is clearer overall: For example, f.CreateIn creates a file in the directory f, it doesn't create f. This also resolves an ambiguity between File.Stat and File.StatIn (see below).
package os
// OpenFileIn opens the named file in the directory associated with the file f.
//
// If the file contains relative path components (..), no component may
// refer to a location outside the parent directory. The file may not be
// "", an absolute path, or (on Windows) a reserved device name such as "NUL".
//
// If any component of the named file references a symbolic link
// referencing a location out of the parent directory,
// OpenFileIn returns an error.
func (f *File) OpenFileIn(name string, flag int, perm FileMode) (*File, error)
// CreateIn creates or truncates the named file in
// the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) CreateIn(name string) (*File, error)
// OpenIn opens the named file in the directory associated with the file f for reading.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) OpenIn(name string) (*File, error)To the above, we add MkdirIn, RemoveIn, and StatIn functions and methods. Creating directories and removing files are fundamental operations, and there's no reason to leave them out. DirFSIn (see below) provides a traversal-resistant Stat, so StatIn is included here as well.
Open question: Should we add LstatIn as well? How about SymlinkIn? RenameIn? ReadFileIn and WriteFileIn? On one hand, I don't want to let this proposal get out of hand with an endless array of new functions; on the other hand, some of these do seem useful. I'd appreciate proposal committee's thoughts on where we should draw the line with this proposal.
package os
// MkdirIn creates a new directory in the named parent directory
// with the specified name and permission bits (before umask).
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Mkdir].
func MkdirIn(parent, name string, perm FileMode) error
// MkdirIn creates a new directory in the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) MkdirIn(name string, perm FileMode) error
// RemoveIn removes the named file or (empty) directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Remove].
func RemoveIn(parent, name string) error
// RemoveIn removes the named file or (empty) directory
// in the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) RemoveIn(name string) error
// StatIn returns a FileInfo describing the named file in the named parent directory.
// It applies the same constraints on files as [OpenFileIn].
// It otherwise behaves like [Stat].
func StatIn(parent, name string) (FileInfo, error)
// StatIn returns a FileInfo describing the named file in the directory associated with the file f.
// It applies the same constraints on files as [File.OpenFile].
func (f *File) StatIn(name string) (FileInfo, error)We add os.DirFSIn, a traversal-safe version of os.DirFS.
Open question: DirFSIn or DirInFS? I prefer DirFSIn--"a directory filesystem in (root)", but internal discussion suggested DirInFS might be better.
For the moment, we do not add any new optional interfaces to io/fs, such as fs.OpenFile (see #67002 (comment)).
There are many existing APIs, both in and out of the standard library, that operate on an io/fs.FS. Providing a traversal-resistant FS implementation is a simpler and more effective approach to hardening programs than requiring every API which operates on an FS to check for and use an OpenFile method.
Open question: It seems likely to me that we're going to want more variations on DirFS in the future. For example, it seems reasonable to want an FS that disallows symlink traversal entirely (essentially passing O_NOFOLLOW_ANY to every file open). Therefore, I think DirFSIn should either accept an options struct to allow for future customization, or should return a concrete type with customization methods. ( For example, fs := os.DirFSIn("root"); fs.SetFollowSymlinks(false)). The following returns a concrete type.
package os
// DirFSIn returns a filesystem for the tree of files rooted at the directory dir.
// The directory dir must not be "".
//
// Open calls will resolve symbolic links, but return an error if any link points outside the directory dir.
//
// The returned filesystem implements [io/fs.FS], [io/fs.StatFS], [io/fs.ReadFileFS], and [io/fs.ReadDirFS].
func DirFSIn(dir string) *FS
type FS struct{}
func (fs *FS) Open(name string) (File, error)
func (fs *FS) Stat(name string) (FileInfo, error)
func (fs *FS) ReadFile(name string) ([]byte, error)
func (fs *FS) ReadDir(name string) ([]fs.DirEntry, error)The O_NOFOLLOW_ANY open flag remains unchanged.
Open question: Should we add os.O_NOFOLLOW? I only realized while implementing this proposal that it doesn't exist already. (Existing code which uses the flag uses syscall.O_NOFOLLOW.) On one hand, if we're supporting a portable O_NOFOLLOW_ANY, perhaps we should support a portable O_NOFOLLOW as well. On the other hand, O_NOFOLLOW can be dangerously surprising, since it only prevents symlink resolution in the final filename component, so perhaps we should stick to the more robust form.
const (
// O_NOFOLLOW_ANY, when included in the flags passed to [OpenFile], [OpenFileIn],
// or [File.OpenFile], disallows resolution of symbolic links anywhere in the
// named file.
//
// O_NOFOLLOW_ANY affects the handling of symbolic links in all components
// of the filename. (In contrast, the O_NOFOLLOW flag supported by many
// platforms only affects resolution of the last path component.)
//
// O_NOFOLLOW_ANY does not disallow symbolic links in the parent directory name
// parameter of [OpenFileIn].
//
// O_NOFOLLOW_ANY does not affect traversal of hard links, Windows junctions,
// or Plan 9 bind mounts.
//
// On platforms which support symbolic links but do not provide a way to
// disable symbolic link traversal (GOOS=js), open functions return an error
// if O_NOFOLLOW_ANY is provided.
O_NOFOLLOW_ANY int = (some value)
)Open question: How should we handle .. relative path components in filenames?
Consider the following directory tree:
a/bis a directory.sis a symlink toa/b.f,a/f, anda/b/fare files.
On the Unix command line, if we cat s/../f, we print the contents of the file a/f.
If we open the current directory and openat(curfd, "s/../f"), we also open a/f.
The safeopen package cleans filenames prior to opening a file, so safeopen.OpenBeneath(".", "s/../f") opens the file f. The safeopen package also forbids symlink traversal entirely, so safeopen.OpenBeneath(".", "s/f") returns an error rather than opening a/b/f.
On Windows, things are confusing (and I'm still trying to understand what's going on under the hood): Using NtCreateFile to open a file in "." (the rough equivalent of Unix's openat):
s/../fopensa/f.a/b/../fis an error.
It appears that NtCreateFile will resolve .. path components only if a symlink appears somewhere in the path. This is weird enough that I feel like I must be be missing something.
The question is: What should os.OpenIn(".", "s/../f") do in this case? Options I see include:
- Resolve the symlink
sand the relative path component.., and opena/f. This matches Unixopenatbehavior. - Clean the path prior to opening, performing lexical resolution of
..components, and openf. This matches thesafeopenpackage's behavior. I don't like this option, as it defines a new set of nonstandard filesystem semantics (pathnames are lexically resolved prior to opening). - Disallow relative path components and return an error.
- Disallow symlink resolution and return an error when attempting to open
s. - Disallow both relative path components and symlink resolution.
My current inclination is the first option above: Permit both symlinks and .. path components, and resolve each step of the path in sequence. (So s/../f opens a/f in the above example.) This may be a bit tricky to implement on Windows, but it should be possible.
I can, however, see a good argument for disallowing . and .. relative path components. This simplifies the implementation, there are few if any real-world cases where resolving paths like s/../f is necessary, and users can lexically clean paths with filepath.Clean if desired.
Open question: How should we handle platforms without openat or an equivalent, namely GOOS=plan9 and GOOS=js?
GOOS=js does not permit implementing OpenIn in a fashion free of TOCTOU races (swapping a directory component with a symlink elsewhere on the filesystem). I believe Plan 9 doesn't have symlinks; if that's the case, TOCTOU races are not a concern on it.
GOOS=js and GOOS=plan9 do not permit implementing File.OpenIn correctly. Opening a directory as f, renaming or deleting that directory, and then using f.OpenIn should act on the original directory. Without openat or an equivalent, we have no way to follow the directory handle and the best we can do is act on the original directory path.
I've argued above for supporting OpenIn on these platforms and not supporting File.OpenIn. I think that I've been convinced by arguments above that it's better to support as much of the API as possible, even if platform limitations prevent supporting all of it. I therefore propose that on js and plan9, f.OpenIn("path") behaves equivalently to os.OpenIn(f.Name(), "path").
This is rather extensive API. Perhaps a separate package from os would be better? Maybe os/in?
On a very minor note, Plan 9 can be considered to implement O_NOFOLLOW_ANY because there are no symlinks on Plan 9 at all.
More generally, I understand the motivation here, but the amount of new API is a bit daunting. I think we need to keep thinking about reducing the total amount of API. It seems like there needs to be some type representing the constrained file system. For this message, let's call it a Dir. It would be defined like:
// A Dir represents a root directory in the file system.
// Methods on a Dir can only access files and directories inside that root directory.
// Methods on Dir are safe to be used from multiple goroutines simultaneously.
// After Close is called, methods on Dir return errors.
type Dir struct {
...
}
func OpenDir(name string) (*Dir, error)
func (*Dir) FS() fs.FS
func (*Dir) OpenFile
func (*Dir) Create
func (*Dir) Open
func (*Dir) OpenDir
func (*Dir) Mkdir
func (*Dir) Remove
func (*Dir) MkdirAll
func (*Dir) RemoveAll
func (*Dir) Close
All the top-level convenience things like os.OpenIn can be left out. Code can use OpenDir followed by the operation it wants.
That at least feels like a more manageable amount of API.
I have been thinking for a while and have not come up with a name like more than Dir. It's certainly not perfect, and OpenDir would need a doc comment explaining that it's not opendir(3), but it's not bad.
@rsc With this approach it would be also nice to define some globals, like CWD.
Zig also has a simmilar abstraction in the std, also called Dir https://ziglang.org/documentation/master/std/#std.fs.Dir.
Also
// Methods on Dir are safe to be used from multiple goroutines simultaneously.Is this true for Close?
@mateusz834 Sure, Close can be called from multiple goroutines simultaneously. Same thing is true of os.File.
Re CWD, I disagree with having a global whose meaning changes each time the current directory changes. What is supposed to happen if I do this?
dir := os.CWD
os.Chdir("/tmp")
dir.Open("foo")
Neither answer is good.
If code wants the current current directory [sic], then it can use os.OpenDir("."). Then at least it is clearer what happens with Chdir.
I vote for the name Root rather than Dir.
For the record, other possible methods:
ChmodChownChtimesLchownLstatReadlinkRenameSymlinkTruncate
Those are the top-level os package functions that take file names that can't otherwise be handled with complete safety. (There is also os.Stat and os.Readfile but those can be done with File methods.)
I'm fine with dropping the top-level functions, although I note that "open this file within this directory, where the directory name is trusted and the file name is not" is a very common operation. Perhaps we could limit the top-level functions to CreateIn and OpenIn, for the most common operations?
What's the motivation for a new Dir type as opposed to adding additional methods to File? The number of methods is the same either way, and we've already got methods on File which only apply to directories (ReadDir, Readdir, Readdirnames). Plus io/fs.ReadDirFile defines a directory as a specialization of a file.
If we did want a new Dir (or Root) type, would we want a way to convert a File into a Dir? (f.Dir(), maybe?) And would we want a Dir.ReadDir method for listing the contents of a directory?
The motivation is to put all the "safety" stuff on one type instead of intermixing it with "non-safety" stuff. The names end up cleaner and the code using the APIs should be easier to vet. I don't think we will need a way to turn a File into a Dir; start with the Dir and d.Open(".") if you really need a File. They should be separate concepts.
Put another way, although a Dir is an fd underneath and a File is an fd underneath, that doesn't mean the two are conceptually the same. The Dir type is a safety abstraction, and the fact that it is very like a File in its representation is an implementation detail.
I am sure that other helpers will be wanted on Dir too, including ReadDir, ReadFile, WriteFile. All the more reason for keeping Dir separate from File.
@ianlancetaylor I thought about Root as well but Root has the problem that you'd expect os.Root to be /. Or maybe uid 0. It's already overloaded. And RootedDir seems too long. It's interesting that Zig ended up on a type of the same name, although clicking around on https://ziglang.org/documentation/master/std/#std.fs.Dir I couldn't find anything like a doc comment explaining what concept Dir represents, if any.
How about os.Rooted?
"Rooted" has another common meaning too. ๐
I'm pretty convinced about the API choice of separating all these routines into methods on one type, but I think we need to keep thinking about the name of that type.
I can't think of a better name than Dir. Everything else is too overloaded. (FS is nice, but then that conflicts with io/fs.FS.)
I'm not entirely convinced that it's better to have two types rather than adding more methods to File, given that we already have extensive precedent for treating a File as a directory handle. I think there's going to be user confusion about when you want a File and when you want a Dir.
But perhaps it's fine.
openat(2) allows the absolute path to "escape" the Dir. Do we want to always use the RESOLVE_BENEATH, or maybe this should be configurable? This way the os.Dir might handle more use-cases (simple way to change the relative path location, but not restricting the use of absolute paths).
RE CWD
Zig has a CWD https://ziglang.org/documentation/master/std/#std.fs.cwd, in Zig you have to have a Dir to open a file. openat with AT_FDCWD works the same way as open does, so:
dir := os.CWD
dir.Open("foo")
os.Chdir("/tmp")
dir.Open("foo")
has the same behavior as:
os.Open("foo")
os.Chdir("/tmp")
os.Open("foo")
openat(2) allows the absolute path to "escape" the Dir. Do we want to always use the RESOLVE_BENEATH, or maybe this should be configurable? This way the os.Dir might handle more use-cases (simple way to change the relative path location, but not restricting the use of absolute paths).
I don't want to complicate this proposal with too many options. RESOLVE_BENEATH is the right semantics for almost all use cases. We can always add more options in the future if they turn out to be necessary.
I don't want to complicate this proposal with too many options. RESOLVE_BENEATH is the right semantics for almost all use cases. We can always add more options in the future if they turn out to be necessary.
Yes, of course, but this might influence the API, at this point we have two choices:
- options can be per-method, something like:
OpenFile(resolve ResolveOptions, file string) - options are in the
os.Dirstruct, so you have to create a newDirto change the resolve options.
If we don't consider that from the start, then we are only left with the second one.
Options should be in the Dir struct. You configure it, and then you use it with whatever methods you want.
Want two different kinds of restrictions? Configure two Dir structs.
Otherwise you need all the kinds of options on all kinds of different methods. The whole point of the Dir struct is to hold all that complexity and leave you with the same functions (methods) and signatures as the top-level os funcs.
openat2 was added in kernel 5.6, is is possible to support lower kernel versions? #67001
Linux is the only platform with openat2, so we need an openat-based implementation anyway. Linux versions without openat2 can use it, or we can just use openat everywhere if detecting support for openat2 is difficult.
I don't think we will need a way to turn a File into a Dir; start with the Dir and d.Open(".") if you really need a File.
This only works if you know ahead-of-time if the file you're opening is a directory. I have code that opens a path that may or may not be a directory, and calls File.Stat to decide whether to Read or ReadDir it. If it's a directory, I'd like to be able to convert the File to a Dir and use the new, safer, functions to open its children.
(Unless Dir were allowed to represent non-directories and Open(".") were special-cased, which I think would be quite unintuitive.)
The Dir type assumes you're rooted at a file descriptor, but the original design called for rooting at a string:
func OpenIn(parent, name string) (*File, error)Do Windows, Mac, and the BSDs have something comparable to Linux's openat, which uses file descriptors? If not, then perhaps we shouldn't constrain the API design based solely on how it would be implemented on Linux. The idea of rooting based on a string path seemed appealing to me.
Never mind, OpenDir(string).Open(string) is basically the same thing. :)
Do Windows, Mac, and the BSDs have something comparable to Linux's openat, which uses file descriptors?
Every Unix-like platform we support has openat, including macOS. Some of them have extensions which make implementation simpler (but then we still need to support the lowest common denominator, so it's not actually any simpler), but basic openat is enough.
Plan 9 doesn't have openat, but Plan 9 doesn't have symlinks so lexical path analysis suffices.
Windows has NtCreateFile, which can act quite a bit like openat. I'm going to write up a longer description of Windows path behavior at some point, but the quick version is that it's possible to build these APIs (in any of the versions proposed so far) on Windows as well.
GOOS=js does not have openat or any equivalent (as best I can ascertain), and we can therefore only support a degraded version of these APIs on that platform.
As you've noted, OpenIn is essentially a convenience version of OpenDir(parent).Open(file). I think it's okay for us to leave it out to keep the API surface down, but most of the real-world path traversal vulnerabilities I've observed would have been fixed by changing an os.Open(filepath.Join(parent, file)) call to os.OpenIn(parent, file). Making the safe path as simple as possible seems worthwhile to me.
Let's try Root. Then we can have:
func OpenRoot(dir string) (*Root, error)
func (*Root) FS() fs.FS
func (*Root) OpenFile
func (*Root) Create
func (*Root) Open
func (*Root) OpenDir
func (*Root) Mkdir
func (*Root) Remove
func (*Root) MkdirAll
func (*Root) RemoveAll
func (*Root) Close
func (*Root) Chmod
func (*Root) Chown
func (*Root) Chtimes
func (*Root) Lchown
func (*Root) Lstat
func (*Root) Readlink
func (*Root) Rename
func (*Root) Symlink
func (*Root) Truncate
And then there's an obvious name for the convenience function:
func OpenRooted(dir, name string) (*File, error) {
r, err := OpenRoot(dir)
if err != nil { return nil }
return r.Open(name)
}
Does that look right?
We may want to drop Root.Symlink or else we have to say what it rejects (like absolute paths).
(See the awful cases in #49580 if we go down that path.)
LGTM. I'm partway through implementing (using the name Dir), and haven't run into any significant concerns yet.
Root.Symlink seems fine; it creates a symlink in the root with the given target. The contents of the target are up to the caller. We can be clear in the documentation that Root.Symlink isn't trying to impose any restrictions on the link target.
I do not have a strong opinion on Dir vs. Root as a name. Dir is less overloaded, Root raises fewer questions about why Files can also be directories.
For what it's worth, Dir is also overloaded: you'd have to explain why OpenDir is different from opendir(3) at least.
I wanted to mention that I've been working on a Rust library (usable from C) that does this for the past few years (https://github.com/openSUSE/libpathrs). It has Go bindings and protects against a wide variety of attacks (including fairly esoteric /proc attacks that container runtimes care about).
(For context, I wrote openat2 for Linux. libpathrs was meant to be released around the same time but I was so busy with other things, which meant I've only gotten back to working on it in earnest this year. The basic API described by this proposal was basically done back in 2020, but I've reworked the API to be more easier to use from C and added a bunch of extra /proc hardening measures.)
For a pure Go implementation, there is also the new API I wrote for https://github.com/cyphar/filepath-securejoin (the insecure old API was a proposal in #20126). This only provides a safe open and MkdirAll primitive at the moment but the rest is fairly easy to add once you have that. It also has some /proc hardening measures, though on older kernels libpathrs is slightly more secure. This code is licensed under the same license as Go, so I can work on adapting it to be usable as part of the standard library if that's the path folks want to go.
(Both of these libraries only support Linux at the moment, but I'd be happy to take patches for Windows or FreeBSD support. I guess @neild would prefer this to be a pure Go library. I wrote libpathrs to be library usable from C so that it can get more widespread usage by other system tools like systemd and container runtimes not written in Go.)
If this API is going to be added to the stdlib, it should probably be a separate library for a while to make sure all of the bugs and API subtleties are ironed out. I don't want us to end up with an API that isn't actually usable by the vast majority of applications (container runtimes have very strict security requirements for these kinds of operations, and I would like to use something from the stdlib if it were to get merged there). There are also quite a few subtleties when implementing these APIs that you need to keep in mind. While most are fairly easy to write once you have a safe openat2-like primitive, operations like os.MkdirAll require a lot more work.
Regarding which RESOLVE_* behaviour is best:
RESOLVE_NO_SYMLINKSis not actually usable as an API for most people. The vast majority of tar archives and other filesystems contain symlinks. Programs in general need to resolve symlinks (there are cases where users want to block all symlink resolution, but that is the minority of cases).RESOLVE_BENEATHis fine for some users but tar extraction and container runtimes wantRESOLVE_IN_ROOTsemantics. However, for operating systems other than Linux (and FreeBSD, which has implementedRESOLVE_BENEATH-like semantics), implementingRESOLVE_BENEATHsemantics usingopenatis quite tricky. You cannot walk into..components without having something likereadlink("/proc/self/fd/$n")because walking into..components when an attacker is renaming directories will lead to breakouts. However, as I mention in this LWN article ~40% of symlinks on modern Unix systems contain ".." components.
At the moment, the yet-unreleased libpathrs code supports the vast majority of operations you need, with Go bindings. I'm currently working on implementing os.MkdirAll and will implement os.RemoveAll afterwards (though Go's implementation of the latter is already safe thanks to @kolyshkin.) github.com/cyphar/filepath-securejoin already implements most of the bits you want
func OpenRoot(dir string) (*Root, error)
func (*Root) FS() fs.FS
func (*Root) OpenFile
func (*Root) Create
func (*Root) Open
I wonder if there's overlap here with what FS could be? Perhaps an OpenFS(string) fs.FS could restrict operations to under its tree?
Thanks for the comments, @cyphar. They're fairly far ranging, so apologies if I miss anything:
"Secure join" as proposed in https://go.dev/issue/20126 is a misnomer, because there is no such thing as a "secure path" in the sense of a path which does not escape a directory root. A join operation that resolves symlinks is fundamentally vulnerable to TOCTOU races; it may be useful, but it cannot be considered "secure" (hardened against adversarial behavior). I understand that you know this, but I wanted to state it clearly.
"Safe open" is an ambiguous concept. If we want to restrict a filesystem operation (open, mkdir, etc.) to a certain directory, do we:
- Disallow symlinks entirely?
- Disallow symlinks that lead out of the directory?
- Disallow traversal of bind mounts?
- Disallow /proc magic links and similar filesystem concepts?
- Disallow device special files (/dev/*)?
- Disallow reserved filenames like COM1 on Windows?
Different programs have different needs. Container runtimes have a particular and specialized set of requirements, which I don't think I'm qualified to evaluate.
This proposal is not intended to satisfy every possible requirement for limiting filesystem operations to a certain location. Instead, it is focused on a common and relatively simple use case: Perform an operation within a directory, preventing escape via the filename or symlinks. It does not attempt to address bind mounts, /proc, or other such operations.
Symbolic links differ from these other constructs in that creating a symlink is generally not a privileged operation. Anyone can make a link pointing to /etc, or /sbin, or any other location. To the best of my knowledge, creating a bind mount, mounting a procfs filesystem, or creating a device special file requires root permissions on Unix systems. The functions in this proposal are not trying to defend against an attacker who already has root access.
Regarding RESOLVE_* behaviors:
- I agree that most programs do not want RESOLVE_NO_SYMLINKS. This proposal includes the equivalent as an optional feature (O_NOFOLLOW_ANY) for those that do.
- RESOLVE_IN_ROOT is a very specialized behavior, since it changes the interpretation of absolute links. It doesn't seem useful for anything other than Docker. If we want to support this behavior in the future as an option, we can, but I don't think it should be the default and I think this proposal is more than large enough as it stands without introducing it here.
The functions in this proposal may not directly satisfy all use cases, but they do provide an essential building block. We currently have no supported, portable access to the openat family of functions. The functions in this proposal will provide that.
A small update:
There are basically three groups of platforms:
- Ones with openat
- Windows
- Everything else, which is Plan 9 (which has no symlinks), and GOOS=js (which has no openat or equivalent).
I've implemented Root.OpenFile and Root.Mkdir for platforms with openat. I've got Windows mostly working, and I haven't dont Plan 9/JS yet but they should be straightforward. When I have a CL that works on all platforms, I'll clean it up and present it for review. I think OpenFile and Mkdir are sufficient to demonstrate the feasibility of the API.
Methods of Root must not allow escape through TOCTOU races, except where this is infeasible (GOOS=js). I know of two potential sources of races:
-
A path component is converted into a symlink after we examine it but before we traverse it. This is avoided by using openat with O_NOFOLLOW to walk down the path on openat platforms, or NtCreateFile with FILE_OPEN_REPARSE_POINT on Windows.
-
A directory is moved after we open it. For example, we open "a/b/c/../../d" (which does not escape), but the directory "a/b/c" is renamed to "a/b" after we traverse it. This is trickier to avoid. @cyphar's securejoin package defends against this race using a Linux-specific /proc filesystem feature, but this is too platform-specific to be an option for us. (On modern Linux kernels, we can use openat2, which does all the hard work for us, so it's not even very interesting as a Linux-specific optimization.)
(A reasonable question here is why we don't clean the path of .. components at the start, turning (for example) "a/../b" into just "b". This would change the semantics of the path when "a" is a symlink; for example if "a" is a link to "x/y", then "a/b" opens "x/b", not "b". In addition, we need to handle symlinks which themselves contain .. components.)
Another reasonable question is whether we can ignore all that and just say that Root always cleans its input paths. I feel that this isn't the correct behavior: Methods of os.Root should interpret paths in exactly the same way as the local platform does.)
My current implementation defends against this case by restarting the path lookup when it encounters .. components in the path. When opening "a/b/c/../../d", for example, it will open "a", "b", and "c" in sequence, then restart the operation from the root with the new path "a/d". This can be inefficient; "0/1/2/3/4/5/6/7/8/9/x/../target" will need to traverse the initial set of ten components twice. That's unfortunate, but it's reasonably straightforward and comprehensible. We limit the inefficiency by setting a cap on the number of path components that can be traversed in a single lookup operation.
If I'm missing any cases that need to be defended against, or if there's a simpler or more efficient portable approach to defending against directories moving mid-traversal, I'd very much like to hear about it.
@neild Is your WIP public anywhere? I'd be happy to help review it.
A directory is moved after we open it. For example, we open "a/b/c/../../d" (which does not escape), but the directory "a/b/c" is renamed to "a/b" after we traverse it. This is trickier to avoid.
Yeah, this is one of the harder parts to get right and is the reason that libpathrs and securejoin are Linux-only. It's not just renames, MS_MOVE of mounts can also cause issues.
When working on openat2, this was also one of the more annoying things to deal with. openat2 handles this by returning -EAGAIN if you try to walk into a .. and there has been a rename or mount on the system -- then userspace can retry as much as they need (which is fine because in-kernel lookups are much faster than the emulated ones we are doing). userspace can't do that optimisation properly so you need to check if every .. is outside of the root.
My current implementation defends against this case by restarting the path lookup when it encounters .. components in the path. [...] We limit the inefficiency by setting a cap on the number of path components that can be traversed in a single lookup operation.
I suspect it might make more sense to put a cap on the number of lookup restarts (i.e. the number of .. components), rather than number of path components.
(A reasonable question here is why we don't clean the path of .. components at the start, turning (for example) "a/../b" into just "b". This would change the semantics of the path when "a" is a symlink; for example if "a" is a link to "x/y", then "a/b" opens "x/b", not "b". In addition, we need to handle symlinks which themselves contain .. components.)
This behaviour is wrong on Linux -- symlinks are fully resolved component-by-component. Doing a filepath.Clean at any point where the path may contain symlink components will result in incorrect behaviour. I spent a few months fixing this exact bug in several Go projects.
a/../b is not always equivalent to b on Linux. If a is a/b/c then a/../b should be a/b/b. This gets more annoying with nested symlinks. Most users won't notice until they hit some weird bug where Go resolves things differently to the native platform (even worse, it could be based on the kernel version since openat2 does lookups the way you'd expect so only non-openat2 kernels would have the weird behaviour).
https://go.dev/cl/612136 is my current WIP. It's still very rough; I've only got it running on macOS and Windows so far, but I think the general structure is visible now.
This behaviour is wrong on Linux
Yes. If it wasn't clear, I was attempting to describe why early path cleaning is incorrect on Unix systems. (Windows, in contrast, cleans paths early in each operation.)
Change https://go.dev/cl/612136 mentions this issue: os: add Root
I believe https://go.dev/cl/612136 is now in a reviewable state.
It supports Root.OpenFile, Root.Open, Root.Create, Root.Mkdir, and root.OpenRoot on all supported platforms (or at least every platform we run trybots for by default). Extending it to other functions should be straightforward, but I think this suffices to verify the proposed API and implementation.
Proposed Root behavior, including some platform-specific caveats:
// Root represents a directory.
//
// Methods on Root can only access files and directories within that directory.
// If any component of a file name passed to a method of Root references a location
// outside the root, the method returns an error.
// File names may reference the directory itself (.).
//
// File names may contain symbolic links, but symbolic links may not
// reference a location outside the root.
// Symbolic links must not be absolute.
//
// Methods on Root do not prohibit traversal of filesystem boundaries,
// Linux bind mounts, /proc special files, or access to Unix device files.
//
// Methods on Root are safe to be used from multiple goroutines simultaneously.
//
// On most platforms, creating a Root opens a file descriptor or handle referencing
// the directory. If the directory is moved, methods on Root reference the original
// directory.
//
// Root's behavior differs on some platforms:
//
// - When GOOS=windows, file names may not reference Windows reserved device names
// such as NUL and COM1.
// - When GOOS=js, Root is vulnerable to TOCTOU (time-of-check-time-of-use)
// attacks in symlink validation, and cannot ensure that operations will not
// escape the root.
// - When GOOS=plan9 or GOOS=js, Root does not track directories across renames.
// On these platforms, a Root references a directory name, not a file descriptor.
The CL does not yet support more efficient kernel APIs such as Linux's openat2. This should also be straightforward to add as a followup.
The following additional methods of Root are part of this proposal, but not yet implemented.
func (*Root) FS() fs.FS
func (*Root) Remove
func (*Root) MkdirAll
func (*Root) RemoveAll
func (*Root) Chmod
func (*Root) Chown
func (*Root) Chtimes
func (*Root) Lchown
func (*Root) Lstat
func (*Root) Readlink
func (*Root) Rename
func (*Root) Stat
func (*Root) Symlink
func (*Root) Truncate
Plus @rsc's proposed convenience function:
func OpenRooted(dir, name string) (*File, error) {
r, err := OpenRoot(dir)
if err != nil { return nil }
return r.Open(name)
}
Two questions on the convenience function:
- I like OpenInRoot a bit better than OpenRooted, but I'm fine with either. Which one should we use?
- Should we also have CreateInRoot/CreateRooted?
Thank you for putting together the existence proof that this is actually possible. :)
OpenInRoot sounds good. It seems to better suggest what the function does than OpenRooted.
I haven't checked, but I suspect that Open is used far more often than Create, suggesting less need for CreateInRoot. We can also of course add CreateInRoot later.
@neild , the OpenDir and Close methods were in @rsc's list above, but missing from your comment. I just wanted to check that the omission wasn't intentional.
Sorry, that was confusing: I was listing the methods not implemented in https://go.dev/cl/612136. (@rsc's list contains OpenDir, but I think that's intended to be OpenRoot.) OpenRoot and Close are part of the proposed list.
(Also Open, Create, OpenFile, and Mkdir.)
Have all remaining concerns about this proposal been addressed?
The proposal is:
package os
// Root represents a directory.
//
// Methods on Root can only access files and directories within that directory.
// If any component of a file name passed to a method of Root references a location
// outside the root, the method returns an error.
// File names may reference the directory itself (.).
//
// File names may contain symbolic links, but symbolic links may not
// reference a location outside the root.
// Symbolic links must not be absolute.
//
// Methods on Root do not prohibit traversal of filesystem boundaries,
// Linux bind mounts, /proc special files, or access to Unix device files.
//
// Methods on Root are safe to be used from multiple goroutines simultaneously.
//
// On most platforms, creating a Root opens a file descriptor or handle referencing
// the directory. If the directory is moved, methods on Root reference the original
// directory.
//
// Root's behavior differs on some platforms:
//
// - When GOOS=windows, file names may not reference Windows reserved device names
// such as NUL and COM1.
// - When GOOS=js, Root is vulnerable to TOCTOU (time-of-check-time-of-use)
// attacks in symlink validation, and cannot ensure that operations will not
// escape the root.
// - When GOOS=plan9 or GOOS=js, Root does not track directories across renames.
// On these platforms, a Root references a directory name, not a file descriptor
type Root struct { ... }
func OpenRoot(dir string) (*Root, error)
func (*Root) FS() fs.FS
func (*Root) OpenFile
func (*Root) Create
func (*Root) Open
func (*Root) OpenRoot
func (*Root) Close
func (*Root) Mkdir
func (*Root) Remove
func (*Root) MkdirAll
func (*Root) RemoveAll
func (*Root) Chmod
func (*Root) Chown
func (*Root) Chtimes
func (*Root) Lchown
func (*Root) Lstat
func (*Root) Readlink
func (*Root) Rename
func (*Root) Stat
func (*Root) Symlink
func (*Root) Truncate
func OpenInRoot(dir, name string) (*File, error) {
r, err := OpenRoot(dir)
if err != nil { return nil }
return r.Open(name)
}A few comments:
-
It seems strange to have a
Truncateprimitive (whenftruncate(2)exists) while not having alink(2)primitive (do other operating systems not haveftruncate(2)?). I guess not all operating systems have hardlinks? -
This is also somewhat Linux-specific but one other thing to consider is whether
O_NOFOLLOWbehaviour is something we want to support and how the API for that should look. Being able to reference a symlink using anO_PATHis quite useful on Linux, and emulating it is a little ugly (though possible, to be fair). -
I also am not sure about
(*Root).OpenRoot. If the security model is that the contents of aRootare untrusted then it is absolutely not safe to create aRootfrom an attacker-controlled directory. Even in-kernel implementations ofRESOLVE_BENEATHare not completely safe against an attacker tricking you into resolving outside the root by moving the root during resolution. Surely this can be added later if there is actual evidence someone wants this (mis)feature? -
Regarding
MkdirAll-- one fairly annoying aspect of this is dealing with dangling symlinks. Naive userspace resolvers act differently to in-kernel resolvers when dealing with dangling symlinks in theMkdirAllcase, and implementing this correctly is somewhat complicated (see the symlink stack code in filepath-securejoin and libpathrs). If you ever want to support in-kernel resolvers it is necessary to ensure that you don't permitMkdirAllwith dangling symlinks, which requires that kind of symlink stack emulation. It's up to you if you feel this kind of complexity belongs in the Go stdlib. -
I also have some misgivings about
GOOS=js. If the goal of this is to be safe against symlink attacks, then surely the best policy is to return an error (preferably at compile-time) on systems where we can't ensure that? -
As for
RemoveAll,os.RemoveAllis now safe on Linux but not yet on Windows (see #52745) so calling into theos.RemoveAllimplementation on the internalRootfile descriptor is probably the simplest way of implementing this.
It seems strange to have a Truncate primitive (when ftruncate(2) exists) while not having a link(2) primitive (do other operating systems not have ftruncate(2)?).
I think this is an oversight. The list includes every top-level os function which operates on files. We have os.Link, so we should have Root.Link as well:
func (*Root) Link
I agree that Root.Truncate doesn't seem very necessary when File.Truncate exists, but supporting every file operation on a Root is simpler than picking and choosing the "necessary" ones.
This is also somewhat Linux-specific but one other thing to consider is whether O_NOFOLLOW behaviour is something we want to support and how the API for that should look.
Root.OpenFile can be used with O_NOFOLLOW, just like the top-level OpenFIle:
r, err := os.OpenRoot("directory")
f, err := r.OpenFile("a/b/c", os.O_NOFOLLOW, 0) // a and a/b might still be symlinks, c must not b
If we want to offer alternate ways of opening files in the future, we can add methods to Root to change its behavior. For example:
// This is not part of the current proposal.
// It is an example of something we might propose in the future.
r, err := os.OpenRoot("directory")
r.SetNofollowAll(true) // do not follow any symlinks
I also am not sure about (*Root).OpenRoot. If the security model is that the contents of a Root are untrusted then it is absolutely not safe to create a Root from an attacker-controlled directory. Even in-kernel implementations of RESOLVE_BENEATH are not completely safe against an attacker tricking you into resolving outside the root by moving the root during resolution. Surely this can be added later if there is actual evidence someone wants this (mis)feature?
I don't understand the attack vector here. Can you explain in more detail?
A Root prohibits escapes via path traversal and symlinks. If you Root.OpenRoot a subdirectory, the new Root prohibits escapes from that subdirectory. Moving the directory doesn't change that.
Root.OpenRoot is essentially open with O_PATH, and is useful for efficiently implementing functions which traverse a directory tree such as os.RemoveAll.
Regarding MkdirAll -- one fairly annoying aspect of this is dealing with dangling symlinks. Naive userspace resolvers act differently to in-kernel resolvers when dealing with dangling symlinks in the MkdirAll case, and implementing this correctly is somewhat complicated (see the symlink stack code in filepath-securejoin and libpathrs). If you ever want to support in-kernel resolvers it is necessary to ensure that you don't permit MkdirAll with dangling symlinks, which requires that kind of symlink stack emulation. It's up to you if you feel this kind of complexity belongs in the Go stdlib.
I'm afraid I don't understand the concern here. There is no syscall (so far as I know) that implements MkdirAll, so whatever behavior we have here will be in user space.
Is a "naive userspace resolver" one which doesn't implement symlink traversal correctly? If so, I do not believe the path resolution in https://go.dev/cl/612136 is naive.
I also have some misgivings about GOOS=js. If the goal of this is to be safe against symlink attacks, then surely the best policy is to return an error (preferably at compile-time) on systems where we can't ensure that?
Root defends against three classes of attacks:
- Path name traversal, where a filename like "../../../etc/passwd" escapes from the intended root.
- Static symlink traversal, where an attacker instructs a program to create a link out of a root and then traverses it. For example, an attacker might provide a tar archive containing a symlink which the victim then extracts and traverses.
- Dynamic attacks, where an attacker is actively modifying the filesystem to exploit TOCTOU races in the victim.
Without openat or some equivalent, we can still defend against path name traversal and symlink traversal, but we are vulnerable to TOCTOU races in symlink traversal.
There is benefit in defending against traversal. Of the real vulnerabilities that I've seen that Root might defend against, I think a strict majority have been in cases where TOCTOU was not a concern. For example, CVE-2024-3000 was a real and significant vulnerability that didn't involve symlinks at all.
In my initial version of this proposal, I proposed that the new API return an error on GOOS=js. Subsequent discussion convinced me that this was a mistake, and that we should implement as much of the new API as is feasible on each platform. Not supporting os.Root on some platforms will give users a reason not to use it at all.
As for RemoveAll, os.RemoveAll is now safe on Linux but not yet on Windows (see os: RemoveAll susceptible to symlink race #52745) so calling into the os.RemoveAll implementation on the internal Root file descriptor is probably the simplest way of implementing this.
Yes, os.RemoveAll and os.Root.RemoveAll will share a common implementation, and we will fix #52745 (except for GOOS=js) as part of this proposal.
Change https://go.dev/cl/617378 mentions this issue: syscall, internal/syscall/unix: add Openat support for wasip1
Change https://go.dev/cl/617376 mentions this issue: internal/syscall/unix: add Mkdirat and Readlinkat
Change https://go.dev/cl/617377 mentions this issue: internal/syscall/windows: add NtCreateFile
I'm afraid I don't understand the concern here. There is no syscall (so far as I know) that implements MkdirAll, so whatever behavior we have here will be in user space.
Is a "naive userspace resolver" one which doesn't implement symlink traversal correctly? If so, I do not believe the path resolution in https://go.dev/cl/612136 is naive.
Explaining this before you see the issue in practice is a little complicated, but I'll try my best:
The way I implemented MkdirAll was to resolve as many components and then mkdir the remaining components (this is the obvious way of doing it, so I suspect you plan to implement it that way too). In filepath-securejoin this is done with partialLookupInRoot, and in libpathrs this is resolvers::opath::resolve_partial.
The problem is that if the userspace symlink resolver just prepends symlinks to the "remaining path" (which is what your current resolver does), if you hit a dangling symlink the userspace implementation will have behaviour that won't match an openat2-based implementation. For a concrete example:
% mkdir -p tmp/a tmp/b/c
% ln -s ../b/doesnotexist/foo/bar tmp/a/foo
% mkdir -p tmp/a/foo/baz
The final mkdir -p will fail if you implement an openat2 based resolver because resolving tmp/a/foo will fail with ENOENT but it exists so MkdirAll will have to fail with EEXIST or something similar. However, a userspace resolver that doesn't track whether or not you are in the middle of symlink resolution would return a partial lookup of "existingpath=tmp/b remainingpath=/doesnotexist/foo/bar/baz" and so you would end up creating the directory tmp/b/doesnotexist/foo/bar/baz.
The solution I ended up with was creating a "symlink stack" that tracks whether or not the userspace resolver is still in the middle of symlink resolution. The PRs and code I linked to show how I did it in case having a reference might help. https://github.com/cyphar/filepath-securejoin/blob/main/lookup_linux.go is the "non naive" resolver I ended up with as a result.
I only mentioned it because it will make your resolver a fair bit more complicated, and I was wondering whether it makes to have this somewhat ugly thing in the stdlib in the first pass of this API. But then again, os.MkdirAll is kind of important to support (that's why I added it to libpathrs and filepath-securejoin) so I guess it's unavoidable.
I don't understand the attack vector here. Can you explain in more detail?
A Root prohibits escapes via path traversal and symlinks. If you Root.OpenRoot a subdirectory, the new Root prohibits escapes from that subdirectory. Moving the directory doesn't change that.
Root.OpenRoot is essentially open with O_PATH, and is useful for efficiently implementing functions which traverse a directory tree such as os.RemoveAll.
The attacks are similar to chroot-based attacks, when working on RESOLVE_BENEATH there were several attacks I had to find solutions for and while I think that the current implementation is reasonable (it will abort if there is a rename or mount on the system and there is a special path_is_under check before returning), it is entirely possible for there to be some other attack we haven't considered.
The rename attack I was talking about is that the attacker can move the root of the resolution in a way that .. never explicitly crosses the original root point and can thus escape a-la chroot. The practical attack would look like:
- Open root
/foo/bar(/foois attacker-controlled). - Try to resolve
a/../../b. - Attacker moves
/foo/bar/ato/foo/aafter you walk intoa. - The
..escapes/foo/bar.
Now, your retry-based implementation is not immediately vulnerable to this particular attack, but that doesn't mean there isn't some other similar attack that we just can't think of at the moment.
To paraphrase Jann Horn's comments on openat2, we really should not encourage users to try to do scoped resolution inside an attacker-controlled directory. Making this an explicit part of a stdlib API that can never be removed seems somewhat ill-advised to me. Maybe we can add this later if someone actually needs it?
Of the real vulnerabilities that I've seen that Root might defend against, I think a strict majority have been in cases where TOCTOU was not a concern.
I guess this depends on the projects you work on. My experience is that while non-TOCTOU cases do happen a fair bit1, a lot of programs are run by users in contexts where the TOCTOU case becomes important at some point in the future.
You're quite right that securing the more egregious cases is worth doing, but I just can't shake the concern that having slightly different security semantics on different platforms is something that is going to bite people, even if it is documented.
1: One of my first patches against Docker a decade ago was fixing a non-TOCTOU ../../../../ bug in moby/moby#5720 and a non-racing symlink bug in moby/moby#8000 and I've fixed several similar issues in runc over the years (though in the past few years we've been dealing with TOCTOUs, hence why I started working on openat2 and libpathrs).
However, a userspace resolver that doesn't track whether or not you are in the middle of symlink resolution would return a partial lookup of "existingpath=tmp/b remainingpath=/doesnotexist/foo/bar/baz" and so you would end up creating the directory tmp/b/doesnotexist/foo/bar/baz.
It sounds like you're describing an implementation of Root.MkdirAll that's something like:
- Convert the input path into a new path with all symlinks resolved.
- os.MkdirAll this path.
That seems obviously incorrect: Mkdir on a dangling symlink is supposed to fail, but this follows symlinks. It also doesn't seem any simpler than a correct implementation.
A simple and (I believe) correct implementation would be to take the current os.MkdirAll and convert it to call Root.Mkdir rather than os.Mkdir. This implementation is:
- If the input path has a parent directory, recursively MkdirAll the parent.
- Mkdir the input path.
Within a Root, this can be fairly inefficient, because each Mkdir may traverse the entire directory tree. A more efficient approach can walk the hierarchy only once:
- Let
curbe the root. - For each component
pin the input path:cur.Mkdir(p)- Replace
curwithcur.OpenRoot(p).
(We then need a bit of complexity to handle MkdirAll("a/../b"), but it's not too bad.)
I sketched an implementation of this approach in https://go.dev/cl/619076.
To paraphrase Jann Horn's comments on openat2, we really should not encourage users to try to do scoped resolution inside an attacker-controlled directory. Making this an explicit part of a stdlib API that can never be removed seems somewhat ill-advised to me. Maybe we can add this later if someone actually needs it?
Root.OpenRoot is useful for efficient path-walking, since it lets you avoid unnecessary redundant lookups. It's also just convenient for many tasks. For example, the MkdirAll implementation I linked above (https://go.dev/cl/619076) uses successive OpenRoot calls to walk down a path.
It is, of course, possible--probable, even--that we're going to discover attacks that we haven't accounted for. I don't see, however, how Root.OpenRoot is any more likely to be vulnerable than Root.Open. It's exactly the same operation; both open a file. And once you have the file opened, a Root created with Root.OpenRoot is no different from one created with os.OpenRoot.
It sounds like you're describing an implementation of Root.MkdirAll that's something like:
Convert the input path into a new path with all symlinks resolved.
os.MkdirAll this path.That seems obviously incorrect: Mkdir on a dangling symlink is supposed to fail, but this follows symlinks. It also doesn't seem any simpler than a correct implementation.
The second step is done with a very restricted MkdirAll that doesn't allow .. or symlinks, so it's not vulnerable to attacks. But I now see why you want Root.OpenRoot.
A simple and (I believe) correct implementation would be to take the current os.MkdirAll and convert it to call Root.Mkdir rather than os.Mkdir. This implementation is:
If the input path has a parent directory, recursively MkdirAll the parent. Mkdir the input path.
That does work but now you're looking at O(n^3) complexity for a path that is potentially attacker-controlled and is not restricted by PATH_MAX.
Also (and to be fair -- this is a little esoteric and might not be a real problem, but) an operation like MkdirAll("a/b/../c/../d/../e/../f/../g/../") will create all of the directories (a/b, a/c, a/d). Sure, you're creating directories anyway but depending on the particular setup, you could imagine this making it easier to do an DoS by creating directories with lots of entries in a single directory (most filesystems struggle with a single directory containing millions of dentries, and this could be used as a way of amplifying such an attack).
Within a Root, this can be fairly inefficient, because each Mkdir may traverse the entire directory tree. A more efficient approach can walk the hierarchy only once:
Unless I'm missing something this also won't work for symlinks that contain .. (or symlinks to symlinks that contain .., etc etc) and that will be harder to handle than the trivial a/../b in the path argument case.
(FWIW this approach also just doesn't work at all for RESOLVE_IN_ROOT but that's not the usecase you have so w/e.)
I don't see, however, how Root.OpenRoot is any more likely to be vulnerable than Root.Open. It's exactly the same operation; both open a file. And once you have the file opened, a Root created with Root.OpenRoot is no different from one created with os.OpenRoot.
The difference is that Root.OpenRoot is operating on an attacker-controlled directory, while os.OpenRoot is almost always going to be called on a path in an administrator-controlled directory (if you call it on a path in an attacker-controlled directory they can trick you into operating on any host path).
This means an attacker can do rename operations (among other things) on the root itself for Roots created with root.RootOpen while they cannot do that on Roots created with os.OpenRoot.
Root.OpenRoot is useful for efficient path-walking, since it lets you avoid unnecessary redundant lookups. It's also just convenient for many tasks.
How about making it private until someone shows up that has a strong usecase for why it should be part of the API in a way that encourages users to do something that is potentially less safe?
Change https://go.dev/cl/619435 mentions this issue: internal/syscall/windows: add Openat
Unless I'm missing something this also won't work for symlinks that contain .. (or symlinks to symlinks that contain .., etc etc) and that will be harder to handle than the trivial a/../b in the path argument case.
Good point (that'll teach me to put together an example CL without writing tests). The real implementation will need a bit more complexity to handle that case. (I suspect the real implementation will involve generalizing doInRoot from https://go.dev/cl/612136 a bit, since it already handles the necessary bookkeeping.)
How about making it private until someone shows up that has a strong usecase for why it should be part of the API in a way that encourages users to do something that is potentially less safe?
I still don't see how Root.OpenRoot is less safe.
A Root, on platforms with openat, is essentially a file descriptor. Root.Open and Root.OpenRoot both open a file descriptor within a Root; the only difference is the type returned (*File or *Root).
I don't see a scenario where Root.Open and Root.OpenRoot don't share the same set of vulnerabilities: If one can escape the root in some scenario, so can the other. So adding OpenRoot doesn't increase the attack surface. And once a Root is created, it doesn't matter whether it was created by os.OpenRoot or Root.OpenRoot; in both cases, it's just a file descriptor.
Change https://go.dev/cl/620157 mentions this issue: os: use relative paths in a test dir in TestOpenError
A Root, on platforms with openat, is essentially a file descriptor. Root.Open and Root.OpenRoot both open a file descriptor within a Root; the only difference is the type returned (*File or *Root).
The difference is that by design Root.OpenRoot is creating a file descriptor that an attacker can rename. This ability can lead to attacks in other domains -- there is an infamous chroot breakout that relies on this. I don't see a breakout attack at the moment, I'm just suggesting taking a more cautious approach since you can't remove an API from stdlib.
Late to the discussion, but I thought I'd point out that the hcsshim project has an implementation of a similar thing for Windows, used when extracting container images to ensure reparse points cannot be used to break out of the container image's data directory.
Because in that case following a reparse point is always invalid, their version is simpler and rejects any reparse points it encounters, so it does not need to resolve or validate the resulting path for traversal issues.
It's public operations are: OpenRoot, OpenRelative, LinkRelative, RemoveRelative, RemoveAllRelative, MkdirRelative, MkdirAllRelative, LstatRelative, and EnsureNotReparsePointRelative. (That last one is a simple helper, roughly "Could I OpenRelative this path or children thereof successfully?")
One specific thing worth calling out is that LinkRelative is actually func LinkRelative(oldname string, oldroot *os.File, newname string, newroot *os.File), i.e. it can be used to create hardlinks between two different Roots, in this case between the currently-extracted container image's data directory, and a parent container image's data directory.
Apart from that and OpenRelative taking Windows-specific flags (it wraps NtCreateFile pretty loosely) the API seems to match the proposal here.
That said, I'm not sure if the hcsshim code could be reimplemented on top of this new feature because of the requirement to reject any reparse points encountered, rather than resolve them.
Change https://go.dev/cl/620576 mentions this issue: internal/syscall/windows: set write access when O_TRUNC is used
For consistency with #49580, should the Readlink method be ReadLink? Or maybe this is an argument it should be Readlink in #49580.
For the record, os.Root has a somewhat uncomfortable relationship with the io/fs package, though I think it should have little to no effect on this proposal. Looking back, at some point before #67002 (comment), it looks like we resolved some of this tension by adding an FS() fs.FS method, but I didn't see any reasoning.
There seem to be two questions here: Should os.Root implement fs.FS? And should there be an io/fs interface for file systems that support OpenRoot?
os.Root could implement fs.FS, but unfortunately because Go doesn't support covariant method satisfaction, this would require the methods on os.Root to return interface types rather than concrete types. That would in turn narrow them to the read-only APIs provided by fs, unless the caller performed a cumbersome dynamic type assertion.
I think we could add an fs interface supporting OpenRoot. Akin to the other interfaces in fs, OpenRoot would return an fs.FS. os.Root couldn't implement this interface for the reason given above, though the result of (*Root).FS could.
I think we should have Root.Readlink/os.Readlink for consistency.
If we want Root to be consistent with io/fs.ReadLinkFS, then we should make os.Readlink consistent as well. Which I guess is either add os.ReadLink and deprecate os.Readlink, or use Readlink in io/fs. I think using Readlink everywhere is the least amount of ecosystem churn, so I'd lean that way.
As you say, os.Root can't implement io/fs.FS (unless we want to take a detour into adding covariant method satisfaction). The Root.FS method provides a traversal-resistant alternative to os.DirFS.
It might make sense to have an io/fs interface for OpenRoot, but I think that can be a separate proposal. There are many other operations (mostly writes) supported by a Root that don't have an io/fs equivalent.
Should Root have a ReadDir method? The current proposal (#67002 (comment)) doesn't have one.
You can currently list the contents of a Root with Root.Open(".") to get the directory file, and then File.ReadDir. A Root.ReadDir method would be more convenient and efficient.
If we do have a Root.ReadDir, should it match the signature of io/fs.ReadDirFS.ReadDir? (I'm guessing yes.)
func (r *Root) ReadDir(name string) ([]DirEntry, error)
We can add (*Root).Link. ReadDir is another question, but that can easily be a minor follow-up if necessary and as @neild pointed out, you do have access to File.ReadDir.
There's been a fair amount of discussion since I checked for comments. Any further comments?
Presumably you can just use fs.ReadDir(root.FS(), name) which seems short enough already to not need more API.
Have all remaining concerns about this proposal been addressed?
The proposal is:
package os
// Root represents a directory.
//
// Methods on Root can only access files and directories within that directory.
// If any component of a file name passed to a method of Root references a location
// outside the root, the method returns an error.
// File names may reference the directory itself (.).
//
// File names may contain symbolic links, but symbolic links may not
// reference a location outside the root.
// Symbolic links must not be absolute.
//
// Methods on Root do not prohibit traversal of filesystem boundaries,
// Linux bind mounts, /proc special files, or access to Unix device files.
//
// Methods on Root are safe to be used from multiple goroutines simultaneously.
//
// On most platforms, creating a Root opens a file descriptor or handle referencing
// the directory. If the directory is moved, methods on Root reference the original
// directory.
//
// Root's behavior differs on some platforms:
//
// - When GOOS=windows, file names may not reference Windows reserved device names
// such as NUL and COM1.
// - When GOOS=js, Root is vulnerable to TOCTOU (time-of-check-time-of-use)
// attacks in symlink validation, and cannot ensure that operations will not
// escape the root.
// - When GOOS=plan9 or GOOS=js, Root does not track directories across renames.
// On these platforms, a Root references a directory name, not a file descriptor
type Root struct { ... }
func OpenRoot(dir string) (*Root, error)
func (*Root) FS() fs.FS
func (*Root) OpenFile
func (*Root) Create
func (*Root) Open
func (*Root) OpenRoot
func (*Root) Close
func (*Root) Mkdir
func (*Root) Remove
func (*Root) MkdirAll
func (*Root) RemoveAll
func (*Root) Chmod
func (*Root) Chown
func (*Root) Chtimes
func (*Root) Lchown
func (*Root) Lstat
func (*Root) Readlink
func (*Root) Rename
func (*Root) Stat
func (*Root) Symlink
func (*Root) Link
func (*Root) Truncate
func OpenInRoot(dir, name string) (*File, error) {
r, err := OpenRoot(dir)
if err != nil { return nil }
return r.Open(name)
}Based on the discussion above, this proposal seems like a likely accept.
The proposal is:
package os
// Root represents a directory.
//
// Methods on Root can only access files and directories within that directory.
// If any component of a file name passed to a method of Root references a location
// outside the root, the method returns an error.
// File names may reference the directory itself (.).
//
// File names may contain symbolic links, but symbolic links may not
// reference a location outside the root.
// Symbolic links must not be absolute.
//
// Methods on Root do not prohibit traversal of filesystem boundaries,
// Linux bind mounts, /proc special files, or access to Unix device files.
//
// Methods on Root are safe to be used from multiple goroutines simultaneously.
//
// On most platforms, creating a Root opens a file descriptor or handle referencing
// the directory. If the directory is moved, methods on Root reference the original
// directory.
//
// Root's behavior differs on some platforms:
//
// - When GOOS=windows, file names may not reference Windows reserved device names
// such as NUL and COM1.
// - When GOOS=js, Root is vulnerable to TOCTOU (time-of-check-time-of-use)
// attacks in symlink validation, and cannot ensure that operations will not
// escape the root.
// - When GOOS=plan9 or GOOS=js, Root does not track directories across renames.
// On these platforms, a Root references a directory name, not a file descriptor
type Root struct { ... }
func OpenRoot(dir string) (*Root, error)
func (*Root) FS() fs.FS
func (*Root) OpenFile
func (*Root) Create
func (*Root) Open
func (*Root) OpenRoot
func (*Root) Close
func (*Root) Mkdir
func (*Root) Remove
func (*Root) MkdirAll
func (*Root) RemoveAll
func (*Root) Chmod
func (*Root) Chown
func (*Root) Chtimes
func (*Root) Lchown
func (*Root) Lstat
func (*Root) Readlink
func (*Root) Rename
func (*Root) Stat
func (*Root) Symlink
func (*Root) Link
func (*Root) Truncate
func OpenInRoot(dir, name string) (*File, error) {
r, err := OpenRoot(dir)
if err != nil { return nil }
return r.Open(name)
}The concern I had about Root.OpenRoot hasn't really been addressed (the history of chroot-style attacks indicates that this API could result in security issues), but I understand that @neild doesn't seem to consider it an issue so ๐คท. Just leaving this here for the record.
@cyphar, can you point to an explanation of the chroot breakout you're referring to? I'm not familiar with it.
No change in consensus, so accepted. ๐
This issue now tracks the work of implementing the proposal.
The proposal is:
package os
// Root represents a directory.
//
// Methods on Root can only access files and directories within that directory.
// If any component of a file name passed to a method of Root references a location
// outside the root, the method returns an error.
// File names may reference the directory itself (.).
//
// File names may contain symbolic links, but symbolic links may not
// reference a location outside the root.
// Symbolic links must not be absolute.
//
// Methods on Root do not prohibit traversal of filesystem boundaries,
// Linux bind mounts, /proc special files, or access to Unix device files.
//
// Methods on Root are safe to be used from multiple goroutines simultaneously.
//
// On most platforms, creating a Root opens a file descriptor or handle referencing
// the directory. If the directory is moved, methods on Root reference the original
// directory.
//
// Root's behavior differs on some platforms:
//
// - When GOOS=windows, file names may not reference Windows reserved device names
// such as NUL and COM1.
// - When GOOS=js, Root is vulnerable to TOCTOU (time-of-check-time-of-use)
// attacks in symlink validation, and cannot ensure that operations will not
// escape the root.
// - When GOOS=plan9 or GOOS=js, Root does not track directories across renames.
// On these platforms, a Root references a directory name, not a file descriptor
type Root struct { ... }
func OpenRoot(dir string) (*Root, error)
func (*Root) FS() fs.FS
func (*Root) OpenFile
func (*Root) Create
func (*Root) Open
func (*Root) OpenRoot
func (*Root) Close
func (*Root) Mkdir
func (*Root) Remove
func (*Root) MkdirAll
func (*Root) RemoveAll
func (*Root) Chmod
func (*Root) Chown
func (*Root) Chtimes
func (*Root) Lchown
func (*Root) Lstat
func (*Root) Readlink
func (*Root) Rename
func (*Root) Stat
func (*Root) Symlink
func (*Root) Link
func (*Root) Truncate
func OpenInRoot(dir, name string) (*File, error) {
r, err := OpenRoot(dir)
if err != nil { return nil }
return r.Open(name)
}Change https://go.dev/cl/627076 mentions this issue: os: add Root.Remove
Change https://go.dev/cl/627475 mentions this issue: os: add Root.Stat
Change https://go.dev/cl/629518 mentions this issue: os: add Root.FS
Change https://go.dev/cl/629555 mentions this issue: os: add OpenInRoot
Change https://go.dev/cl/629519 mentions this issue: os: add Root.Readlink
Change https://go.dev/cl/629698 mentions this issue: os: add Root.RemoveAll, avoid symlink race in RemoveAll on Windows
My current intent is to submit a subset of this proposal for 1.24, with the remainder following in 1.25. 1.24 will contain:
type Root struct { ... }
func OpenRoot(dir string) (*Root, error)
func (*Root) OpenFile
func (*Root) Create
func (*Root) Open
func (*Root) OpenRoot
func (*Root) Close
func (*Root) Mkdir
func (*Root) Remove
func (*Root) Lstat
func (*Root) Stat
func OpenInRoot(dir, name string) (*File, error)That will leave the following functions for 1.25:
func (*Root) MkdirAll
func (*Root) RemoveAll
func (*Root) Chmod
func (*Root) Chown
func (*Root) Chtimes
func (*Root) Lchown
func (*Root) Readlink
func (*Root) Rename
func (*Root) Symlink
func (*Root) Link
func (*Root) TruncateThe implementation in 1.24 will support all our ports (with the caveats mentioned above for GOOS=js and GOOS=plan9), but not does not take advantage of platform-specific features such as Linux's RESOLVE_BENEATH and Darwin's O_NOFOLLOW_ANY which allow for a more efficient implementation. That will also be a task for 1.25.
AI-generated issue overview
The original post proposes new functions and methods in the os package to enhance file opening safety and mitigate vulnerabilities like directory traversal and unintended symlink traversal. (@neild, issue)
The proposal includes:
- New functions like
OpenFileIn,CreateIn, andOpenInto safely open files within a specified directory, preventing access outside of it. These functions aim to defend against directory traversal, symlinks to external locations, and access to Windows device files. (@neild, issue) - Corresponding methods for the
os.Filetype:File.OpenFileIn,File.CreateIn,File.OpenIn, with similar functionality but relative to the directory associated with the file. (@neild, issue) - A new constant,
O_NOFOLLOW_ANY, as a flag to disable symlink resolution entirely within the filename passed toOpenFile,OpenFileIn, orFile.OpenFile. (@neild, issue) - New traversal-safe functions:
MkdirIn,RemoveIn,StatIn, and their correspondingFilemethods, to cover creating directories, removing files, and getting file information within a given directory. (@neild, issue) os.DirFSIn, a secure version ofos.DirFS, for traversal-resistant file system operations. This aims to address the limitations ofos.DirFSregarding symlink following andChdirinteractions. (@neild, issue)
Discussion Themes
- Relationship to
safeopenand Need for the Proposal: The proposal is acknowledged to be influenced by thesafeopenpackage but introduces new functionalities likeFile.Openmethods andO_NOFOLLOW_ANY, addressing scenarios not covered bysafeopen. (@neild, issue) The need for OS-level support is highlighted due to difficulties in correctly implementing safe file operations without it, particularly in preventing TOCTOU bugs. (@dsnet, issue) - Naming and API Design: Suggestions are made to include suffixes like "At" or "In" in function and method names for clarity. The final proposal leans toward "In" to distinguish them from the standard
openatsystem call and to emphasize their restrictive behavior. (@adonovan, issue), (@neild, issue) Discussion also arises around extending these APIs to theio/fspackage and the possibility of a newfs.OpenFileinterface. (@dsnet, issue), (@neild, issue) - Handling of
..and Symlinks: Different approaches are discussed for handling relative path components (..) and symlinks, including resolving them sequentially like Unixopenat, cleaning the path lexically likesafeopen, disallowing them altogether, or disallowing symlink resolution. The proposal leans towards sequential resolution while acknowledging the potential complexity on Windows and the argument for disallowing.and..for simplification. (@neild, issue) - Platform-Specific Behavior and
openatEquivalents: The proposal acknowledges the lack ofopenatequivalents on some platforms (js, plan9) and proposes different handling strategies, including returning errors or falling back to less secure implementations. The final proposal leans toward providing maximum API support on all platforms, even with limitations. (@magical, issue), (@neild, issue), (@neild, issue) - API Scope and
os.RootType: The extensive API proposed initially prompts suggestions for a separate package (os/in) or a dedicated type likeDirorRoot. The proposal eventually converges on theos.Roottype to encapsulate the safety-related methods. Discussion also involves the inclusion of additional methods likeRemoveIn,RemoveAllIn, and the handling ofO_NOFOLLOW. (@bjorndm, issue), (@rsc, issue), (@neild, issue) Concerns about the security implications ofRoot.OpenRoot, similar tochrootbreakouts, are raised but not fully addressed in the final proposal. (@cyphar, issue)
Resolution
The proposal is accepted. The implementation will be rolled out in two phases: a subset of the API in Go 1.24 and the remaining features in Go 1.25. The initial release will focus on core functionalities like OpenRoot, OpenFile, Create, Open, OpenRoot, Close, Mkdir, Remove, Lstat, Stat, and the convenience function OpenInRoot. Later additions will include other Root methods like MkdirAll, RemoveAll, Chmod, and platform-specific optimizations. (@aclements, issue comment), (@neild, issue comment)
(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)
Change https://go.dev/cl/645718 mentions this issue: os: add Root.Chmod