golang/go

os: treat all non-symlink reparse points as ModeIrregular on Windows

Closed this issue Β· 21 comments

fs.ModeIrregular is useful to know whether a file is a reparse point or not. It is a signal that it needs more processing to fully understand how to handle it. Reparse points started to be treated as fs.ModeIrregular in CL 460595 (submitted by @bcmills), landed in go1.21.

Unfortunately, it did not mark all reparse points as irregular, only those that are not identified with ModeSymlink | ModeDir | ModeNamedPipe | ModeDevice | ModeCharDevice. This limits the usefulness of fs.ModeIrregular, as it is not possible to know, for example, if a directory is a reparse point or not. One can currently infer that all fs.ModeSymlink are reparse points, which currently includes IO_REPARSE_TAG_MOUNT_POINT, the most popular directory reparse point, but that's not a good reason to also properly support less common directory reparse ponts, such as IO_REPARSE_TAG_LX_SYMLINK or vendors defined reparse points.

I propose to treat all reparse points as fs.ModeIrregular, and to modify its definition from non-regular file; nothing else is known about this file to non-regular file; may need special handling.

Worth noting that this will help identifying mounted directories (aka IO_REPARSE_TAG_MOUNT_POINT) as reparse points if we manage to not treat them as symbolic links, that is fixing this TODO.

Edit: I'm opening this issue to discuss potential concerns.

@golang/windows

Change https://go.dev/cl/517815 mentions this issue: os: treat all non-symlink reparse points as ModeIrregular on Windows

Hmm. Lstat never follows reparse points, right? Can that be used to determine whether an arbitrary path is a reparse point?

Or does Lstat report, say, ModeDir for reparse points that are also directory-like?

Hmm. Lstat never follows reparse points, right? Can that be used to determine whether an arbitrary path is a reparse point?

Or does Lstat report, say, ModeDir for reparse points that are also directory-like?

Lstat reports fs.ModeDir for directory-like non-symlink reparse points. Unfortunately it doesn't currently set fs.ModeIrregular on that case. That's why I submitted this proposal.

Hmm. I do think it makes sense to mark directories as irregular if they may require additional special handling, although I also think it makes sense to continue to omit the irregular bit for reparse points that are β€œjust” symlinks, since the os package already does the special handling needed for those.

I gave this another thought, and now I think I've given the wrong solution to the issue of consistently identifing directory reparse points.

IMO we shouldn't be treating any reparse points as a directory, trying to follow them while recursively walking directories is a recipe for problems. In fact, we almost have it right now: symlinks and mount points never have the directory bit set thanks to CL 41830. If we do the same for all reparse points, then the current logic will already mark them as irregular files.

require additional special handling, although I also think it makes sense to continue to omit the irregular bit for reparse points that are β€œjust” symlinks, since the os package already does the special handling needed for those.

Good point, agree.

Change https://go.dev/cl/536655 mentions this issue: os,path/filepath: treat all non-symlink reparse points as irregular files

Is it correct to say that the above change would start causing junctions to stop showing up as symlink'd directories, and start showing up as irregular files?

Given real symlinks are locked behind Developer Mode without elevation, loads of tools use junctions to emulate symlinks (e.g. yarn, pnpm in the JS ecosystem). I'm a little worried that some tooling out there will stop working properly.

Right now, WSL sees junctions as symlinks (when observed via /mnt/c or something). I know the WSL FS works via 9p; given the name/protocol I had kinda assumed it was in Go but maybe not. git-bash also treats junctions as symlinked dirs on Windows (though curiously git itself thinks they are regular directories, something I had tried to change in git-for-windows/git#4383 as it totally breaks recursive symlinks, unfortunately with the justification that Go treated them like symlinks.... πŸ™ƒ ).

Is it correct to say that the above change would start causing junctions to stop showing up as symlink'd directories, and start showing up as irregular files?

Junctions are currently only identified as symlinks, not directories. This prevents the infinite recursion trap for users blindly walking directories. If CL 536655 is approved, then junctions won't be identified as symlinks, but as irregular files.

I'm planning to submit a companion proposal to make query the file reparse tag easier , so users can take informed decisions when implementing different FS operations. Something like this: #63429 (comment).

rsc commented

Per discussion on #63429, even if we adopt a general rule that all reparse points are Irregular, we should also add an exception for dedup points, which are (1) regular enough, and (2) common enough to be worth special casing.

rsc commented

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
β€” rsc for the proposal review group

Just to clarify, for a MOUNT_POINT reparse point:

  • os.Lstat would return a FileInfo for the reparse point itself, with the ModeIrregular bit set and the ModeDir bit unset. (Before this proposal, it would instead be reported as ModeSymlink and not ModeDir.)

  • Because IO_REPARSE_TAG_MOUNT_POINT has the surrogate bit set, os.Stat would return a FileInfo for the underlying directory joined at the mount point, with the ModeDir bit set and the ModeIrregular bit unset. (No change in behavior from today.)

rsc commented

What is the current proposal and is there general agreement for it yet?

rsc commented

Waiting here for the specific proposal. I think we agreed that dedup reparse points will not count as irregular too.

Here comes the specific proposal, affecting os.Lstat and os.Stat:

  • Don't set the fs.ModeSymlink type bit for IO_REPARSE_TAG_MOUNT_POINT reparse points.
  • Set the fs.ModeIrregular bit for all reparse points, except for IO_REPARSE_TAG_SYMLINK and IO_REPARSE_TAG_DEDUP.
  • For files with the fs.ModeIrregular bit set, don't set any other type bit (e.g. fs.ModeDir).
  • Add a GODEBUG setting (winreparsepoint?) that changes the behavior of os.Lstat and os.Stat to restore the old behavior.

This proposal makes two breaking changes:

  • Mount points will no longer be identified as symbolic links, which seems like the right thing to do in the general case.
  • Reparse points pointing to directories won't have the fs.ModeDir set anymore (this was already the case for mount points). This is a safety measure to avoid infinite recursion in directory traversal, as discussed before.
rsc commented

Probably the GODEBUG should seem like something that can be set to true or false. "winreparsepoint" doesn't say what happens to the reparse points. What about winreparseirregular=1 for the new behavior?

rsc commented

Assuming we use winreparseirregular as the GODEBUG, have all remaining concerns been addressed?

@rsc I was rereading #63703 and realized that @bcmills proposed to use the same debug flag, winsymlink, both for #63703 and for this proposal:

At new Go releases (winsymlink=1):

os.Lstat would only report ModeSymlink for IO_REPARSE_TAG_SYMLINK. All other reparse tags β€” including mount points β€” would be reported as either ModeIrregular or regular files (for DEDUP reparse points in particular) per #61893 and #63429.

Using a single GODEBUG is fine with me, as these two proposals are very related.

rsc commented

Based on the discussion above, this proposal seems like a likely accept.
β€” rsc for the proposal review group

The proposal is detailed in #61893 (comment).

rsc commented

No change in consensus, so accepted. πŸŽ‰
This issue now tracks the work of implementing the proposal.
β€” rsc for the proposal review group

The proposal is detailed in #61893 (comment).

Change https://go.dev/cl/565136 mentions this issue: os: don't treat mount points as symbolic links

I want to clarify this point of the proposal, which might lead to confusion:

  • Reparse points pointing to directories won't have the fs.ModeDir set anymore (this was already the case for mount points).

By Reparse points pointing to directories I meant surrogate reparse points. If a reparse point is not a surrogate, then it doesn't reference a directory, it is a directory, and therefore should be reported as such by os.Lstat. Symlinks and mount points are surrogate reparse points, so this confusion doesn't affect them.