golang/go

proposal: cmd/go: add ignore directive to go.mod to drop files from module

jniedrauer opened this issue · 16 comments

x/mod hardcodes the max size of a module, zipped or extracted, at 500M. There is no (sane) way to exclude files from a module. This means that if you have a mixed repo with vendored assets and go source code with a go.mod at the root of the repo, you are effectively restricted to a max repo size of 500M.

#37697 suggests respecting the .gitignore file for x/tools/gopls; however, this would not be a good solution to the Go module max size restriction.

A solution to both problems would be to add an ignore section to the go.mod file. For example:

ignore (
    assets
)

(As a temporary workaround to the hardcoded module size, you can create an empty go.mod file in any subdirectory you want to exclude, which causes it to be flagged as a submodule and excluded from the module zip.)
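To make the workaround concrete, here is a minimal sketch of the exclusion rule it relies on: while collecting a module's files, any subdirectory that carries its own go.mod is treated as a separate module and skipped. This is not the x/mod implementation. The names moduleFiles and exampleRepo are hypothetical, and an in-memory file tree stands in for a real repo.

```go
package main

import (
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// moduleFiles lists the files that belong to the module rooted at the top
// of fsys. Any subdirectory containing its own go.mod is skipped entirely.
// (Hypothetical helper for illustration; not part of x/mod.)
func moduleFiles(fsys fs.FS) ([]string, error) {
	var files []string
	err := fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			// The root's own go.mod defines this module, so only
			// nested directories are checked for a module boundary.
			if p != "." {
				if _, statErr := fs.Stat(fsys, path.Join(p, "go.mod")); statErr == nil {
					return fs.SkipDir
				}
			}
			return nil
		}
		files = append(files, p)
		return nil
	})
	return files, err
}

// exampleRepo builds a made-up mixed repo: Go code at the root and an
// assets directory marked with an empty go.mod.
func exampleRepo() fs.FS {
	return fstest.MapFS{
		"go.mod":         {Data: []byte("module example.com/repo\n")},
		"main.go":        {Data: []byte("package main\n")},
		"assets/go.mod":  {Data: nil},
		"assets/big.bin": {Data: []byte("large vendored asset")},
	}
}

func main() {
	files, err := moduleFiles(exampleRepo())
	if err != nil {
		panic(err)
	}
	fmt.Println(files) // prints [go.mod main.go]
}
```

In this sketch the whole assets subtree is pruned in one step, so nothing under it counts toward the module's size.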

mvdan commented

Have you seen #29987? It seems like dropping a go.mod in the subdirectory would exclude that entire directory sub-tree from the module's archive.

CC @jayconrod @bcmills

@mvdan I have. I mentioned this workaround. I consider it a less than ideal hack though. Adding an empty go.mod to non-go source code to exclude it is the kind of behavior that should not be relied on. You also pay a cost when go traverses the entire directory tree. It doesn't exclude directories per se, it excludes individual files that share a prefix with a go.mod file, which means that it does a comparison against every single file. For a sufficiently large number of files, there is a non-trivial performance cost.

mvdan commented

My bad - I misread part of your original issue.

I still think that an extra go.mod is a reasonable solution, though. If performance is the main worry here, I'm sure we could look at optimizing how the go tool handles this case. I'm generally in favor of exhausting all options (including optimizing current features/APIs) before adding new features.

In most cases I'd agree. But this functionality is needed for a few different tools now (x/tools/gopls and x/mod, at least), and I think the complexity of an explicit configuration option is lower than relying on an unintended side effect of x/mod. Putting an empty go.mod file in with source code that's not related to go is definitely not a pretty or intuitive solution.

@jniedrauer, the use of go.mod files to define module boundaries is not an “unintended side effect”. It is an intentional part of the design.

If it is “not a pretty […] solution”, well, that's unfortunate — but we don't generally change Go tooling (or add redundant features) for purely aesthetic reasons.

If it is “not [an] intuitive solution”, that may be a matter of improving documentation and/or user education. (I think it's too early to tell one way or another, given that large projects especially have only been adopting modules for a year or so, and in small projects this sort of repository division is not necessary.)

cc @matloob

I somewhat agree that adding a go.mod file in a subdirectory is not an intuitive solution. It reminds me a bit of using a tools.go file to track tool dependencies (#25922).

It's not clear to me that an ignore directive is the right solution though. As a user, I'd rather have a marker file (.goignore or something) in the directory to be ignored to avoid cluttering the go.mod file of a large repository. Having a marker like that makes it obvious why a directory is or is not included without having to look at another location. This is the same idea as adding a go.mod file though, just with a different name.

Any mechanism that can be used to exclude (non-go source) directory trees as the intended behavior will solve the problem.

I do think an empty go.mod is a bit of a hack. The presence of a go.mod is supposed to communicate "this is a Go module" but in this case it actually means "this is not a Go module". If a file requires a comment to explain why it exists, then it's not ergonomic. But if it's documented and compatibility is maintained in future versions, then it works for me.

I'm in favor of putting it in go.mod because it's better from a discoverability standpoint. Magic files like dockerignore have bitten me many times in the past. They are not immediately obvious, especially if they start with a . and are hidden from initial view. A central source of truth (go.mod) is desirable. There's also not really a precedent for magic files within the Go tooling (with the exception of tools.go, as noted above).

Wondering if this ignore concept could be extended to monorepos with vendor/ dirs that are not go code.

I would love to be able to add something in my go.mod so it knows to always run in mod mode.

I would settle for:

ignore (
    assets
    vendor
)

rsc commented

Even if we wanted to do this, I don't see any viable path forward.
All versions of Go have to agree about which files go into a module,
so that they will all agree about the checksum in go.sum and in the checksum database.

There are already two ways to do this: a dummy go.mod or putting
your assets and Go code in sibling subdirectories of the repo root.
Both of those are feasible today, in contrast to adding an ignore directive.
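The checksum constraint can be illustrated with a small sketch. It assumes a simplified tree hash, loosely modeled on hashing a sorted list of per-file SHA-256 sums; treeHash is a hypothetical helper and its output is not a real go.sum value. The point is only that two Go versions disagreeing about which files belong to a module would compute different hashes for the same module version.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"sort"
)

// treeHash hashes a set of files (name -> content) by feeding a sorted
// "<file sha256>  <name>" summary line per file into an outer SHA-256.
// (Simplified illustration; not the actual go.sum algorithm.)
func treeHash(files map[string]string) string {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names) // fixed order so the hash is deterministic

	h := sha256.New()
	for _, name := range names {
		sum := sha256.Sum256([]byte(files[name]))
		fmt.Fprintf(h, "%x  %s\n", sum[:], name)
	}
	return base64.StdEncoding.EncodeToString(h.Sum(nil))
}

func main() {
	// What a Go version unaware of an ignore directive would include.
	full := map[string]string{
		"go.mod":         "module example.com/repo\n",
		"main.go":        "package main\n",
		"assets/big.bin": "large vendored asset",
	}
	// What a Go version honoring "ignore ( assets )" would include.
	trimmed := map[string]string{
		"go.mod":  "module example.com/repo\n",
		"main.go": "package main\n",
	}
	fmt.Println("hashes match:", treeHash(full) == treeHash(trimmed)) // prints hashes match: false
}
```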

rsc commented

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc Don't we have the Go version directive at the beginning of the go.mod file just for this purpose? So that e.g. the proposed directive is only supported for go.mod files tagged with a sufficiently recent Go version? Or perhaps I've misunderstood what the point of this mechanism is.

@clausecker Unfortunately, that's not sufficient. Suppose a developer uses Go 1.15 and fetches a module with go 1.16. Today, that works (or at least, it can work). The author of the module may use 1.16 features like embed, as long as they're guarded by release tags (like // +build go1.16), and there's some alternative for dependents using 1.15 and earlier. Under this proposal, it would not be safe for Go 1.15 to fetch a module with go 1.16, since the contents might have changed in a way Go 1.15 can't know about.

rsc commented

Based on the discussion above, this proposal seems like a likely decline.
— rsc for the proposal review group

@clausecker, varying the module's contents based on the go directive was proposed in #30369, but it didn't have a clean migration path (and wasn't worth a messy one).

rsc commented

No change in consensus, so declined.
— rsc for the proposal review group

Now that this proposal has been rejected, is there any movement to create a .goignore file? I am only interested in this because of problems with my IDE: #37697
