golang/go

cmd/go: module graph pruning and lazy module loading

bcmills opened this issue ยท 160 comments

I propose to change cmd/go to avoid loading transitive module dependencies
that have no observable effect on the packages to be built.

The key insights that lead to this approach are:

  1. If no package in a given dependency module is ever (even transitively)
    imported by any package loaded by an invocation of the go command, then an
    incompatibility between any package in that dependency and any other package
    has no observable effect in the resulting program(s). Therefore, we can
    safely ignore the (transitive) requirements of any module that does not
    contribute any package to the build.

  2. We can use the explicit requirements of the main module as a coarse filter
    on the set of modules relevant to the main module and to previous
    invocations of the go command.

Based on those insights, I propose to change the go command to retain more
transitive dependencies in go.mod files and to avoid loading go.mod files
for โ€œirrelevantโ€ modules, while still maintaining high reproducibility for build
and test operations.

Design doc: http://golang.org/design/36460-lazy-module-loading.

Change https://golang.org/cl/217557 mentions this issue: cmd/go: add a regression test for package import cycles guarded by build tags

Change https://golang.org/cl/220080 mentions this issue: design/36460: add design for lazy module loading

A detailed design is in CL 220080.

(You can view the rendered Markdown by opening the .md file at the latest patch set and clicking the gitiles link.)

For modules that are tidy:

The module versions recorded in the go.mod file would be exactly those listed in vendor/modules.txt, if present.

๐Ÿ‘ I think that is a nice property to have. I have some projects for which go mod vendor is sane, but go mod tidy updates the go.mod file in undesirable ways due to test-of-test dependencies. If I understand this proposal correctly that problem would go away.

I added a few examples (in our usual cmd/go script test format) to the design doc.

@ChrisHines, the third example explicitly covers the behavior for test-of-test dependencies.

CC @julieqiu since this may affect how we display dependencies on pkg.go.dev.

CC @heschik, @dmitshur because this may affect the package search algorithms used by goimports and godoc.

CC @bep because I seem to recall that Hugo is doing something unusual with module dependencies independent of Go package imports. (But note that we are unlikely to actively support the use of Go modules for purposes other than compiling Go source code.)

bep commented

But note that we are unlikely to actively support the use of Go modules for purposes other than compiling Go source code.

Our unusual usage is fairly well anchored in the "Go project". That said, I have read your design document twice, and I think this is a good thing (even for our unusual use) -- but it would be easier to determine with a prototype (I find this topic to be a little bit of a mind twister and I don't fully understand the examples in the document).

@bcmills This is excellent, thank you.

I think I follow the detail, but have a couple of questions which might well prove otherwise ๐Ÿ˜„

Since the all pattern is based on package imports (more-or-less independent of module dependencies), this change should be independent of the go version specified in the go.mod file.

I'm not sure what this sentence means. In the context of changing the definition of the all package pattern, does this not imply the change to the definition is contingent on the go version specified in the go.mod file? Indeed, doesn't the next paragraph go on to say exactly that, in addition to saying the cmd/go version is also relevant?

Again related to the proposed change in the definition of the all package pattern, given the proposed changes to maintenance via go mod tidy of the main module's go.mod file, does this mean that go test all (assuming the previous, i.e. current, definition of all for one second) is no longer reproducible in the case of incomplete go.mod files? Do we need to retain a special pattern for the previous definition of package all? I'm not sure what the main driver was for the original all definition so can't really offer much other than vague questions here.

What about go.mod files that are left intentionally untidy? Are there any situation where this is still required/desirable? If so, how does that influence (or not influence) the proposed (behaviour) changes?

Regarding the transition to use the new behaviour. People will, per your proposal, need to specify go 1.15 in their go.mod files. I'm not currently aware of any side effects of making that change (from an earlier version), but perhaps that would be worth confirming via an FAQ?

Thinking about the compatibility section and CI use, if I've understood things correctly, running go mod tidy with go1.13.x on a go.mod file that was otherwise stable with go1.15.x will result in changes, but running with go1.14.x will not. I guess that's another potential FAQ entry for people who will likely add go1.15beta1 to their build matrix and then start seeing surprising diffs post the go mod tidy and go mod verify steps of their CI build?

In the context of changing the definition of the all package pattern, does this not imply the change to the definition is contingent on the go version specified in the go.mod file?

Not really, no: the all package pattern (both before and after this proposal) is a function of the import graph, and only interacts with modules (and go.mod files) to the extent that those modules provide imported packages.

But not that the all module pattern does depend on the version, and for a go 1.15 module, the all module pattern as defined in Go 1.14 does not necessarily cover the all package pattern as defined in Go 1.14.

What about go.mod files that are left intentionally untidy?

Don't do that? ๐Ÿ˜…

Are there any situation where this is still required/desirable? If so, how does that influence (or not influence) the proposed (behaviour) changes?

I would argue that there weren't any situations where that was ever required (or particularly desirable). But if you have specific examples, I'd like to look into them in more depth.

People will, per your proposal, need to specify go 1.15 in their go.mod files. I'm not currently aware of any side effects of making that change (from an earlier version), but perhaps that would be worth confirming via an FAQ?

Currently the only other effect on module-mode semantics is that the transition from go 1.13 or lower to go 1.14 or higher also enables automatic use of -mod=vendor when the main module has a vendor subtree. Beyond that, the new language features that come along with the version are all strict additions so far.

Thinking about the compatibility section and CI use, if I've understood things correctly, running go mod tidy with go1.13.x on a go.mod file that was otherwise stable with go1.15.x will result in changes, but running with go1.14.x will not.

go mod tidy with any older version, including 1.14, will demolish the go 1.15 invariants.

The difference between 1.13 and 1.14 is that 1.14 will not demolish those invariants unless you invoke go mod tidy explicitly.

Note that we have fixed bugs in go mod tidy in nearly every release so far, so I hope that by now folks have realized that go mod tidy-ness is version-dependent, much like go fmt cleanness. (And note that go fmt with an older Go version may choke on new language features, so it, too, is not something you want to run with older releases in CI.)

I would argue that there weren't any situations where that was ever required (or particularly desirable). But if you have specific examples, I'd like to look into them in more depth.

This was the comment I had in mind when asking:

https://go-review.googlesource.com/c/tools/+/184018/2#message-7612508883ea3ee1f11a807241422e9b4e138da6

which refers through to #27899. I've not followed that in detail, so just raising for completeness; the point may now be moot.

Currently the only other effect on module-mode semantics is that the transition from go 1.13 or lower to go 1.14 or higher also enables automatic use of -mod=vendor when the main module has a vendor subtree.

Very much a tangent, but are these changes documented somewhere centrally? That would be an incredibly useful guide to start and maintain (vs having the details scattered in various parts of the docs).

This was the comment I had in mind when asking: โ€ฆ

Ah, now I remember. The untidiness there was โ€œlarger than tidyโ€ rather than โ€œsmaller than tidyโ€, and I think we have basically settled on the // +build approach for that use-case.

are these [go version] changes documented somewhere centrally?

Not yet. That's arguably part of #30791...

rsc commented

This is a fantastic design + design doc, thanks so much for the attention to detail.

When I was designing the original load, I neglected to remember that "software engineering is what happens to programming when you add time and other programmers". Specifically I forgot the time axis on package versions and missed that you might need to load the history for consistent results. I'm really thrilled that this design handles the time axis efficiently (by never needing to look into history) while still providing strong consistency and repeatability guarantees.

It would be good to get this into Go 1.15, of course. I've had this on the review minutes for two weeks now, and the reaction here seems almost entirely positive.

Does anyone have any objections to moving forward with this proposal?

Change https://golang.org/cl/222245 mentions this issue: 36460-lazy-module-loading: relax redundant 'go.mod' loads and add another example

Change https://golang.org/cl/222341 mentions this issue: cmd/go: add baseline test cases for non-lazy module loading

Change https://golang.org/cl/222340 mentions this issue: cmd/go/internal/modload: factor smaller files out of load.go and init.go

Change https://golang.org/cl/222340 mentions this issue: cmd/go/internal/modload: factor smaller files out of load.go and init.go

rsc commented

Based on the discussion above, this seems like a likely accept.

rsc commented

No change in consensus, so accepted.

Change https://golang.org/cl/225619 mentions this issue: content/static/doc: add module documentation on communication with proxies

bep commented

@bcmills It would be good with a heads-up for when the master branch have something that can be tested.

rasky commented

Letโ€™s say Iโ€™m writing an application that is made by a single main package. My main package uses package B1 in module B.

Module B also contains a completely unrelated package B2 (for instance a main package under a cmd directory) that is never imported by B1. B2 imports C2 in module C.

It is my understanding that, after this proposal is implemented, I should not see anymore module C in my go.mod. Is that correct? Will this also work if Bโ€™s go.mod is marked with version 1.14?

@rasky

It is my understanding that, after this proposal is implemented, I should not see anymore module C in my go.mod. Is that correct?

Correct.

However, you will still see it in go list -m all โ€” and perhaps in your go.sum file โ€” because it is an explicit dependency of a module (B) that does provide an imported package.

(We can't tell whether C2 is a test dependency of B1 without loading imports, so in go list -m โ€” which does not load imports โ€” we have to assume that it may be.)

Will this also work if Bโ€™s go.mod is marked with version 1.14?

Yes, but go list -m all will follow dependencies of your 1.14 dependencies more aggressively.

Unfortunately, due to the coronavirus disruption (particularly the closure of my daughter's school) I don't think I'll be able to land this until 1.16.

(I haven't made much progress on implementation for 1.15, and we don't want to rush a major change like this one and have it land late in the cycle with hard-to-fix or undetected bugs.)

Change https://golang.org/cl/240623 mentions this issue: cmd/go/internal/modload: implement the "all" pattern for lazy loading

Change https://golang.org/cl/240622 mentions this issue: cmd/go/internal/modload: add a "v" prefix to the indexed go version

Change https://golang.org/cl/240505 mentions this issue: cmd/go/internal/modload: refactor load.go to prepare for lazy loading

Change https://golang.org/cl/244078 mentions this issue: cmd/go/internal/modload: cache the Go language version for each module globally

Change https://golang.org/cl/244760 mentions this issue: cmd/go/internal/mvs: factor out an incremental implementation

Change https://golang.org/cl/244759 mentions this issue: cmd/go/internal/mvs: reverse the order of BuildListError.stack

Change https://golang.org/cl/244774 mentions this issue: cmd/go/internal/modload: cache parsed go.mod files globally

Change https://golang.org/cl/244773 mentions this issue: cmd/go/internal/modload: drop requirements on excluded versions

Change https://golang.org/cl/247764 mentions this issue: cmd/go/internal/mvs: export a NewBuildListError function

Change https://golang.org/cl/247766 mentions this issue: cmd/go/internal/par: add Queue as a simpler alternative to Work

This issue is currently labeled as early-in-cycle for Go 1.16.
That time is now, so this is a friendly ping so the issue is looked at again.

Change https://golang.org/cl/249878 mentions this issue: cmd/go/internal/modload: rename LoadBuildList and BuildList

Change https://golang.org/cl/249877 mentions this issue: cmd/go/internal/modget: move MVS code to a separate file

Change https://golang.org/cl/250338 mentions this issue: cmd/go/internal/modload: address issues missed in CL 244774

Change https://golang.org/cl/251445 mentions this issue: cmd/go/internal/modload: rework import resolution

Change https://golang.org/cl/251997 mentions this issue: cmd/go: simplify and document lazy-loading test cases

Change https://golang.org/cl/251998 mentions this issue: cmd/go/internal/modload: refactor pathInModuleCache

Change https://golang.org/cl/254820 mentions this issue: cmd/go/internal/modget: factor out functions for argument resolution

Change https://golang.org/cl/255938 mentions this issue: cmd/go/internal/modget: consolidate Load entrypoints

Change https://golang.org/cl/256057 mentions this issue: cmd/go/internal/modload: eliminate QueryPackage

Change https://golang.org/cl/261640 mentions this issue: cmd/go/internal/modload: avoid using the global build list in QueryPattern

Change https://golang.org/cl/263138 mentions this issue: cmd/go/internal/modload: embed PackageOpts in loaderParams

Change https://golang.org/cl/266018 mentions this issue: cmd/go: allow 'go get' to downgrade to replacement-only versions

Change https://golang.org/cl/263267 mentions this issue: cmd/go/internal/modget: resolve paths at the requested versions

Change https://golang.org/cl/266369 mentions this issue: cmd/go: make TestScript/mod_get_patchmod self-contained

Change https://golang.org/cl/270979 mentions this issue: cmd/go/internal/modcmd: eliminate a call to modload.LoadedModules

Change https://golang.org/cl/270980 mentions this issue: cmd/go/internal/modload: remove SetBuildList

Change https://golang.org/cl/271419 mentions this issue: cmd/go/internal/work: avoid modload.Selected in 'go install pkg@version'

Change https://golang.org/cl/265777 mentions this issue: cmd/go/internal/modload: implement lazy loading

Change https://golang.org/cl/271646 mentions this issue: cmd/go/internal/work: remove a redundant call to modload.LoadedModules

Change https://golang.org/cl/271647 mentions this issue: cmd/go/internal/modload: eliminate LoadedModules

Change https://golang.org/cl/272128 mentions this issue: cmd/go/internal/modload: remove a stale comment for EditBuildList

Change https://golang.org/cl/272129 mentions this issue: cmd/go/internal/modload: remove the Reqs function

Unfortunately, this feature didn't make the cut for Go 1.16. I'm going to try to have it ready to merge as soon as the tree opens for 1.17.

tandr commented

The list of big changes for 1.17 is getting quite large. Would Go team consider to cut one in between (1.16-2 or something), or shorten 1.17 cycle to the point so this feature would go in before generics land on a separate release (1.18)?

Change https://golang.org/cl/275172 mentions this issue: cmd/go/internal/modload: rename constants to reflect that lazy loading is not yet implemented

Change https://golang.org/cl/275296 mentions this issue: cmd/go/internal/modload: split EditBuildList into Require and Select

Change https://golang.org/cl/278596 mentions this issue: cmd/go/internal/modload: delete unused *mvsReqs.next method

This is maybe silly to mention but it seems worth it to me; the proposal says

An invocation of the go command should either load the same version of each package as every other invocation since the last edit to the go.mod file, or should edit the go.mod file in a way that causes the next invocation on any subset of the same packages to use the same versions

But recently the tool changed to only modify go.mod via go mod or go get. I presume that will be maintained here, but it might be worth clarifying the doc?

I am sorry if this is a dumb question, but does this change have any effect on binary size? In other worlds, would it also avoid compiling in unused packages from imported modules (parts of code not used in my binary)?

Will this change also affect (reduce) the content of go.sum? Test dependencies of dependencies are polluting go.sum (especially when test dependencies of test dependencies are producing dependency spirals; example).

Edit: the answer that the spiral case of test dependencies will be fixed by this change according to this comment by @myitcv.

@Jacalz, we do not expect this change to have any effect on binary size. (Lazy loading will generally reduce the set of versions recorded in the go.sum file, but not the set of packages โ€” or modules containing packages โ€” used to compile the resulting binary.)

Change https://golang.org/cl/287632 mentions this issue: cmd/go/internal/mvs: don't emit duplicates from Req

Change https://golang.org/cl/287633 mentions this issue: cmd/go/internal/mvs: fix Downgrade to match Algorithm 4

Change https://golang.org/cl/287732 mentions this issue: cmd/go/internal/mvs: clarify and annotate test cases

Change https://golang.org/cl/288535 mentions this issue: cmd/go: add script tests for potential upgrades due to downgrades

Change https://golang.org/cl/289696 mentions this issue: cmd/go: add a script test for artifacts resulting from 'go get -u'

Change https://golang.org/cl/289697 mentions this issue: cmd/go/internal/modload: split resolveCandidates into two methods

Change https://golang.org/cl/290769 mentions this issue: cmd/go/internal/modload: add an unused copy of mvs.Downgrade

Change https://golang.org/cl/290770 mentions this issue: cmd/go/internal/modload: fuse upgrading with downgrading in EditBuildList

Change https://golang.org/cl/290771 mentions this issue: cmd/go/internal/mvs: split Reqs into narrower per-function interfaces

Change https://golang.org/cl/293689 mentions this issue: cmd/go/internal/modload: replace the global buildList with structured requirements

Change https://golang.org/cl/294292 mentions this issue: cmd/go/internal/mvs: add test cases for downgrade interaction with hidden versions

Change https://golang.org/cl/294293 mentions this issue: cmd/go: add a script test corresponding to the downhiddenartifact MVS test

Change https://golang.org/cl/294294 mentions this issue: cmd/go/internal/mvs: prune spurious dependencies in Downgrade

This issue is currently labeled as early-in-cycle for Go 1.17.
That time is now, so a friendly reminder to look at it again.

Change https://golang.org/cl/296609 mentions this issue: cmd/go/internal/modload: make EditBuildList report whether the build list was changed

Change https://golang.org/cl/298350 mentions this issue: cmd/go/internal/modload: refactor checkSums to clarify trimming logic

Change https://golang.org/cl/298651 mentions this issue: cmd/go: reject 'go list -m MOD@patch' when no existing version of MOD is required

Change https://golang.org/cl/300158 mentions this issue: cmd/go: test that 'go mod tidy' retains upgraded indirect dependencies

Change https://golang.org/cl/301371 mentions this issue: cmd/go/internal/modload: in readonly mode, do not read go.mod files missing checksums

Change https://golang.org/cl/301370 mentions this issue: cmd/go/internal/modcmd: in 'go mod tidy', suspend go.mod writes until tidy

Change https://golang.org/cl/303431 mentions this issue: cmd/go/internal/modload: remove go116EnableNarrowAll constant

Change https://golang.org/cl/303869 mentions this issue: cmd/go: attribute direct imports from indirect dependencies to the importing package

Change https://golang.org/cl/304909 mentions this issue: cmd/go/internal/modload: factor out a method to update loader requirements

Change https://golang.org/cl/305009 mentions this issue: cmd/go/internal/modload: track conflicts in versionLimiter

Change https://golang.org/cl/308509 mentions this issue: cmd/go: assume Go 1.16 semantics uniformly for unversioned modules

Change https://golang.org/cl/308516 mentions this issue: cmd/go/internal/modload: avoid loading the full module graph for imports satisfied by lazy roots

Change https://golang.org/cl/308515 mentions this issue: cmd/go/internal/modload: add a dormant depth type

Change https://golang.org/cl/308517 mentions this issue: cmd/go/internal/modload: migrate editBuildList to use a structured requirement graph

Change https://golang.org/cl/308809 mentions this issue: cmd/go: in TestScript, set GOTRACEBACK and use SIGQUIT to terminate hung subprocesses

Change https://golang.org/cl/308810 mentions this issue: cmd/go/internal/modload: actually set the depth field passed to newRequirements