cmd/go: module graph pruning and lazy module loading
bcmills opened this issue ยท 160 comments
I propose to change cmd/go
to avoid loading transitive module dependencies
that have no observable effect on the packages to be built.
The key insights that lead to this approach are:
-
If no package in a given dependency module is ever (even transitively)
imported by any package loaded by an invocation of thego
command, then an
incompatibility between any package in that dependency and any other package
has no observable effect in the resulting program(s). Therefore, we can
safely ignore the (transitive) requirements of any module that does not
contribute any package to the build. -
We can use the explicit requirements of the main module as a coarse filter
on the set of modules relevant to the main module and to previous
invocations of thego
command.
Based on those insights, I propose to change the go
command to retain more
transitive dependencies in go.mod
files and to avoid loading go.mod
files
for โirrelevantโ modules, while still maintaining high reproducibility for build
and test operations.
Design doc: http://golang.org/design/36460-lazy-module-loading.
Change https://golang.org/cl/217557 mentions this issue: cmd/go: add a regression test for package import cycles guarded by build tags
Change https://golang.org/cl/220080 mentions this issue: design/36460: add design for lazy module loading
A detailed design is in CL 220080.
(You can view the rendered Markdown by opening the .md
file at the latest patch set and clicking the gitiles
link.)
For modules that are tidy:
The module versions recorded in the go.mod file would be exactly those listed in vendor/modules.txt, if present.
๐ I think that is a nice property to have. I have some projects for which go mod vendor
is sane, but go mod tidy
updates the go.mod
file in undesirable ways due to test-of-test dependencies. If I understand this proposal correctly that problem would go away.
I added a few examples (in our usual cmd/go
script test format) to the design doc.
@ChrisHines, the third example explicitly covers the behavior for test-of-test dependencies.
CC @julieqiu since this may affect how we display dependencies on pkg.go.dev
.
CC @heschik, @dmitshur because this may affect the package search algorithms used by goimports
and godoc
.
CC @bep because I seem to recall that Hugo is doing something unusual with module dependencies independent of Go package imports. (But note that we are unlikely to actively support the use of Go modules for purposes other than compiling Go source code.)
But note that we are unlikely to actively support the use of Go modules for purposes other than compiling Go source code.
Our unusual usage is fairly well anchored in the "Go project". That said, I have read your design document twice, and I think this is a good thing (even for our unusual use) -- but it would be easier to determine with a prototype (I find this topic to be a little bit of a mind twister and I don't fully understand the examples in the document).
@bcmills This is excellent, thank you.
I think I follow the detail, but have a couple of questions which might well prove otherwise ๐
Since the all pattern is based on package imports (more-or-less independent of module dependencies), this change should be independent of the go version specified in the go.mod file.
I'm not sure what this sentence means. In the context of changing the definition of the all
package pattern, does this not imply the change to the definition is contingent on the go
version specified in the go.mod
file? Indeed, doesn't the next paragraph go on to say exactly that, in addition to saying the cmd/go
version is also relevant?
Again related to the proposed change in the definition of the all
package pattern, given the proposed changes to maintenance via go mod tidy
of the main module's go.mod
file, does this mean that go test all
(assuming the previous, i.e. current, definition of all
for one second) is no longer reproducible in the case of incomplete go.mod
files? Do we need to retain a special pattern for the previous definition of package all
? I'm not sure what the main driver was for the original all
definition so can't really offer much other than vague questions here.
What about go.mod
files that are left intentionally untidy? Are there any situation where this is still required/desirable? If so, how does that influence (or not influence) the proposed (behaviour) changes?
Regarding the transition to use the new behaviour. People will, per your proposal, need to specify go 1.15
in their go.mod
files. I'm not currently aware of any side effects of making that change (from an earlier version), but perhaps that would be worth confirming via an FAQ?
Thinking about the compatibility section and CI use, if I've understood things correctly, running go mod tidy
with go1.13.x on a go.mod
file that was otherwise stable with go1.15.x will result in changes, but running with go1.14.x will not. I guess that's another potential FAQ entry for people who will likely add go1.15beta1 to their build matrix and then start seeing surprising diffs post the go mod tidy
and go mod verify
steps of their CI build?
In the context of changing the definition of the
all
package pattern, does this not imply the change to the definition is contingent on thego
version specified in thego.mod
file?
Not really, no: the all
package pattern (both before and after this proposal) is a function of the import graph, and only interacts with modules (and go.mod
files) to the extent that those modules provide imported packages.
But not that the all
module pattern does depend on the version, and for a go 1.15
module, the all
module pattern as defined in Go 1.14 does not necessarily cover the all
package pattern as defined in Go 1.14.
What about
go.mod
files that are left intentionally untidy?
Don't do that? ๐
Are there any situation where this is still required/desirable? If so, how does that influence (or not influence) the proposed (behaviour) changes?
I would argue that there weren't any situations where that was ever required (or particularly desirable). But if you have specific examples, I'd like to look into them in more depth.
People will, per your proposal, need to specify
go 1.15
in theirgo.mod
files. I'm not currently aware of any side effects of making that change (from an earlier version), but perhaps that would be worth confirming via an FAQ?
Currently the only other effect on module-mode semantics is that the transition from go 1.13
or lower to go 1.14
or higher also enables automatic use of -mod=vendor
when the main module has a vendor
subtree. Beyond that, the new language features that come along with the version are all strict additions so far.
Thinking about the compatibility section and CI use, if I've understood things correctly, running
go mod tidy
with go1.13.x on ago.mod
file that was otherwise stable with go1.15.x will result in changes, but running with go1.14.x will not.
go mod tidy
with any older version, including 1.14, will demolish the go 1.15
invariants.
The difference between 1.13 and 1.14 is that 1.14 will not demolish those invariants unless you invoke go mod tidy
explicitly.
Note that we have fixed bugs in go mod tidy
in nearly every release so far, so I hope that by now folks have realized that go mod tidy
-ness is version-dependent, much like go fmt
cleanness. (And note that go fmt
with an older Go version may choke on new language features, so it, too, is not something you want to run with older releases in CI.)
I would argue that there weren't any situations where that was ever required (or particularly desirable). But if you have specific examples, I'd like to look into them in more depth.
This was the comment I had in mind when asking:
which refers through to #27899. I've not followed that in detail, so just raising for completeness; the point may now be moot.
Currently the only other effect on module-mode semantics is that the transition from
go 1.13
or lower togo 1.14
or higher also enables automatic use of-mod=vendor
when the main module has avendor
subtree.
Very much a tangent, but are these changes documented somewhere centrally? That would be an incredibly useful guide to start and maintain (vs having the details scattered in various parts of the docs).
This was the comment I had in mind when asking: โฆ
Ah, now I remember. The untidiness there was โlarger than tidyโ rather than โsmaller than tidyโ, and I think we have basically settled on the // +build
approach for that use-case.
are these [
go
version] changes documented somewhere centrally?
Not yet. That's arguably part of #30791...
This is a fantastic design + design doc, thanks so much for the attention to detail.
When I was designing the original load, I neglected to remember that "software engineering is what happens to programming when you add time and other programmers". Specifically I forgot the time axis on package versions and missed that you might need to load the history for consistent results. I'm really thrilled that this design handles the time axis efficiently (by never needing to look into history) while still providing strong consistency and repeatability guarantees.
It would be good to get this into Go 1.15, of course. I've had this on the review minutes for two weeks now, and the reaction here seems almost entirely positive.
Does anyone have any objections to moving forward with this proposal?
Change https://golang.org/cl/222245 mentions this issue: 36460-lazy-module-loading: relax redundant 'go.mod' loads and add another example
Change https://golang.org/cl/222341 mentions this issue: cmd/go: add baseline test cases for non-lazy module loading
Change https://golang.org/cl/222340 mentions this issue: cmd/go/internal/modload: factor smaller files out of load.go and init.go
Change https://golang.org/cl/222340 mentions this issue: cmd/go/internal/modload: factor smaller files out of load.go and init.go
Based on the discussion above, this seems like a likely accept.
No change in consensus, so accepted.
Change https://golang.org/cl/225619 mentions this issue: content/static/doc: add module documentation on communication with proxies
@bcmills It would be good with a heads-up for when the master branch have something that can be tested.
Letโs say Iโm writing an application that is made by a single main package. My main package uses package B1 in module B.
Module B also contains a completely unrelated package B2 (for instance a main package under a cmd directory) that is never imported by B1. B2 imports C2 in module C.
It is my understanding that, after this proposal is implemented, I should not see anymore module C in my go.mod. Is that correct? Will this also work if Bโs go.mod is marked with version 1.14?
It is my understanding that, after this proposal is implemented, I should not see anymore module C in my go.mod. Is that correct?
Correct.
However, you will still see it in go list -m all
โ and perhaps in your go.sum
file โ because it is an explicit dependency of a module (B) that does provide an imported package.
(We can't tell whether C2
is a test dependency of B1
without loading imports, so in go list -m
โ which does not load imports โ we have to assume that it may be.)
Will this also work if Bโs go.mod is marked with version 1.14?
Yes, but go list -m all
will follow dependencies of your 1.14 dependencies more aggressively.
Unfortunately, due to the coronavirus disruption (particularly the closure of my daughter's school) I don't think I'll be able to land this until 1.16.
(I haven't made much progress on implementation for 1.15, and we don't want to rush a major change like this one and have it land late in the cycle with hard-to-fix or undetected bugs.)
Change https://golang.org/cl/240623 mentions this issue: cmd/go/internal/modload: implement the "all" pattern for lazy loading
Change https://golang.org/cl/240622 mentions this issue: cmd/go/internal/modload: add a "v" prefix to the indexed go version
Change https://golang.org/cl/240505 mentions this issue: cmd/go/internal/modload: refactor load.go to prepare for lazy loading
Change https://golang.org/cl/244078 mentions this issue: cmd/go/internal/modload: cache the Go language version for each module globally
Change https://golang.org/cl/244760 mentions this issue: cmd/go/internal/mvs: factor out an incremental implementation
Change https://golang.org/cl/244759 mentions this issue: cmd/go/internal/mvs: reverse the order of BuildListError.stack
Change https://golang.org/cl/244774 mentions this issue: cmd/go/internal/modload: cache parsed go.mod files globally
Change https://golang.org/cl/244773 mentions this issue: cmd/go/internal/modload: drop requirements on excluded versions
Change https://golang.org/cl/247764 mentions this issue: cmd/go/internal/mvs: export a NewBuildListError function
Change https://golang.org/cl/247766 mentions this issue: cmd/go/internal/par: add Queue as a simpler alternative to Work
This issue is currently labeled as early-in-cycle for Go 1.16.
That time is now, so this is a friendly ping so the issue is looked at again.
Change https://golang.org/cl/249878 mentions this issue: cmd/go/internal/modload: rename LoadBuildList and BuildList
Change https://golang.org/cl/249877 mentions this issue: cmd/go/internal/modget: move MVS code to a separate file
Change https://golang.org/cl/250338 mentions this issue: cmd/go/internal/modload: address issues missed in CL 244774
Change https://golang.org/cl/251445 mentions this issue: cmd/go/internal/modload: rework import resolution
Change https://golang.org/cl/251997 mentions this issue: cmd/go: simplify and document lazy-loading test cases
Change https://golang.org/cl/251998 mentions this issue: cmd/go/internal/modload: refactor pathInModuleCache
Change https://golang.org/cl/254820 mentions this issue: cmd/go/internal/modget: factor out functions for argument resolution
Change https://golang.org/cl/255938 mentions this issue: cmd/go/internal/modget: consolidate Load entrypoints
Change https://golang.org/cl/256057 mentions this issue: cmd/go/internal/modload: eliminate QueryPackage
Change https://golang.org/cl/261640 mentions this issue: cmd/go/internal/modload: avoid using the global build list in QueryPattern
Change https://golang.org/cl/263138 mentions this issue: cmd/go/internal/modload: embed PackageOpts in loaderParams
Change https://golang.org/cl/266018 mentions this issue: cmd/go: allow 'go get' to downgrade to replacement-only versions
Change https://golang.org/cl/263267 mentions this issue: cmd/go/internal/modget: resolve paths at the requested versions
Change https://golang.org/cl/266369 mentions this issue: cmd/go: make TestScript/mod_get_patchmod self-contained
Change https://golang.org/cl/270979 mentions this issue: cmd/go/internal/modcmd: eliminate a call to modload.LoadedModules
Change https://golang.org/cl/270980 mentions this issue: cmd/go/internal/modload: remove SetBuildList
Change https://golang.org/cl/271419 mentions this issue: cmd/go/internal/work: avoid modload.Selected in 'go install pkg@version'
Change https://golang.org/cl/265777 mentions this issue: cmd/go/internal/modload: implement lazy loading
Change https://golang.org/cl/271646 mentions this issue: cmd/go/internal/work: remove a redundant call to modload.LoadedModules
Change https://golang.org/cl/271647 mentions this issue: cmd/go/internal/modload: eliminate LoadedModules
Change https://golang.org/cl/272128 mentions this issue: cmd/go/internal/modload: remove a stale comment for EditBuildList
Change https://golang.org/cl/272129 mentions this issue: cmd/go/internal/modload: remove the Reqs function
Unfortunately, this feature didn't make the cut for Go 1.16. I'm going to try to have it ready to merge as soon as the tree opens for 1.17.
The list of big changes for 1.17 is getting quite large. Would Go team consider to cut one in between (1.16-2 or something), or shorten 1.17 cycle to the point so this feature would go in before generics land on a separate release (1.18)?
Change https://golang.org/cl/275172 mentions this issue: cmd/go/internal/modload: rename constants to reflect that lazy loading is not yet implemented
Change https://golang.org/cl/275296 mentions this issue: cmd/go/internal/modload: split EditBuildList into Require and Select
Change https://golang.org/cl/278596 mentions this issue: cmd/go/internal/modload: delete unused *mvsReqs.next method
This is maybe silly to mention but it seems worth it to me; the proposal says
An invocation of the go command should either load the same version of each package as every other invocation since the last edit to the go.mod file, or should edit the go.mod file in a way that causes the next invocation on any subset of the same packages to use the same versions
But recently the tool changed to only modify go.mod
via go mod
or go get
. I presume that will be maintained here, but it might be worth clarifying the doc?
I am sorry if this is a dumb question, but does this change have any effect on binary size? In other worlds, would it also avoid compiling in unused packages from imported modules (parts of code not used in my binary)?
Will this change also affect (reduce) the content of go.sum
? Test dependencies of dependencies are polluting go.sum
(especially when test dependencies of test dependencies are producing dependency spirals; example).
Edit: the answer that the spiral case of test dependencies will be fixed by this change according to this comment by @myitcv.
@Jacalz, we do not expect this change to have any effect on binary size. (Lazy loading will generally reduce the set of versions recorded in the go.sum
file, but not the set of packages โ or modules containing packages โ used to compile the resulting binary.)
Change https://golang.org/cl/287632 mentions this issue: cmd/go/internal/mvs: don't emit duplicates from Req
Change https://golang.org/cl/287633 mentions this issue: cmd/go/internal/mvs: fix Downgrade to match Algorithm 4
Change https://golang.org/cl/287732 mentions this issue: cmd/go/internal/mvs: clarify and annotate test cases
Change https://golang.org/cl/288535 mentions this issue: cmd/go: add script tests for potential upgrades due to downgrades
Change https://golang.org/cl/289696 mentions this issue: cmd/go: add a script test for artifacts resulting from 'go get -u'
Change https://golang.org/cl/289697 mentions this issue: cmd/go/internal/modload: split resolveCandidates into two methods
Change https://golang.org/cl/290769 mentions this issue: cmd/go/internal/modload: add an unused copy of mvs.Downgrade
Change https://golang.org/cl/290770 mentions this issue: cmd/go/internal/modload: fuse upgrading with downgrading in EditBuildList
Change https://golang.org/cl/290771 mentions this issue: cmd/go/internal/mvs: split Reqs into narrower per-function interfaces
Change https://golang.org/cl/293689 mentions this issue: cmd/go/internal/modload: replace the global buildList with structured requirements
Change https://golang.org/cl/294292 mentions this issue: cmd/go/internal/mvs: add test cases for downgrade interaction with hidden versions
Change https://golang.org/cl/294293 mentions this issue: cmd/go: add a script test corresponding to the downhiddenartifact MVS test
Change https://golang.org/cl/294294 mentions this issue: cmd/go/internal/mvs: prune spurious dependencies in Downgrade
This issue is currently labeled as early-in-cycle for Go 1.17.
That time is now, so a friendly reminder to look at it again.
Change https://golang.org/cl/296609 mentions this issue: cmd/go/internal/modload: make EditBuildList report whether the build list was changed
Change https://golang.org/cl/298350 mentions this issue: cmd/go/internal/modload: refactor checkSums to clarify trimming logic
Change https://golang.org/cl/298651 mentions this issue: cmd/go: reject 'go list -m MOD@patch' when no existing version of MOD is required
Change https://golang.org/cl/300158 mentions this issue: cmd/go: test that 'go mod tidy' retains upgraded indirect dependencies
Change https://golang.org/cl/301371 mentions this issue: cmd/go/internal/modload: in readonly mode, do not read go.mod files missing checksums
Change https://golang.org/cl/301370 mentions this issue: cmd/go/internal/modcmd: in 'go mod tidy', suspend go.mod writes until tidy
Change https://golang.org/cl/303431 mentions this issue: cmd/go/internal/modload: remove go116EnableNarrowAll constant
Change https://golang.org/cl/303869 mentions this issue: cmd/go: attribute direct imports from indirect dependencies to the importing package
Change https://golang.org/cl/304909 mentions this issue: cmd/go/internal/modload: factor out a method to update loader requirements
Change https://golang.org/cl/305009 mentions this issue: cmd/go/internal/modload: track conflicts in versionLimiter
Change https://golang.org/cl/308509 mentions this issue: cmd/go: assume Go 1.16 semantics uniformly for unversioned modules
Change https://golang.org/cl/308516 mentions this issue: cmd/go/internal/modload: avoid loading the full module graph for imports satisfied by lazy roots
Change https://golang.org/cl/308515 mentions this issue: cmd/go/internal/modload: add a dormant depth type
Change https://golang.org/cl/308517 mentions this issue: cmd/go/internal/modload: migrate editBuildList to use a structured requirement graph
Change https://golang.org/cl/308809 mentions this issue: cmd/go: in TestScript, set GOTRACEBACK and use SIGQUIT to terminate hung subprocesses
Change https://golang.org/cl/308810 mentions this issue: cmd/go/internal/modload: actually set the depth field passed to newRequirements