cmd/go: support easy way to install a remote binary while respecting remote 'replace' and 'exclude' directives
thepudds opened this issue · 11 comments
What version of Go are you using (go version
)?
$ go version go version go1.12.1 linux/amd64
Does this issue reproduce with the latest release?
Yes, including tip.
What did you do?
cd /tmp
go get foo
where:
foo
is a module withreplace
directives chosen by the module author, andfoo
is a binary command, and- I want
foo
to build as specified by the module author
What did you expect to see?
The replace
directives in the go.mod
for foo
respected in my resulting foo
binary.
What did you see instead?
The replace
directives in the go.mod
for foo
are currently ignored.
Summary
The suggestion in this issue is to provide some easy way to install a remote binary while respecting replace
and exclude
directives in the remote module.
This seems necessary based on observed usage, as well as this seems part of the promise from the proposal and elsewhere that "a module author is in complete control of that module's build when it is the main program being built". That promise of control by the module author seems especially important when the author is publishing a binary for use by others.
Background
As far I as I was able to follow the design discussions:
- By design, modules are a less expressive system than a more traditional approach, and
- It seems the complete control given to the top-level build via
replace
andexclude
directives was important aspect of balancing out that reduction in expressiveness.
For example, the "Build Control" section in the official proposal has a discussion about this, including (with emphasis added):
Minimal version selection gives the top-level module in the build additional control, allowing it to exclude specific module versions or replace others with different code, but those exclusions and replacements only apply when found in the top-level module, not when the module is a dependency in a larger build.
A module author is therefore in complete control of that module's build when it is the main program being built, but not in complete control of other users' builds that depend on the module. I believe this distinction will make this proposal scale to much larger, more distributed code bases than the Bundler/Cargo/Dep approach.
There are similar sentiments expressed in the initial vgo blog series, such as:
Minimal version selection is very simple. It achieves simplicity by
eliminating all flexibility about what the answer must be: the build
list is exactly the versions specified in the requirements. A real system
needs more flexibility, for example the ability to exclude certain module
versions or replace others.
Those arguments seem reasonably compelling. It seems, however, the current Go modules system does not quite deliver on that promise when, for example:
- The author of module
foo
decided to use areplace
orexclude
, and - The module
foo
is "the main program being built" via ago get foo
executed outside another module.
In that scenario, the current modules system ignores the remote replace
or exclude
in foo
. In other words, that scenario seems to illustrate not delivering on the promise of "A module author is therefore in complete control of that module's build when it is the main program being built".
"Don't use replace" as an alternative solution
When the concern in this issue has been raised in the past, sometimes the response has been something like "People shouldn't really use replace
when releasing a module". However, I think that falls short as a solution.
@rogpeppe has stated for example that juju
cannot currently be built without replace
directives.
In general, one could imagine that the need for replace
directives going down over time as the ecosystem adapts modules and semver more faithfully, but it is hard to imagine the need for need for replace
directives going so low that the need for replace
could be approximated by zero. For example:
- Use of semver is never perfect, especially in the face of human error.
- Changes can break a consumer without changing the statically checkable API.
v0
is a "compatibility free zone", yet people in practice and by necessity still usev0
dependencies. MVS can deliver incompatible results in the face of multiplerequire
directives for av0
dependency.- Etc.
I think the on-going need for replace
is especially true given the purposeful reduction in expressivity elsewhere in the modules system.
"git clone" as an alternative solution
In discussions on this topic, a response is sometimes made along the lines of "If an author of a binary needs to use replace
directives, they can always just update their readme to ask users to not do go get
or go install
and instead do a git clone
followed by go install
". A readme in theory could also include git clone --branch
. However, I think a git clone
solution falls short given:
- The future benefits of GOPROXY mirrors and notaries, etc.
- The benefits of the
go
command automatically picking a good semver tag for you, with a default of@latest
(e.g., the semver-aware logic described in "Module aware go get"). - The need to update the readme over time (e.g., changing the recommended semver tag if the readme supplies a specific recommended semver tag to use with
git
, or changing the readme if it normally specifies a more standardgo get
variation but then is only temporarily updated to specify usinggit clone
in order to respect areplace
directive for a few months while waiting for resolution of an upstream problem, etc.). go get some/cool/cmd
has proven to be popular within the Go community, including as a concrete "gateway" to Go for people who are not yet developing Go themselves.go get
nicely hides most VCS differences.
Hugo is an example of an early modules adopter that currently has a replace
directive in its go.mod
. It has the following installation instructions:
Since Hugo 0.48, Hugo uses the Go Modules support built into Go 1.11 to build. The easiest is to clone Hugo in a directory outside of GOPATH, as in the following example:
mkdir $HOME/src cd $HOME/src git clone https://github.com/gohugoio/hugo.git cd hugo go install
That might be the right choice for Hugo based on its needs and the current state of modules in Go 1.12, but those instructions for example ignore the benefit of semver tags.
Personally, I would view it as a step backwards if something along those lines became the recommended solution if you have a replace
directive.
Other possible solutions
There are likely many possible solutions, but here are three sample solutions to help start conversation here.
Under all three of these options, it could be reasonable reject any filesystem-based replace directives in a remote go.mod that reach outside the module (as suggested by Bryan in #24250 (comment)). To help avoid placing a testing burden on authors to check for that, go release
could warn if that condition is found, which is likely a good thing to do anyway as suggested in the go release
issue in #26420 (comment).
Option 1
If #30515 is adopted with a new -b
flag (for binary or bare), and go get -b foo
ends up meaning "install foo while ignoring any current module's go.mod
and without altering the current module's go.mod
", it could be natural for go get -b foo
to also respect any replace
or exclude
directives in foo
. The rationale could be that there is no other top-level module being considered aside from foo
when -b
is specified.
The same could apply if a different spelling than -b
is selected for #30515 (e.g., perhaps #30515 is resolved via a go get -global
, go get -clone
, go install
, etc.).
Option 2
If #30515 is not adopted, then go get foo
when run outside of a module could be redefined in 1.13 to respect any replace
or exclude
directives in foo
. The rationale could be that there is no other top-level module being considered aside from foo
when doing go get foo
outside of a module.
This behavior was suggested by multiple people for Go 1.12 (including by Bryan in #24250 (comment)), but the ultimate Go 1.12 behavior was different, including Bryan commented in CL 148517 that his 1.12 change was "as minimal a change as I could comfortably make to enable 'go get' outside of a module for 1.12". (Part of the context for Bryan's comment in the CL is I think that minimal change might have been implemented after the 1.12 freeze).
Option 3
The behavior in Option 1 and Option 2 could both be the 1.13 behavior. In other words:
go get foo
when run outside of a module in 1.13 would now respect anyreplace
orexclude
directives infoo
(in addition to respecting the remote module'srequire
directives and not changing any localgo.mod
).go get -b foo
in general means "run as if in a directory outside any module". (Under this definition, it would implyreplace
orexclude
directives in a remotefoo
are respected).
Relationship to other issues
This has been discussed in multiple issue such as #27643, #24250, #30515 and others, but usually as a side topic. Perhaps this issue here can be closed as a duplicate of #30515, but @mvdan, the author of #30515, asked that this aspect be discussed in a different issue than #30515, so hence this issue is being opened now.
I'm still not convinced that long-term use of replace
— other than by developers during development of their own modules — is actually something we should encourage.
(Note that we do support it, in the sense that it remains possible to check out the repository and run go build
within its working tree, but dedicating a go
subcommand or flag to it would significantly change the perception around when and how replace
directives should be used.)
In particular, the compatibility argument for using replace
seems off too me: if someone is building your module as the only thing passed to go get
, then they get exactly the module versions that you've selected — so even if there are breaking changes after that point, why are they relevant?
So I would like some more detail: @rogpeppe, why does juju
require replace
directives today, and what is preventing you from migrating away from them over the long term?
So I would like some more detail: @rogpeppe, why does juju require replace directives today, and what is preventing you from migrating away from them over the long term?
OK, here's some detail on the replacements used by Juju currently. I speak at some remove, because I haven't been directly involved in the Juju code base for a little while now, but I have been historically familiar with some of the issues.
Summary
It might be possible to move away from replace statements in the long term, but it's not necessarily possible in practice without hard-forking multiple external repositories, which has its own on-going cost. It's certainly not going to happen in the short to medium term, which means that replace statements are going to be a necessity for building juju for a while to come.
And even the need for these particular replace statements is dropped eventually, another similar issue might well arise in the meantime. The open source world is not always clean. Some kind of replace statement is a crucial band-aid at times, and one which anyone installing the juju commands, not just developers, will need to apply.
github.com/altoros/gosigma => github.com/juju/gosigma
Fixes races in tests and code. https://github.com/Altoros/gosigma/pull/1/files.
No new or changed API.
There's been a PR open since 2017-05-23 without response.
Making a hard fork of this probably wouldn't be that hard to do, although might
be politically awkward, as github.om/altoros/gosigma
is the "official" API client.
gopkg.in/yaml.v2 => github.com/juju/yaml
A tweak to the accepted floating point syntax. No new API. A small but
significant tweak - the two repositories disagree on whether the a
member in:
a: 123456e1
holds a floating point number or a string. Changing this behaviour might
break backward compatibility with existing YAML files that are held in
immutable databases online.
Changing this unilaterally inside juju could break other code. For
example, code might return a yaml.MapSlice
which is intended to be
marshaled as an order-preserving map, but would instead be marshaled as
an array because its type is not recognized.
Changing this across the entire code base would be hard, as many external
repositories have dependencies on this package.
gopkg.in/mgo.v2 => github.com/juju/mgo
19 commits. Some new API.
The parent repository is no longer maintained, and was not generally
accepting of patches even when it was. The main fixes here are to the
mgo/txn code, which is heavily used by Juju and has been the source of
some major operational issues. The patches address some of these issues,
amongst other things.
Changing this dependency to use github.com/juju/mgo throughout
would be a major piece of work. It is a dependency of many external
repositories which don't themselves care about the fixes that Juju
needs (because they don't use mgo/txn); changing the import path in
those repositories would require a major version change, and subsequent
cascading changes. Persuading the owners of all these repositories that
they should switch to Juju's fork of mgo for fixes that don't directly
affect them would require significant politics, if it could be achieved
at all.
Hi @bcmills, briefly cherry picking a couple points you made:
In particular, the compatibility argument for using replace seems off too me: if someone is building your module as the only thing passed to
go get
, then they get exactly the module versions that you've selected
...unless of course part of the way you selected the module versions as the binary author is via a replace
or exclude
directive, in which case your consumers don't get the module versions that you've selected if your consumers do a go get
under the current system.
so even if there are breaking changes after that point, why are they relevant?
That is not my concern -- modules provide 100% reproducible builds, etc.
The compatibility argument is not about future breaking changes causing problems for consumers, and is more about incompatibilities that exist when a binary is being published given for example (a) semver is not always perfectly followed in practice and (b) people use v0 dependencies from the "compatibility free zone". Either of those mean MVS by itself can select problematic versions of foo
's dependencies that the human author of foo
can resolve with replace
prior to publishing foo
. As Russ wrote, "A real system needs more flexibility, for example the ability to exclude certain module versions or replace others". It seems to me incompatibilities are things that arise in a real system in ways that MVS (by design) cannot always resolve by itself.
unless of course part of the way you selected the module versions as the binary author is via a
replace
orexclude
directive
That's a bit of a circular argument, though. 🙂
The central question here, as I see it, is: should replace
directives be a tool for selecting module versions, or only for relocating them? (I can see a clear use-case for relocation: namely, local development of dependencies. The use-case for selection seems less clear-cut, because it partially overlaps with the usual version selection by MVS.)
@rogpeppe, thanks for the detail.
It seems to me that since gosigma
and mgo
are effectively unmaintained, they should probably be hard-forked (if they cannot be transferred to more active maintainers). That makes replace
(and especially #26904) an important migration tool, but a temporary one.
Specifically in relation to mgo
, you note:
changing the import path in those repositories would require a major version change, and subsequent cascading changes.
Could you give some more detail about why changing the import path would require a major version change? (Do packages pass through mgo
types in their exported APIs in a way that cannot be transitioned using type aliases, interfaces, or a mix of the two?)
The issue with yaml
seems very unfortunate, but if the two packages fundamentally disagree on the meaning of certain YAML files, then it seems even more important that dependencies outside your project receive the version of the package they were developed against, in order to consume their own YAML files (if any) correctly.
That is, I suspect you'll want to maintain a fork of the package for compatibility with existing YAML files, but you'll want to use that fork only for the parts of the code that consume existing files.
On the other hand, if you replacement really is benign for your dependencies, then perhaps the yaml
package should have an explicit, supported hook (a runtime function call, or perhaps a build tag) that you could use to configure the mainline package. (Have you discussed that possibility with the go-yaml
maintainers, presumably @niemeyer?)
Could you give some more detail about why changing the import path would require a major version change? (Do packages pass through mgo types in their exported APIs in a way that cannot be transitioned using type aliases, interfaces, or a mix of the two?)
Yes. For example see this package. The forked mgo package can't type-alias the original Collection
type (because then it can't define its own methods on it); neither can the mgorootkeystore
package change the type it's using to an interface.
Changing the mgorootkeystore
package to use a different version of mgo would definitely be an API-breaking change.
I agree that replacements are not desirable and can usually be worked around in time (with negotiation with upstream maintainers, hard forks etc), but the fact remains that they are important at the time for making working products.
Even if the replacement is just a "temporary" fix, it can be an important one, and I consider it important that it's possible to publish a predictably-versioned Go binary with the fix in place until the replacement can be removed. Telling the user to do their own git clone (working out the correct incantation for that is non-obvious with vanity import paths) or similar is not a good solution.
I think Russ's words, aptly quoted above, bear repeating once again:
Minimal version selection is very simple. It achieves simplicity by eliminating all flexibility about what the answer must be: the build list is exactly the versions specified in the requirements. A real system needs more flexibility, for example the ability to exclude certain module versions or replace others.
Please, let's provide that flexibility for all users of Go modules.
BTW I would be perfectly happy if file-path replacements yielded an error in this case. I do consider file-path replacements suitable for developers only (and they're not a fundamental part of VCS, which replacements and exclusions in general are).
This has an interesting interaction with #33848.
#33848 proposes to use the main module's vendor
directory automatically when it exists, in much the same way that we use the main module's replace
and exclude
directives. However, the main module's vendor
directory — much like filesystem targets of replace
directives — may include go.mod
files that cause it to be pruned out from the module's .zip
file.
Also like a replace
directive, the vendor
directory may include local edits and patches, although I would not recommend patching the vendor
directory in that way.
Perhaps #26904 provides a better way forward: one option we're considering there (from among many alternatives) is adding some new keyword to the go.mod
file, which would remap or alias the import paths themselves rather than their source code.
Those paths are necessarily always import paths, not filesystem paths, which obviates the concern about treating filesystem paths and module paths differently.
Unfortunately, applying only a subset of the main module's directives still poses a testing problem: it would still mean that the module could be built — and would need to be tested in — a third configuration that is neither the same as the view from inside that module, nor from an external module.
The more I think about it, though, the more convinced I am that an “install from semi-inside” command should not be go get
with a flag.
go get
currently serves two purposes: it resolves and adds dependencies, and it installs binaries from within those dependencies. A command to install binaries from within one specific module is meaningfully different: it constrains the module relationship of the arguments in a way that go get
in general does not.
You might even imagine that a build in this mode should not resolve missing imports: if the point is to exactly reproduce what the developer would have built, then the developer necessarily needs to record all of the versions that they intended to use.
You might even imagine that a build in this mode should not resolve missing imports: if the point is to exactly reproduce what the developer would have built, then the developer necessarily needs to record all of the versions that they intended to use.
This seems like a good idea to me.