cmd/go/internal/get: VCS path regexp omits characters accepted by CheckImportPath
stub42 opened this issue · 15 comments
What version of Go are you using (go version)?
$ go version go version go1.12.2 linux/amd64
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
go env Output
$ go env GOARCH="amd64" GOBIN="" GOCACHE="/home/stub/.cache/go-build" GOEXE="" GOFLAGS="" GOHOSTARCH="amd64" GOHOSTOS="linux" GOOS="linux" GOPATH="/home/stub/go" GOPROXY="" GORACE="" GOROOT="/snap/go/3540" GOTMPDIR="" GOTOOLDIR="/snap/go/3540/pkg/tool/linux_amd64" GCCGO="gccgo" CC="gcc" CXX="g++" CGO_ENABLED="1" GOMOD="" CGO_CFLAGS="-g -O2" CGO_CPPFLAGS="" CGO_CXXFLAGS="-g -O2" CGO_FFLAGS="-g -O2" CGO_LDFLAGS="-g -O2" PKG_CONFIG="pkg-config" GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build922868437=/tmp/go-build -gno-record-gcc-switches"
What did you do?
Attempted to fetch the git repo containing a go package, publicly available at:
- git://git.launchpad.net/~stub/+git/go-eggs
- git+ssh://git.launchpad.net/~stub/+git/go-eggs
- https://git.launchpad.net/~stub/+git/go-eggs
Per go help importpath, the following should attempt a git download using https://, falling back to git+ssh://
go get -v git.launchpad.net/~stub/+git/go-eggs.git
Attempting to force ssh per git help environment does not help
GIT_ALLOW_PROTOCOL=ssh go get -v git.launchpad.net/~stub/+git/go-eggs.git
What did you expect to see?
go get attempt to clone the git repo at https://git.launchpad.net/~stub/+git/go-eggs, and if that failed, fall back to cloning the git repo at git+ssh://git.launchpad.net/~stub/+git/go-eggs.
Per go help importpath, because the import path has a version control qualifier (.git) then no attempt should be made to attempt to get the import over https: to look for a tag.
What did you see instead?
Go get attempts to parse meta tags from https://, which fails. No attempt is made to clone the git repo.
$ go get -v git.launchpad.net/~stub/+git/go-eggs.git
Fetching https://git.launchpad.net/~stub/+git/go-eggs.git?go-get=1
Parsing meta tags from https://git.launchpad.net/~stub/+git/go-eggs.git?go-get=1 (status code 200)
package git.launchpad.net/~stub/+git/go-eggs.git: unrecognized import path "git.launchpad.net/~stub/+git/go-eggs.git" (parse https://git.launchpad.net/~stub/+git/go-eggs.git?go-get=1: no go-import meta tags ())
The regular expression that looks for the .git suffix is here:
go/src/cmd/go/internal/get/vcs.go
Line 1028 in ccd9d9d
Testing against that (playground) reveals that the problem here is that the + character fails to match the regular expression, which is more restrictive about + and ~ than our validation for import paths in general:
go/src/cmd/go/internal/get/path.go
Lines 125 to 136 in 3cb92fc
This needs a fix, but it's not obvious to me whether the VCS path should be made more permissive, or CheckImportPath should be made more restrictive. I'll follow up with @rsc and @jayconrod.
I'm confused by whether the +git is essential here. The URL already starts and ends with git. Must it also have git in the middle? Somehow we have gotten this far without allowing + in go get paths, and I want to understand why.
When Launchpad code hosting added git support it was necessary to add +git as a path component to avoid namespace clashes with Bazaar branches. Launchpad code hosting is pretty much stuck with it, but yes, + will be very uncommon, unlike other special characters such as ~. The trailing .git is optional, and only in my example because I wanted to skip the https: VCS detection (probably necessary for private repositories, when I get that far).
I checked the new module index for other existing packages that use the + character, but haven't been able to locate any so far.
(That's not to say that they don't exist, but if they do they're rare.)
Perhaps we should disallow + from import paths entirely?
If this is not a breakage introduced after Go 1.12, I'm inclined to leave deciding what to do until after Go 1.13 is out.
@bcmills This issue is just rolling forward through release milestones. Should we just move it to the Backlog milestone? That's what Backlog is for. Thanks.
Unicode characters is #29101. Assuming we exclude that from this discussion, this discussion is only about whether to start using + in paths.
Technically the looser module restrictions mean someone could set up a non-VCS-backed module with + today, but all the VCS-backed stuff can't use it. There's nothing with + in the module index (index.golang.org).
It seems like we should reject + and wait for complaints.
There are no uses we can find.
It doesn't seem to be used anywhere.
If we drop it from pathOK on day 1 of Go 1.16, we will have time to put it back.
The original complaint is that Launchpad git hosting URLs such as https://git.launchpad.net/~stub/+git/go-eggs do not work with 'go get', and now Go modules. It is not used anywhere, because it doesn't work (and never has).
This issue is currently labeled as early-in-cycle for Go 1.16.
That time is now, so this is a friendly ping so the issue is looked at again.
Change https://golang.org/cl/250919 mentions this issue: module: reject "+" in CheckPath and CheckImportPath
This has affected me, but it's not a significant issue. I have a module containing a main package with the name etld+1, which is a tool for processing domain names. I can rename it to something less helpful, but it would be nice if I didn't have to.
Change https://golang.org/cl/282512 mentions this issue: content/static/doc: remove "+" from allowed module path characters