proposal: x/pkgsite: new source meta tag
jba opened this issue · 12 comments
(Edited: now a proposal.)
This issue describes a new HTML meta tag for referring to Go source files in online documentation. It is an official Go proposal, though it doesn't affect the Go language or tools.
For many years, the go-source meta tag has allowed godoc.org and other source-browsing tools to provide links to Go source for import paths that use the go-import meta tag. With the advent of modules, the go-source meta tag in its current form cannot be used, because it does not support versions. While we could just extend go-source to add a parameter to the templates, we could also take this opportunity to improve it in other ways.
We propose a new tag, go-source-v2, with the following properties:
- Module versions are supported.
- Source files are described relative to modules rather than packages.
- Additional information can be provided, so that module-browsing tools like pkg.go.dev can display repo information and render README and similar files.
- The anomalies listed at the end of the current spec are resolved.
Structure
For certain common code-hosting sites, like GitHub and Bitbucket, no go-source-v2 tag is necessary. See "Implicit Source Information" below for details.
For a module with path M, the tag should appear in the <head> of the page served by GETting https://M?go-get=1. The tag should look like
<meta name="go-source-v2" content="home directory file line-suffix raw">
where:
- home is the URL of the repo root. If _, then the repo root (third component) of the go-import tag on the same page is used; if there is no go-import tag or the tag’s second component is mod, then no repo is specified. This does not preclude serving source files but it does prevent tools from linking to the repo or providing repo-based signals, like number of stars and forks.
- directory is a URL template for linking to a directory of files. It supports two parameters:
  - {revision} is replaced by an identifier for the (approximate) VCS revision. See "Revision Parameters" below for more.
  - {dir} is replaced by the directory relative to the module (not repo) root.
- file is a URL template for linking to an entire file. In addition to {revision} and {dir}, it also supports {file}, the basename of the file.
- line-suffix will be appended to file to obtain a URL for a file at a particular line. It supports only the parameter {line}, the 1-based integer line number.
- raw is a URL template for linking to the raw contents of a file. It supports the {revision}, {dir} and {file} parameters as defined above. While file should display a file for people (with line numbers and syntax highlighting, perhaps), raw should serve the raw bytes of the file. It can be used to rewrite links in README files and the like.
After a tool replaces a template’s parameters, it should remove doubled and trailing slashes. This should make go-source’s {/dir} parameter unnecessary. In theory, a site could serve a path differently depending on whether it had a trailing slash, but we are unaware of any code-hosting site that makes this distinction.
Any component of the tag’s contents can be omitted by using an underscore.
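The substitution and cleanup rule above can be sketched in a few lines of Go. This is only an illustration: the names expand and cleanSlashes are hypothetical, the version v2.2.8 is just a sample value, and the sketch assumes the "://" after the URL scheme must be preserved while other doubled slashes are collapsed.

```go
package main

import (
	"fmt"
	"strings"
)

// expand substitutes each {name} parameter in tmpl with its value and then
// removes doubled and trailing slashes. The name and the map-based
// signature are illustrative, not part of the proposal.
func expand(tmpl string, params map[string]string) string {
	for name, value := range params {
		tmpl = strings.ReplaceAll(tmpl, "{"+name+"}", value)
	}
	return cleanSlashes(tmpl)
}

// cleanSlashes collapses runs of slashes after the scheme and strips
// trailing slashes. The "://" separator is left untouched (an assumption
// of this sketch; the proposal does not spell this out).
func cleanSlashes(u string) string {
	scheme, rest := "", u
	if i := strings.Index(u, "://"); i >= 0 {
		scheme, rest = u[:i+3], u[i+3:]
	}
	for strings.Contains(rest, "//") {
		rest = strings.ReplaceAll(rest, "//", "/")
	}
	return scheme + strings.TrimRight(rest, "/")
}

func main() {
	params := map[string]string{"revision": "v2.2.8", "dir": "", "file": "yaml.go"}
	// An empty {dir} (the module root) would otherwise leave stray slashes.
	fmt.Println(expand("https://github.com/go-yaml/yaml/tree/{revision}/{dir}", params))
	fmt.Println(expand("https://github.com/go-yaml/yaml/blob/{revision}/{dir}/{file}", params))
}
```

Because the cleanup runs after substitution, a template author never needs a separate form for the module root, which is the case go-source's {/dir} parameter existed to handle.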
Here’s an example of the go-import and go-source-v2 meta tags for the gopkg.in/yaml.v2 module:
<meta name="go-import" content="gopkg.in/yaml.v2 git https://gopkg.in/yaml.v2">
<meta name="go-source-v2" content="
github.com/go-yaml/yaml
https://github.com/go-yaml/yaml/tree/{revision}/{dir}
https://github.com/go-yaml/yaml/blob/{revision}/{dir}/{file}
#L{line}
https://github.com/go-yaml/yaml/raw/{revision}/{dir}/{file}
">
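A tool reading this tag might split its content as follows. This is a minimal sketch that assumes the five components are separated by arbitrary whitespace (as the multi-line example suggests) and that an underscore marks an omitted component; the type sourceInfo and the function parseSourceV2 are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// sourceInfo holds the five components of a go-source-v2 tag.
// An empty field means the component was omitted with "_".
type sourceInfo struct {
	Home, Directory, File, LineSuffix, Raw string
}

// parseSourceV2 splits the content attribute on whitespace (an assumption
// of this sketch) and maps "_" to the empty string.
func parseSourceV2(content string) (sourceInfo, error) {
	fields := strings.Fields(content)
	if len(fields) != 5 {
		return sourceInfo{}, fmt.Errorf("go-source-v2: want 5 components, got %d", len(fields))
	}
	for i, f := range fields {
		if f == "_" {
			fields[i] = ""
		}
	}
	return sourceInfo{
		Home:       fields[0],
		Directory:  fields[1],
		File:       fields[2],
		LineSuffix: fields[3],
		Raw:        fields[4],
	}, nil
}

func main() {
	content := `
	  github.com/go-yaml/yaml
	  https://github.com/go-yaml/yaml/tree/{revision}/{dir}
	  https://github.com/go-yaml/yaml/blob/{revision}/{dir}/{file}
	  #L{line}
	  _
	`
	info, err := parseSourceV2(content)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", info)
}
```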
Revision Parameters
Tools should derive {revision} from the module version as follows:
- For pseudo-versions, use the commit hash (the part after the final hyphen).
- For semantic versions, use the version after removing any +incompatible suffix.
- Use other version specifiers (like master) as is.
For a nested module, {revision} is not actually the tag name. A nested module N at version v1.2.3 has tag N/v1.2.3, but {revision} will be v1.2.3. The templates must account for this. For instance, if example.com served directories using GitHub-style URLs, and example.com/mod/nest were a nested module under example.com/mod, then its directory template might be https://example.com/mod/tree/nest/{revision}/nest/{dir}. The first occurrence of nest is part of the tag that identifies the version of example.com/mod/nest.
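The three rules for deriving {revision} could be implemented roughly as below. The regular expression is a simplified approximation of pseudo-version syntax written for this sketch, not the go command's exact grammar, and the function name revision is hypothetical. Note that the result deliberately omits any nested-module prefix, which the template must supply.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pseudoVersionRE approximates pseudo-versions such as
// v0.0.0-20200101120000-abcdef123456 or v1.2.4-0.20200101120000-abcdef123456.
// It is an assumption of this sketch, not an official grammar.
var pseudoVersionRE = regexp.MustCompile(`^v\d+\.\d+\.\d+-(?:.*\.)?\d{14}-[0-9a-f]{12}$`)

// revision maps a module version to the value substituted for {revision}.
func revision(version string) string {
	// Semantic versions: drop any +incompatible suffix.
	v := strings.TrimSuffix(version, "+incompatible")
	// Pseudo-versions: use the commit hash after the final hyphen.
	if pseudoVersionRE.MatchString(v) {
		return v[strings.LastIndex(v, "-")+1:]
	}
	// Everything else (tags, branch names like master) is used as is.
	return v
}

func main() {
	for _, v := range []string{
		"v1.2.3",
		"v2.0.0+incompatible",
		"v0.0.0-20200101120000-abcdef123456",
		"master",
	} {
		fmt.Printf("%s -> %s\n", v, revision(v))
	}
}
```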
Implicit Source Information
If the https://M?go-get=1 page for module M has a go-import meta tag that refers to a repo whose domain matches one of the following glob patterns, then no go-source-v2 tag is needed:
- github.com
- bitbucket.org
- *.googlesource.com
- gitlab.com
- gitlab.* (if the site behaves like gitlab.com)
The templates for these sites are well-known, and are provided below.
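Matching a repo URL against the glob patterns listed above might look like the following sketch. It uses path.Match from the standard library for the globbing; the function name hasImplicitSource is hypothetical. A name match alone cannot confirm that a gitlab.* site actually behaves like gitlab.com, so a real tool would need an additional check for that case.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// implicitHosts lists the domain patterns from the proposal.
var implicitHosts = []string{
	"github.com",
	"bitbucket.org",
	"*.googlesource.com",
	"gitlab.com",
	"gitlab.*", // only valid if the site behaves like gitlab.com
}

// hasImplicitSource reports whether the host of repoURL matches one of the
// patterns, meaning no go-source-v2 tag would be needed.
func hasImplicitSource(repoURL string) (bool, error) {
	u, err := url.Parse(repoURL)
	if err != nil {
		return false, err
	}
	host := u.Hostname()
	for _, pattern := range implicitHosts {
		ok, err := path.Match(pattern, host)
		if err != nil {
			return false, err
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	for _, repo := range []string{
		"https://github.com/go-yaml/yaml",
		"https://go.googlesource.com/tools",
		"https://example.com/some/repo",
	} {
		ok, err := hasImplicitSource(repo)
		fmt.Println(repo, ok, err)
	}
}
```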
There is one problem: for a major version greater than 1, the templates for the “major branch” and “major subdirectory” conventions differ. (See https://research.swtch.com/vgo-module for a discussion of these conventions.) To determine the right template, make a HEAD request for the go.mod file using each template, and select the one that succeeds. For example, for module github.com/a/b/v2 at version v2.3.4, probe both github.com/a/b/blob/v2.3.4/go.mod (the location of the go.mod file using the “major branch” convention) and github.com/a/b/blob/v2.3.4/v2/go.mod (its location using “major subdirectory”).
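The probing step can be sketched as follows for GitHub-style URLs. The names detectConvention and headOK are hypothetical, and the probe is passed in as a function so the logic can be exercised without network access. One design choice here is an assumption rather than something the text specifies: the subdirectory location is probed first, because a repo using the “major subdirectory” convention often also has a go.mod at its root for the v0/v1 module, so both probes could succeed.

```go
package main

import (
	"fmt"
	"net/http"
)

type convention int

const (
	unknown convention = iota
	majorBranch
	majorSubdir
)

// detectConvention probes the two candidate go.mod locations and reports
// which convention the repo appears to use. exists is the probe function.
func detectConvention(repo, version, major string, exists func(url string) bool) convention {
	branchURL := repo + "/blob/" + version + "/go.mod"
	subdirURL := repo + "/blob/" + version + "/" + major + "/go.mod"
	switch {
	case exists(subdirURL): // checked first; see the note above
		return majorSubdir
	case exists(branchURL):
		return majorBranch
	}
	return unknown
}

// headOK is a probe that issues a real HEAD request.
func headOK(url string) bool {
	resp, err := http.Head(url)
	if err != nil {
		return false
	}
	resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

func main() {
	// A fake probe standing in for headOK, so the example runs offline.
	fake := func(url string) bool {
		return url == "https://github.com/a/b/blob/v2.3.4/go.mod"
	}
	c := detectConvention("https://github.com/a/b", "v2.3.4", "v2", fake)
	fmt.Println(c == majorBranch)
	_ = headOK // pass headOK instead of fake to probe a live site
}
```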
Standard Patterns
In these patterns, REPO is the repo URL and MS is the suffix of the module path without the repo prefix. These can be determined from the go-import tag and the path of the go-get=1 URL.
github.com:
- directory: REPO/tree/{revision}/MS/{dir}
- file: REPO/blob/{revision}/MS/{dir}/{file}
- line suffix: #L{line}
- raw: REPO/raw/{revision}/MS/{dir}/{file}
gitlab.com, gitlab.*:
- directory: REPO/tree/{revision}/MS/{dir}
- file: REPO/blob/{revision}/MS/{dir}/{file}
- line suffix: #L{line}
- raw: REPO/raw/{revision}/MS/{dir}/{file}
bitbucket.org:
- directory: REPO/src/{revision}/MS/{dir}
- file: REPO/src/{revision}/MS/{dir}/{file}
- line suffix: #lines-{line}
- raw: REPO/raw/{revision}/MS/{dir}/{file}
*.googlesource.com:
- directory: REPO/+/{revision}/{dir}
- file: REPO/+/{revision}/{dir}/{file}
- line suffix: #{line}
- raw: not supported
Sites that won’t work
Code-hosting sites running Gitea cannot be accommodated by the source linking scheme described above, or indeed by any scheme that has only the information available from the module zip. Gitea source URLs are different for branches, tags and commit hashes, and for the last only the full hash will work. Since revisions should always be tags, the templates for a Gitea site can use the tag form of the source URL. But there is no template that will work with the abbreviated hash at the end of a pseudo-version.
While a source-browsing tool could clone the repo and resolve the abbreviated hash locally, that work should be outside the scope of the tool. Instead, we suggest that a gitea.com contributor add URL routes that can work with partial hashes.
The same problem exists for code.dumpstack.io (which appears to be a rebranded gitea).
Whatever software is used for https://blitiri.com.ar/ has the same issues, and one additional one: there doesn’t seem to be any URL for tags.
How would it affect go get?
You're right that it wouldn't need to affect go get's code per se, since it primarily uses the go-import meta tag to locate the VCS, not go-source.
The base documentation for remote import paths, Go meta tags, and how to obtain them via ?go-get=1 also lives under cmd/go: https://golang.org/cmd/go/#hdr-Remote_import_paths
So I think my original wording wasn't right, but I still think this is very relevant to the folks who work on modules such as @bcmills or @jayconrod. What package to file this under would depend on where this would all be documented, I assume.
I also do think we should make this a proposal, since it seems like a pretty big decision to make without the process :)
Agreed this won't affect the go command.
The revision parameters are tricky, but this already addresses the problems that come to mind (nested modules and pseudo-versions), and it seems like this will work for known major providers.
OK @mvdan, as per https://github.com/golang/proposal#readme I added the Proposal label and edited the initial comment to match.
What package to file this under would depend on where this would all be documented, I assume.
The current convention is documented on the gddo repo's wiki. I don't know if the doc needs to live anywhere more prominent or central than that (with s/gddo/pkgsite/ of course).
Copy of #40477 (comment):
We identified a few important problems with the go-source-v2 idea, most notably:
- The only guarantee for a module is a zip file. There may not even be a publicly browseable source code repository available.
- The extra meta tag is only accessible if you go back to the original redirect page. It's not proxyable like all the other module metadata, nor is it preserved if the origin goes away. (Part of why it's not proxyable is that it's only a godoc.org add-on, not something the Go toolchain has ever defined.)
- The rules around {revision} are very specific to Git repos and bake in details about pseudo-versions that may change in the future.
This makes us pretty confident that go-source-v2 as proposed in the other issue is not the right long-term solution. It's a little bit more module-friendly in that it knows what a version is, but it's not module-friendly enough. More thought is clearly needed.
And even if we added that go-source-v2 support today, we'd need every go get redirector to be updated before any links would start working. That's a lot to ask for a design that we're not even sure is right, especially when a better design might not require any changes at all. The right answer might be to display the code directly from the zip files, or it might be to put some info in the zip file that helps find a source display, or it might be something else entirely. We don't know.
For now, instead of defining a new tag that will require widespread adoption but still not be completely right, it seems best to get the most common sites working by making changes to pkg.go.dev directly, and then revisit the topic when we've had more time to think about the right path forward.
Gitea will support partial commit hashes starting with version 1.14.0, once it is released.
Change https://golang.org/cl/274956 mentions this issue: internal/source: update gitea comment
We changed the "known sites" approach to recognize specific URL schemas in the old version-free go-source tags and automatically adjust them to the versioned equivalents. That approach requires only O(number of hosting softwares) instead of O(number of hosting domains) cases and should scale better.
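As a rough illustration of that idea (not pkgsite's actual code), the sketch below recognizes a GitHub-style directory template from an old go-source tag, where a fixed branch or tag name is followed by {/dir}, and rewrites it to a versioned form. The regular expression, the function name adjustDirTemplate, and the choice of {revision} and {dir} as output parameters are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// githubStyleDir matches old go-source directory templates such as
// https://github.com/user/repo/tree/master{/dir}. It is a hypothetical
// pattern written for this sketch.
var githubStyleDir = regexp.MustCompile(`^(https?://.+)/tree/[^/{]+\{/dir\}$`)

// adjustDirTemplate replaces the fixed branch or tag in a recognized
// template with a {revision} parameter. It reports false if the template
// does not follow the GitHub-style layout.
func adjustDirTemplate(old string) (string, bool) {
	m := githubStyleDir.FindStringSubmatch(old)
	if m == nil {
		return "", false
	}
	return m[1] + "/tree/{revision}/{dir}", true
}

func main() {
	versioned, ok := adjustDirTemplate("https://github.com/user/repo/tree/master{/dir}")
	fmt.Println(versioned, ok)
}
```

Since the match is on URL layout rather than domain, the same rule would apply to any site running software with the same URL structure, which is the scaling benefit described above.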
That change, along with our willingness to add custom patterns to the pkg.go.dev source as necessary to match new sites (see #40477), makes this proposal unnecessary. I retract it.
This proposal has been declined as retracted.
— rsc for the proposal review group