Support git-archive protocol
fenollp opened this issue · 16 comments
git-archive is used to download a zip or tarball of the repo at a specific commit.
- Bitbucket does it
- It is a part of Git
The HTTP API is fine, so why not also support the git-archive API?
EDIT: FYI, here's how to use the HTTP API:
curl --fail --silent --show-error --location \
https://codeload.github.com/<user>/<repo>/tar.gz/<branch or tag but not SHA>
+1 request for this feature. The remote archive protocol is much more flexible, as in this command to retrieve the latest localedata from glibc:
➸ git archive --remote git://sourceware.org/git/glibc.git HEAD \
localedata/locales |tar -tvv
whereas GitHub currently just fails on git archive:
➸ git archive --remote git://github.com/gcc-mirror/gcc HEAD localedata/locales
fatal: The remote end hung up unexpectedly
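For comparison, here is a minimal sketch of the same kind of request against a remote that does serve archives. To stay self-contained it uses a throwaway local repository as the "remote"; the directory and file names are made up for illustration and are not taken from glibc or gcc.

```shell
#!/bin/sh
# Build a throwaway repository to act as the remote (all names are illustrative).
remote=$(mktemp -d)
git init -q "$remote"
mkdir -p "$remote/localedata/locales" "$remote/other"
echo "en" > "$remote/localedata/locales/en_US"
echo "x"  > "$remote/other/file"
git -C "$remote" add .
git -C "$remote" -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# Ask the remote for an archive of one subdirectory only, then list its contents.
listing=$(git archive --remote="$remote" HEAD localedata/locales | tar -t)
echo "$listing"
```

Only the requested subdirectory should appear in the listing, which is the flexibility being asked for here.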
+1000
+1
Does anyone know of an update regarding this issue?
I tried the following without success:
git archive --format=zip --remote git://github.com/<account>/<repo>.git <tag/branch> <file-path> > <file-path>.zip
Getting fatal: The remote end hung up unexpectedly.
- Tried several variants like github.com:<account>, git@github.com and ssh://, but got similar or other errors.
BTW, clone works:
git clone git@github.com:<account>/<repo>.git
+1 Yes can we please get this?
Updated 1st comment (#554 (comment)) with a workaround.
I agree this would be useful, and wonder why it's not already there.
(And for private repos, the git protocol can use the ssh-based deploy key, which is scoped per repository, and can be made for read-only access (for organizations). It looks like the simple https-based credentialling solutions usually end up giving write access to more than just a repo.)
But maybe the following could be a work-around, for some situations:
git clone --depth 1 --branch <some-tag> git@github.com:<account>/<repo>.git
This would give you the code without any history except for the last commit of the snapshot you want. That is, the .git folder would not be so big (almost none of the objects would be there). This has the bonus of still implicitly storing information about the exact "version" of the software you have.
Unfortunately, this solution doesn't seem to work with a commit ID.
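One possible variation for a commit ID, sketched under the assumption that the server accepts fetch requests for bare commit IDs (not every server does): fetch that single commit with depth 1 into an empty repository and check it out. The function name fetch_commit is hypothetical.

```shell
#!/bin/sh
# fetch_commit URL COMMIT DIR
# Fetch exactly one commit, without its history, into a new repository at DIR.
# Assumption: the server allows fetching by bare commit ID.
fetch_commit() {
    url=$1
    commit=$2
    dir=$3
    git init -q "$dir" &&
    git -C "$dir" fetch -q --depth 1 "$url" "$commit" &&
    git -C "$dir" checkout -q FETCH_HEAD
}
```

Usage would look like fetch_commit git@github.com:<account>/<repo>.git <commit-id> <dir>. The result has a detached HEAD at that commit, so the exact version is still recorded.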
Why don't you simply use
https://github.com/<user>/<repo>/archive/<tag-name>.tar.gz
You only have to add tags to your repo and you automatically get an archive download for your repo at that point.
AFAIK git archive would even let you just download a specific subfolder.
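With the tarball route, the whole snapshot still has to be downloaded, but a single subfolder can be unpacked from it afterwards. A rough sketch, assuming the tarball wraps everything in one top-level directory (as repository snapshot tarballs typically do); the function name extract_subdir is hypothetical.

```shell
#!/bin/sh
# extract_subdir TARBALL SUBDIR DEST
# Unpack only SUBDIR from a .tar.gz whose entries share one top-level directory,
# dropping that top-level directory from the extracted paths.
extract_subdir() {
    tarball=$1
    subdir=$2
    dest=$3
    # The first path component of the first entry is taken as the top-level directory.
    top=$(tar -tzf "$tarball" | head -n 1 | cut -d/ -f1)
    mkdir -p "$dest" &&
    tar -xzf "$tarball" -C "$dest" --strip-components=1 "$top/$subdir"
}
```

For example, extract_subdir <tag-name>.tar.gz <subfolder> <dest> would leave <dest>/<subfolder> on disk. This saves disk space but not bandwidth, unlike a server-side git archive of a subtree.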
I don't have time to check this out right now, but I wonder if any of this (under "SECURITY") is relevant. It is referenced in the "--remote=<repo>" section of git archive --help:
> git-upload-archive --help
GIT-UPLOAD-ARCHIVE(1) Git Manual GIT-UPLOAD-ARCHIVE(1)
NAME
git-upload-archive - Send archive back to git-archive
SYNOPSIS
git upload-archive <directory>
DESCRIPTION
Invoked by git archive --remote and sends a generated archive to the other end over the Git protocol.
This command is usually not invoked directly by the end user. The UI for the protocol is on the git archive side, and the program pair is meant to be used to get an
archive from a remote repository.
SECURITY
In order to protect the privacy of objects that have been removed from history but may not yet have been pruned, git-upload-archive avoids serving archives for
commits and trees that are not reachable from the repository’s refs. However, because calculating object reachability is computationally expensive, git-upload-archive
implements a stricter but easier-to-check set of rules:
1. Clients may request a commit or tree that is pointed to directly by a ref. E.g., git archive --remote=origin v1.0.
2. Clients may request a sub-tree within a commit or tree using the ref:path syntax. E.g., git archive --remote=origin v1.0:Documentation.
3. Clients may not use other sha1 expressions, even if the end result is reachable. E.g., neither a relative commit like master^ nor a literal sha1 like abcd1234 is
allowed, even if the result is reachable from the refs.
Note that rule 3 disallows many cases that do not have any privacy implications. These rules are subject to change in future versions of git, and the server accessed
by git archive --remote may or may not follow these exact rules.
If the config option uploadArchive.allowUnreachable is true, these rules are ignored, and clients may use arbitrary sha1 expressions. This is useful if you do not
care about the privacy of unreachable objects, or if your object database is already publicly available for access via non-smart-http.
OPTIONS
<directory>
The repository to get a tar archive from.
GIT
Part of the git(1) suite
Git 2.28.0 2020-07-28 GIT-UPLOAD-ARCHIVE(1)
That is, I wonder if setting uploadArchive.allowUnreachable to true on the client could make it work? Or maybe that is a server setting, which GitHub has set to false?
So maybe try . . .
git config --global --bool --add uploadArchive.allowUnreachable 1
Nope. Not allowed (as the OP probably already knew).
> git archive --remote=git@github.com:copasi/COPASI.git HEAD Tools
Invalid command: 'git-upload-archive 'copasi/COPASI.git''
You appear to be using ssh to clone a git:// URL.
Make sure your core.gitProxy config option and the
GIT_PROXY_COMMAND environment variable are NOT set.
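The SECURITY rules quoted above, and the question of where uploadArchive.allowUnreachable takes effect, can be tried out against a throwaway local repository standing in for the server. This is only a sketch: the repository contents and the helper name try are made up, and the behaviour of a hosted service may differ. The "Invalid command" error above suggests the server never runs git-upload-archive at all, in which case the option does not come into play.

```shell
#!/bin/sh
# Throwaway repository standing in for the server (all names are illustrative).
server=$(mktemp -d)
git init -q "$server"
mkdir "$server/Documentation"
echo "docs" > "$server/Documentation/readme.txt"
git -C "$server" add .
git -C "$server" -c user.name=demo -c user.email=demo@example.com commit -q -m "first"
git -C "$server" tag v1.0
commit=$(git -C "$server" rev-parse HEAD)

# try LABEL TREE-ISH: report whether the serving side agrees to archive TREE-ISH.
try() {
    if git archive --remote="$server" "$2" >/dev/null 2>&1; then
        echo "$1: ok"
    else
        echo "$1: refused"
    fi
}

try "rule 1 (ref)"          v1.0
try "rule 2 (ref:path)"     v1.0:Documentation
try "rule 3 (literal sha1)" "$commit"

# Set the option in the serving repository's own configuration and retry.
git -C "$server" config uploadArchive.allowUnreachable true
try "literal sha1 with allowUnreachable" "$commit"
```

Going by the manual text, the literal sha1 should be refused until the option is set on the serving repository, which would make it a server-side setting rather than something a client can switch on.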
Do I understand correctly that GitHub doesn't provide any way to download a tarball from a private repository using a read-only key? As far as I can see, GitHub's HTTPS doesn't provide read-only keys and GitHub's Git doesn't provide tarballs.
Wouldn’t it be nice if GitHub was open source and the community itself could submit a PR and implement this.
In a way, archive is part of the git protocol and many other providers implement it. Yet it's been stuck on a backlog for over 3 years.
For private repos you can use tarball/zipball links:
curl -L "https://api.github.com/repos/octocat/Hello-World/zipball/master?access_token=$TOKEN" --output hello.zip
issue open in 2016 ?? lol