pote/gpm

Git Updating: Branches vs. Tags/Commits

technosophos opened this issue · 10 comments

I've been struggling with using GPM for trees that I want to keep updated to the latest commit on a particular branch.

Use case:

I want a repository to stay on the latest commit on master, and each time I run gpm install I want it to update to that latest commit.

If my Godeps file looks like this:

github.com/Masterminds/cookoo

Then master is checked out initially, and the tree is updated each time gpm install is run, but the repo is always pointed to whatever commit I got when I initially ran gpm install.

Same thing happens if I do this:

github.com/Masterminds/cookoo

The relevant line in GPM is this:

https://github.com/pote/gpm/blob/master/bin/gpm#L63

I had the same problem a while ago with gpm-git, and ended up doing this:

      # Figure out, based on ref, what type of reference this is.
      local vtype="commit"
      git show-ref -q "origin/$version" && vtype="branch"
      git show-ref -q "tags/$version" && vtype="tag"

      echo ">> Setting $package to $vtype $version"
      cd "$install_path"
      [ -d .git ] && git checkout -q "$version"

      # Handle case where branch changed. We need to get to the tip
      # of that branch, so fast-forward to the remote-tracking ref.
      [ "$vtype" = "branch" ] && git merge --ff-only "origin/$version"

It's far more verbose, but it seems to do the trick.

If I have the time this week, I will work up a patch and submit a pull request. Feel free, of course, to find a better way of doing it (or to simply say that this is out of scope for GPM).

As always, thanks for a great tool.

pote commented

Mmm, I'm curious: what happens if instead of having simply the import path with no version/tag/branch in Godeps you specify master as the version?

github.com/Masterminds/cookoo master

I remember thinking this over at some point: if we omit the branch name then yes, go get -u -d will update the repository, but git checkout will run without arguments. If we have master in there, though, it should update the reference to the latest in that branch.

In summary: my thinking was that Explicit is better than implicit so it makes sense to explicitly specify the master branch in the Godeps file. Does doing it this way solve your use case? If not we'll have to find a way to do it.

> As always, thanks for a great tool.

As always: thanks for your kind words and all your help in gpm :). ❤️

'master' seems to still have the same result. My second example in the last post was supposed to look like this:

github.com/Masterminds/cookoo master

But it looks like I pasted the same thing twice. Sorry.

AFAIK, calling git checkout master will not fast-forward to the tip of the branch. You have to do a merge --ff-only to get that behavior. (At least, that's what my experiments suggest.)
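The behavior described above can be reproduced in a throwaway repository. This is only a minimal sketch, not gpm's code; the directory names `upstream` and `dep`, the `commit` helper, and the `demo` identity are made up for illustration.

```shell
#!/bin/sh
# Sketch: `git checkout master` leaves a stale local branch where it is, while
# `git merge --ff-only origin/master` moves it to the fetched tip.
work=$(mktemp -d) && cd "$work" || exit 1

# Hypothetical committer identity so the demo runs without global git config.
commit() {
  git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "$1"
}

# An "upstream" repository with one commit on master.
git init -q upstream
(cd upstream && git symbolic-ref HEAD refs/heads/master && commit "first")

# The dependency checkout, cloned before upstream moves on.
git clone -q upstream dep
(cd upstream && commit "second")

cd dep || exit 1
git fetch -q origin
git checkout -q master
before=$(git rev-parse HEAD)        # local master after checkout alone
tip=$(git rev-parse origin/master)  # where upstream's master actually is

git merge -q --ff-only origin/master
after=$(git rev-parse HEAD)

[ "$before" != "$tip" ] && echo "checkout alone: still behind origin/master"
[ "$after" = "$tip" ] && echo "after merge --ff-only: at origin/master"
```

The fetch updates only the remote-tracking ref `origin/master`; the local `master` branch does not move until something merges or resets it.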

Can confirm that the issue is real. Having an everlasting master can have unwanted side effects.
My thought on this is to write a gpm-update command that goes through the dependencies and updates each sha to the latest master.

I've been thinking about this and looking at some other tools. It seems that about half use "install" to install or update, while others use "update" or "upgrade" to do a local update. Homebrew is a good example -- it has an install, upgrade and update (the last of which has nothing to do with individual packages).

The other thing I just don't know at all is whether this has the same impact on bzr and hg. I don't really use either of those. If this is just a git problem, we should probably fix it in the install command. But if all of them have it, maybe @elcuervo's idea is the right one.

pote commented

Mmm, I'm conflicted about this.

Historically, the ability to specify branches in the Godeps file was more of a happy accident than an intended feature. This is because git checkout accepts everything from tags and revision hashes (which is what I wanted to use in Godeps) to branches. If you go through the README you'll notice that the possibility of putting branches in there is not even mentioned; that was done on purpose.

The reasoning is: while it's perfectly possible to have branch names as versions for packages, this means that builds stop being fully reproducible at any given point in time, because User A and User B can run the exact same Godeps file at different points in time and get different results, which defeats the purpose of the tool. Sticking to that mode of using it, we'd have to add a Godeps.lock file that specifies the actual versions used in a project, like bundler does. But considering that we can achieve the same results by sticking to shas/tags, I believe that that way lies madness.

I'm unsure how to proceed based on that line of thinking. I guess the real question is: do we want to support (and therefore condone) the use of branches in Godeps? I lean towards a "no" myself, but I'm open to keeping the discussion going. If the answer ends up being "yes", then we'll have to think of a way to solve this use case for all VCSs.

Regarding a possible gpm-update tool that gets all the latest versions for packages and updates the Godeps file: I'm +1 for that as a plugin, it'd probably be very similar to what gpm-bootstrap does.
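A rough idea of what such a plugin might do, for git only: read each Godeps line and print it again pinned to whatever commit is currently checked out. This is a sketch under assumptions, not an existing gpm plugin; the function name `pin_godeps` and the `<source root>/<import path>` layout it expects are hypothetical.

```shell
#!/bin/sh
# Sketch of a hypothetical gpm-update-style helper (git only).
# Reads Godeps-style lines ("<import path> [version]") on stdin and prints
# each package pinned to the commit checked out under <src_root>/<import path>.
pin_godeps() {
  src_root=$1
  while read -r package _version; do
    # Skip blank lines and comment lines.
    case $package in
      '' | '#'*) continue ;;
    esac
    # Resolve the checked-out commit; stop if the package is not a git repo.
    sha=$(git -C "$src_root/$package" rev-parse HEAD) || return 1
    echo "$package $sha"
  done
}
```

Under those assumptions it could be run as `pin_godeps "$GOPATH/src" < Godeps > Godeps.new`, leaving the original file untouched so the result can be reviewed before replacing it.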

I agree with your reasoning about stability. If the concern of the developer is a 100% reproducible build every time, commit hashes or tags are clearly the best policy.

But I think branching adds a level of flexibility that most tools don't have. And I like this flexibility.

For some projects (particularly external dependencies), we often pin packages to specific commits. But there are others where it is more important to us to stay caught up to the latest stable (e.g. whatever is on master) than to stick to a particular version. In some version management tools, this would be equivalent to >2.0.0 or something like that. Branch support gives us a very convenient way to specify fuzzy versions.

Sometimes we even use branches as references during development, and then switch to commit SHAs when we're ready to release. This ensures that we're always working on the latest version up until we're ready to roll, then we can rely on consistent versioning in our build and deployment system.

In all, I'd be willing to go with either way. But I do use (and enjoy) the branch feature.

pote commented

Yeah, that's the thing: the use case is a valid one, so I'd love to support it.

I'm thinking that the proposed extension to the Godeps file format in the last comment in this thread might help, though, and would mean we don't need to support all VCSs in this change. We could specify specially tracked dependencies in a more semantic way, like

#[gpm-track] <import path> <branch>

And have a plugin (in this case gpm-track, but whatever) specifically update branch-tracked dependencies. This has a few upsides:

  • Makes the tracking more explicit which is good as people should be aware of these unstable dependencies.
  • Keeps branch support out of core, this is also good as the main goal for gpm is reproducible builds.
  • Allows us to solve the tracking use case without adding code to core or having to worry about bazaar, etc.

It has a downside: it would require another command to update tracked dependencies, but (maybe I'm being too optimistic here) I don't think that's necessarily bad as it increases awareness of the unstable dependencies in whoever is using it.

Does that make sense?

pote commented

Forgot to mention: I think @elcuervo was working on a plugin that might do what we are proposing, which is relevant to the discussion.

Wow, lots of things to digest here. Interesting discussion :)

First of all: I'd rather force explicitness, that is, always requiring a SHA1/tag in Godeps. But not branches (see below.)

Second: don't use git merge --no-ff <sha/tag> nor even git rebase <sha/tag>, use git reset --hard <sha/tag> instead. The reason is that if by any chance you update the dependency repository locally, then your changes will become part of the dependency history, but only on your machine. By using reset --hard you are sure you will always be using the version as it should be according to the repository history.
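The effect of reset --hard on a locally modified dependency can be seen in a scratch repository. This is a minimal sketch; the `dep` directory, the `v1.0.0` tag, the `commit` helper, and the `demo` identity are invented for the demonstration.

```shell
#!/bin/sh
# Sketch: a local commit in a dependency checkout is discarded by
# `git reset --hard <tag>`, so the tree matches the pinned version again.
work=$(mktemp -d) && cd "$work" || exit 1

# Hypothetical committer identity so the demo runs without global git config.
commit() {
  git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "$1"
}

git init -q dep && cd dep || exit 1
commit "upstream release"
git tag v1.0.0
pinned=$(git rev-parse v1.0.0)

# Simulate an accidental local change to the dependency.
commit "local tinkering"
drifted=$(git rev-parse HEAD)

# Move the checkout back to exactly the pinned version.
git reset -q --hard v1.0.0
restored=$(git rev-parse HEAD)

[ "$drifted" != "$pinned" ] && echo "local commit moved HEAD away from v1.0.0"
[ "$restored" = "$pinned" ] && echo "reset --hard: HEAD is v1.0.0 again"
```

Note that reset --hard also throws away uncommitted changes in the working tree, which is the point here but worth keeping in mind.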

Third: the branch feature. It's an interesting use case, and in fact it looks really useful, particularly during development. However, I don't think it belongs in gpm's core, but rather in an extension like gpm-update, as was suggested. My reasoning is that gpm install does only that: it installs whatever is specified in Godeps, and that's why I think enforcing a SHA1/tag is a plus. But branches... I don't know, I'm not sold. Too many potential issues (two users with different versions, having to keep track of the latest HEAD, and all the others you already mentioned) that will only add complexity to this simple tool.

pote commented

After mulling this over for some time, I've decided not to support the branch use case in core. This goes hand-in-hand with setting a convention-based Godeps file extensibility strategy, as discussed in #42; that strategy could help write a plugin to keep certain packages on their bleeding-edge versions.

Until that plugin exists, though, it should be pretty simple to work around it like so:

$ go get -u ./... && gpm install

This should update all packages to their latest versions and then run gpm to set the desired packages to their respective stable versions. If we omit the packages we want to keep changing (or comment them out with a #[gpm-track] or whatever directive), then we should be good to go. The plugin for this could be as simple as that, too.
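As a starting point for such a plugin, the tracked entries could be pulled out of the Godeps file with a few lines of shell. This is only a sketch: the `#[gpm-track] <import path> <branch>` spelling is taken from the proposal in this thread and is not a settled format, and the function name `tracked_packages` and the fallback to master are assumptions.

```shell
#!/bin/sh
# Sketch: list branch-tracked dependencies from Godeps-style input on stdin.
# Assumes tracked entries look like "#[gpm-track] <import path> <branch>".
tracked_packages() {
  while read -r marker package branch; do
    # Only lines that start with the (assumed) directive are of interest.
    [ "$marker" = '#[gpm-track]' ] || continue
    [ -n "$package" ] || continue
    # Assumption: fall back to master when no branch is given.
    echo "$package ${branch:-master}"
  done
}
```

A plugin could then loop over that output and update each listed package, while gpm install keeps ignoring those lines because they are comments.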

Thanks everyone for the input and for the interesting discussion! ❤️