underworldcode/stripy

[REVIEW] Branching and Versioning

Closed this issue · 5 comments

Part of openjournals/joss-reviews#1410

Having a dedicated branch for the submitted paper to JOSS is a good idea because current developments won't interfere with the ongoing review, as you pointed out on the joss-review issue.

Nevertheless I see a few minor issues regarding this approach.
In the first place, the default branch (master in this case) is the business card of your repo.
It's unusual for a user to have to switch branches in order to get a specific version of the software or to find important information about it (the paper, for instance).
Secondly, I noticed two issues regarding your versioning:

  1. The v0.7.0b release points to a35c04e on the master branch, which does not
    agree with the paper branch that you asked to be reviewed.
    Besides, releases are usually generated from commits on the default branch
    (master). A release generated from a non-default branch is not
    something a user would expect.
  2. The submitted 0.7.0 release does not exist in the repo. A v0.7.0b does, but
    I understand that the submitted version must be a valid GitHub release.

I can think of an easy solution to these issues. I don't want to interfere with your git
workflow, so take these instructions just as proposals.

  1. Create a dev branch from the current master, where you can continue your
    development if you don't want to interfere with the review.
  2. Merge the paper branch into master.
  3. Create a new release (v0.7.1 for example) with the latest commit on the new
    master branch.
  4. Then, you could ask @VivianePons to change the version of the submission in order to be
    in agreement with the one I'll be reviewing.
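Steps 1 to 3 above can be sketched as a short Python helper that only builds the git commands and prints them, without running anything. The branch names `paper` and `dev`, the remote name `origin`, and the function name are assumptions for illustration; the GitHub release itself would still be created from the pushed tag through the web interface.

```python
import shlex


def plan_release_commands(paper_branch="paper", dev_branch="dev",
                          default_branch="master", tag="v0.7.1"):
    """Return the git commands for steps 1-3 as argument lists.

    Nothing is executed here; all branch and tag names are hypothetical
    defaults and should be adapted to the real repository.
    """
    return [
        # Start from the default branch.
        ["git", "checkout", default_branch],
        # Step 1: create the dev branch from the current master.
        ["git", "branch", dev_branch],
        # Step 2: merge the paper branch into master.
        ["git", "merge", paper_branch],
        # Step 3: tag the latest commit on the new master.
        ["git", "tag", "-a", tag, "-m", f"Release {tag}"],
        # Publish master, the dev branch and the tag (remote name assumed).
        ["git", "push", "origin", default_branch, dev_branch, tag],
    ]


if __name__ == "__main__":
    for command in plan_release_commands():
        print(shlex.join(command))
```

Printing the plan first makes it easy to check the commands against your own workflow before running any of them by hand.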

Please, feel free to disagree if you don't like this approach.
We can discuss these issues and come up with better solutions if you feel it's needed.
Also, I'm glad to help you if you have any doubts when applying these solutions.

OK - so the idea of the paper branch was that, once this process is complete, we would like to have a binder-addressable branch with a dockerfile that people can use to reproduce / rerun the paper for the relevant release of the code. We thought it would also be useful in the review process but probably overlooked the release requirements of joss in setting that up.

This is slightly different from the idea of tagging and releasing an immutable snapshot. I have a similar workflow in a data-interpretation paper in which there is a binder version of the workflow available in a persistent branch that reflects exactly the version we published. The master branch will also include updates to the model as new data are received and re-processed.

Regardless of that motivation, the workflow you propose is also fine. Any branches we make for binder or a successor product do not have to interfere with the joss workflow and we can do that sort of thing later. At the moment, everything is in sync across both branches.

The only thing that remains to fix is the question of the release ... and there I am confused since we have already merged a couple of your suggestions into the master branch ! Any requests ?

Thanks for enlightening me on your git workflow.
Because git is a very powerful tool, it opens up the possibility of very different workflows.
Nevertheless there's one absolute truth behind it: the best repo management strategy is the one that best fits your needs.
It seems to me that the persistent branch approach is not very well suited to your needs.
I think this approach amounts to using branches as a versioning tool: having a dedicated branch for binder and another one for the reviewed paper means using branches to point to different versions of the same repository.

Branches are not well suited to this purpose, whereas tags, as immutable snapshots, are a great tool for packaging, change logging, bug tracking and debugging.
This is the most common way to maintain git repositories, so most online tools are built to expect this kind of workflow.
For example, if you want to create a binder based on a specific release of your repo, you can do it by specifying that release's tag. So for a repo with a single master branch you could have one binder version for each release, plus a binder for the latest commit on the branch.
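To illustrate, here is a minimal Python sketch that builds mybinder.org launch links for a release tag and for the branch head. It assumes the usual `https://mybinder.org/v2/gh/<org>/<repo>/<ref>` link pattern; the function name is hypothetical, and refs containing special characters (such as `/`) are not handled here.

```python
def binder_url(org, repo, ref, base="https://mybinder.org"):
    """Build a Binder launch URL for a GitHub repo at a given ref.

    `ref` may be a tag, a branch name or a commit hash.
    """
    return f"{base}/v2/gh/{org}/{repo}/{ref}"


if __name__ == "__main__":
    # One link for an existing release tag, one for the latest commit on master.
    for ref in ["v0.7.0b", "master"]:
        print(binder_url("underworldcode", "stripy", ref))
```

Each release would then get its own stable link, while the `master` link always follows the most recent commit.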

I don't have much experience with Docker, but from what I've read you can do something similar with it.
You can create one Docker image for each release, and even images for untagged commits, using the commit hash as the Docker tag.

The only thing that remains to fix is the question of the release ... and there I am confused since we have already merged a couple of your suggestions into the master branch ! Any requests ?

I've seen that you added the LICENCE file to the master branch instead of adding it to the paper branch. So I'm a little bit confused about how your master and paper branches differ.

In cases like this I always try to keep things as simple as possible. I don't see that the master branch contains significant changes compared with the paper branch, so one easy solution is to merge the paper branch into the master branch. Then you could delete the paper branch and continue working on master.

So, the simplest workflow is to have a single master branch and push new branches only for opening Pull Requests (or for storing experimental code).
After the paper is accepted you could create a new release, which will be an immutable version of the repo, i.e. the one that has been reviewed on JOSS.

For now, I would not worry about releasing, but I would try to tidy up the branches before making more modifications.
If you want I can make a PR for this merge, so GitHub can tell us in advance if there's any conflict between them.

I am in the middle of doing that. I will get rid of the paper branch ASAP. Binder doesn't document its API very well for people like me, but I dug through their examples more carefully and discovered that we can put the required Dockerfile in a subdirectory. This was why I originally had a separate branch: the need to have a Dockerfile at the root level (ugly and incompatible with other parts of the workflow). So, having fixed that, I can implement all of the review suggestions here. I will do so, test and then close.

Binder now runs from the master branch and I have created a dev branch for any work we do towards the next release while the JOSS review is under way.

Great! 👍
Thanks for taking my proposals into account, hope they make the repo management easier.

Just want to note that having several files in the repository root is very common (see verde, simpeg or yellowbrick). So don't worry if it looks ugly.
In fact, I think it's better for every part of the repo (code, web services, continuous integration) to work in the simplest way possible rather than to have files in non-standard locations.
That said, this Dockerfile configuration is working, so we should leave it as it is.