astropy/astropy-project

rename master branch to main for the core package and coordinated packages

eteq opened this issue · 34 comments

eteq commented

The actual implementation might need to be repo-level issues, but I think the agreement on this should come at a Project level, hence why I'm making this issue in this repo.

The default git name master is very problematic language that causes active harm and is anti-inclusive - you can see more about this from git itself or any of a variety of other resources I can link if there's debate about this.

However, assuming others agree with this as a goal, the issue here is to actually work out the mechanics of this. Just changing the name without doing anything else is problematic because it breaks pending PRs, local workflows, etc. I think github and git are working on making this easier (@astrofrog mentioned something about this that perhaps he can link here), but this issue is for working out those details.

GitHub's guidance at https://github.com/github/renaming suggests waiting until later this year as they will provide tools to make it a lot easier to migrate existing repositories. But I agree we should do this once the tools are available.

eteq commented

Re-visiting this topic since we got through "later this year" - looking at https://github.com/github/renaming, it looks like they've implemented what they're going to implement, which is all the internal github things, but of course there's no definitive way to update it in user environments. So there's still a hump there, but I think it's as far as can be done automatically.

So if we were to start rolling this out, here's what I think would have to happen:

  1. rename master to main using the new fancy github thingie, which would update PRs and the like.
  2. update all documentation that references master and change it to main
  3. Anyone who has already cloned astropy will need to update their local clone's master to point to main. It looks like once we do the rename github may at least have a thing that pops up that gives instructions on how to do this, although I haven't tested any of that.
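For step 2, one way to get a first list of places to update is a whole-word search over tracked files. This is only a sketch: the helper name `find_old_branch_refs` and the list of file extensions are assumptions made for illustration, and the hits still need a human look because not every "master" refers to the branch.

```shell
# List tracked files that still mention the old branch name as a whole word.
# Run from inside a git work tree; prints "file:line:text" for each match.
# NOTE: the function name and the extension list are illustrative assumptions.
find_old_branch_refs() {
    old_branch="${1:-master}"
    # -n: show line numbers, -w: whole-word match, -I: skip binary files.
    # git grep exits non-zero when nothing matches (or on error), so "|| true"
    # keeps an empty result from looking like a failure.
    git grep -n -w -I -e "$old_branch" -- \
        '*.rst' '*.md' '*.yml' '*.yaml' '*.cfg' '*.toml' || true
}

# Usage (from the repository root):
#   find_old_branch_refs master
```

Because `-w` matches whole words only, a string such as "webmaster" is not reported.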

We should probably try this out in some lower-traffic repo first. I'll sniff around for some options (possibly not in astropy to start with) and report back if I learn anything new.

pllim commented

@eteq , I feel like we need to provide memos to all the usual comm channels before actually implementing it, so no one is taken by surprise; and if they still are surprised, we can say, "lookie here, here, and here"

eteq commented

Definitely, @pllim ! Good point, that should come before we do any of the steps I said above. I think after I've done my investigation I'll draft something for astropy-dev since that might prompt further discussion than this thread.

eteq commented

As an update here: I transitioned several repos that are "institution-owned" following the github process (spacetelescope/jdaviz, spacetelescope/jdat_notebooks, spacetelescope/dat_pyinthesky), and it was relatively painless. To summarize the process:

  1. Notify developers that it's happening
  2. Make a PR updating the README with some instructions for the local changes required (I have some suggested wording below) and any references to master in docs and/or CI files. (approximately just a find-and-replace in the repo)
  3. Go into the github "branch" tab and do the renaming (currently a "pencil"/edit icon, not the two-arrows)
  4. merge the PR and send out any needed notifications of "it's done"
  5. Have @pllim point out you forgot some items in # 2 😉, and merge that PR
  6. (~1 year later?) remove the README instructions on the assumption that by then

Here's a possible example of README text:

## If you locally cloned this repo before 5 Feb 2021

The primary branch for this repo has been transitioned from ``master`` to ``main``.  If you have a local clone of this repository and want to keep your local branch in sync with this repo, you'll need to do the following in your local clone from your terminal:
```
git branch -m master main
git fetch origin
git branch -u origin/main main
```
If you are using a GUI to manage your repos you'll have to find the equivalent commands as it's different for different programs. Alternatively, you can just delete your local clone and re-clone!
eteq commented

It was pointed out out-of-band by @bsipocz that the fetch above should include --prune. Also that an alternative is to just say only do git fetch --prune origin on the theory that using the local main/master is confusing/bad anyway.

So updated version:

## If you locally cloned this repo before 5 Feb 2021

The primary branch for this repo has been transitioned from ``master`` to ``main``.  If you have a local clone of this repository and want to keep your local branch in sync with this repo, you'll need to do the following in your local clone from your terminal:
```
git fetch --prune origin
# you can stop here if you don't use your local "master"/"main" branch
git branch -m master main
git branch -u origin/main main
```
If you are using a GUI to manage your repos you'll have to find the equivalent commands as it's different for different programs. Alternatively, you can just delete your local clone and re-clone!
pllim commented

FYI -- In the infrastructure/DevOps tag-up today, we have decided to push this forward for the Project, with @eteq leading the effort.

mhvk commented

Nice! Note that numpy already moved. It meant a few PRs broke, but it seems those were abandoned at the level of the user no longer having the repository - see numpy/numpy#18543

I think it's less likely here to run into those PRs, as we bot close the old ones anyway.

The email went out, so this is water under the bridge, but I think the commands above should use astropy rather than origin, assuming people follow our dev guide and name their remotes based on github usernames/org names, etc rather than leaving the default. No big deal, but sticking to our preached best practices would never hurt.

Agree with @bsipocz on the naming (astropy not origin), I had exactly the same thought and went to check the dev guide on how we are recommending to people to name their repos. Personally I use upstream and had thought that upstream is a more standard naming convention for our git workflow, but honestly I don't know. In any case, origin is probably the wrong answer here.

pllim commented

I opened issues on the repos in the org, except for https://github.com/astropy/old-astropy-website-src -- That one does not have "Issues" enabled.

pllim commented

@adrn also mentions https://github.com/dfm/rename-github-default-branch by @dfm that renames the branch programmatically. There is also SO thread https://stackoverflow.com/questions/52776313/set-github-default-branch-through-api-call . Of course, you need the necessary permission on the repos to run this.
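For reference, a programmatic rename can be sketched with `curl`. This is not how the linked tool works internally; it assumes the GitHub REST API "rename a branch" endpoint (`POST /repos/{owner}/{repo}/branches/{branch}/rename` with a `new_name` field), and it assumes a token with sufficient rights is stored in an environment variable called `GITHUB_TOKEN`. The function names and `some-repo` are hypothetical.

```shell
# Build the REST URL for renaming a branch (assumed endpoint, see the text above).
rename_branch_url() {
    printf 'https://api.github.com/repos/%s/%s/branches/%s/rename\n' "$1" "$2" "$3"
}

# Rename branch OLD to NEW in OWNER/REPO.
# GITHUB_TOKEN is an assumed variable name; the token needs admin rights on the repo.
rename_branch() {
    owner="$1"; repo="$2"; old="$3"; new="$4"
    curl --fail --silent --show-error \
        -X POST \
        -H "Accept: application/vnd.github+json" \
        -H "Authorization: Bearer ${GITHUB_TOKEN:?set GITHUB_TOKEN first}" \
        -d "{\"new_name\": \"${new}\"}" \
        "$(rename_branch_url "$owner" "$repo" "$old")"
}

# Usage (some-repo is a placeholder):
#   rename_branch astropy some-repo master main
```

Defining the functions has no side effects; nothing is sent until `rename_branch` is called.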

mhvk commented

Indeed, I also use upstream and origin for my own - which is handy if you work with many different projects (one less thing to remember!). Anyway, not super important!

Indeed, upstream and origin are the standard ones, but it can be confusing which one is which, especially once you start to work with multiple forks. So I think it's a good suggestion that we recommend renaming things so they immediately look the same as they do on github (btw, that part of the devdocs helped me a lot at the very beginning to pick up the logic of git). I stick to this logic for all projects I work on, and have shell shortcuts to make it quicker to add remotes.

And to demonstrate how confusing upstream/origin are, my comment above has the mistake of calling origin astropy while it should be the github username. But there is also another mistake in Erik's commands, namely the fetch should be for upstream rather than origin. Out-of-band though I suggested using git fetch --all --prune, which circumvents the issue altogether :)

eteq commented

Whether upstream vs origin is right depends on how it was cloned - in my experience most users who wouldn't know better (i.e. those who need the instructions) clone from the "upstream" because they may not have a fork. But @bsipocz is absolutely right that --all is a better solution anyway! So the updated version:

So updated markdown version:

## If you locally cloned this repo before 5 Feb 2021

The primary branch for this repo has been transitioned from ``master`` to ``main``.  If you have a local clone of this repository and want to keep your local branch in sync with this repo, you'll need to do the following in your local clone from your terminal:
```
git fetch --all --prune
# you can stop here if you don't use your local "master"/"main" branch
git branch -m master main
git branch -u origin/main main
```
If you are using a GUI to manage your repos you'll have to find the equivalent commands as it's different for different programs. Alternatively, you can just delete your local clone and re-clone!

and an RST version:


If you locally cloned this repo before 10 Mar 2021
--------------------------------------------------

The primary branch for this repo has been transitioned from ``master`` to ``main``.  If you have a local clone of this repository and want to keep your local branch in sync with this repo, you'll need to do the following in your local clone from your terminal::

   git fetch --all --prune
   # you can stop here if you don't use your local "master"/"main" branch
   git branch -m master main
   git branch -u origin/main main

If you are using a GUI to manage your repos you'll have to find the equivalent commands as it's different for different programs. Alternatively, you can just delete your local clone and re-clone!
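The last command in the snippets above still assumes the canonical remote is called `origin`. A sketch that avoids that assumption is to look the remote up by URL first. The function names here are hypothetical, and the lookup is a plain substring match on the fetch URL, so a slug like `astropy/astropy` would also match `astropy/astropy-APEs`; treat it as an illustration rather than a robust tool.

```shell
# Print the name of the first remote whose fetch URL contains the given
# "org/repo" slug (plain substring match, so keep the slug specific).
find_remote_for() {
    slug="$1"
    git remote -v |
        awk -v slug="$slug" '$3 == "(fetch)" && index($2, slug) { print $1; exit }'
}

# Rename the local branch and make it track the renamed branch on whichever
# remote points at the canonical repository.
switch_to_main() {
    remote="$(find_remote_for "$1")"
    [ -n "$remote" ] || { echo "no remote matches $1" >&2; return 1; }
    git fetch --prune "$remote" &&
    git branch -m master main &&
    git branch -u "$remote/main" main
}

# Usage (inside an existing clone):
#   switch_to_main astropy/astropy
```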

eteq commented

One nice thing I realized in astropy/astropy-APEs#66 : links to the "old" name still work, they just redirect. So e.g. https://github.com/astropy/astropy-APEs/blob/master/APE8.rst is not broken, but rather now redirects to https://github.com/astropy/astropy-APEs/blob/main/APE8.rst (albeit with a banner that says the name was changed)

Oh, the redirect is great, that means @pllim could just go ahead and merge most of the PRs she opened yesterday as there are no circular dependencies any more :)

I am totally lost when you say that using your local main/master is "bad". It is basically not possible to do astropy development without frequently referring to the main branch. This is the branch point for every single PR that I do, and I constantly do the equivalent of git pull upstream master. So what is our disconnect here?

pllim commented

Anyone knows how long the redirect would work?

pllim commented

> go ahead and merge most of the PRs she opened yesterday

😅 I merged some for the infrastructure stuff. For the rest, would still be nice for someone else to review, just in case.

> I am totally lost when you say that using your local main/master is "bad". It is basically not possible to do astropy development without frequently referring to the main branch.

main/master as on the fork. You should refer to the main branch on the "canonical" version from the astropy org, but should not refer to the main on your fork as it's outdated by default, and in fact a bit of a pain to keep up-to-date. And there were multiple cases where it was the source of extreme git pain with rebases (people ended up with doubled commits after a rebase as they were referring to the master of a fork rather than the "central" repo, etc.). I say "canonical"/"central" as I have no better word for it, after all git is fully decentralized, we just use it in a way that there is one reference, central version that lives under the astropy org.
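In practice that means branching from, and rebasing onto, the remote-tracking branch of the canonical repository rather than the fork's copy of main. A minimal sketch follows, with hypothetical function names; the remote name is passed in as an argument because it depends on personal setup (`upstream`, `astropy`, or `origin`).

```shell
# Start a feature branch from the canonical repository's main branch,
# regardless of the state of the fork's (or the local) main.
new_branch_from_canonical() {
    remote="$1"; branch="$2"
    git fetch "$remote" &&
    git checkout -b "$branch" --no-track "$remote/main"
}

# Bring the current feature branch up to date against the canonical main.
rebase_on_canonical() {
    remote="$1"
    git fetch "$remote" &&
    git rebase "$remote/main"
}

# Usage:
#   new_branch_from_canonical upstream my-feature
#   rebase_on_canonical upstream
```

`--no-track` keeps the new branch from being configured to push to the canonical repository by default.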

> Anyone knows how long the redirect would work?

worth asking at the helpdesk, but given that they keep the references to hashes of commits that are not on branches any more, I would not be surprised if this would stay on forever as a side effect of the rename (and if so, this is yet another reason to do it with the rename tool rather than creating a new "main" branch and change it to default).

Slightly OT (I can open a new issue), but some googling on "github fork workflow" showed that in the top 4 search hits that specifically named the "canonical" version, that remote was always called upstream. For instance: https://www.atlassian.com/git/tutorials/comparing-workflows/forking-workflow

There is also https://guides.github.com/activities/forking/ from GitHub that calls the canonical version "upstream" (without saying this is a remote name). From my perspective upstream is widely regarded as the name for the canonical version, so can we just use that? This would apply to our dev docs (which is clearly a separate issue).

And indeed numpy uses upstream right away: https://numpy.org/doc/stable/dev/#development-process-summary.

pllim commented

Let's move the discussions about remote naming to astropy/astropy#11383 .

FWIW, personally, I have always used upstream but I also know some people who actively changed it to other names like spacetelescope or astropy because they found upstream too cryptic. 🤷

Yes, I much prefer to use spacetelescope or astropy, after all we have seen repos moved between users/organizations, what happens to upstream then?

(think about wcsaxes or reproject that started in a personal repo, then moved to astropy with the original person forking it. If I added it as upstream, that upstream would keep pointing to the personal fork, which is not "canonical" any longer. This happens even more for new experimental repos, e.g. the places I helped out a lot with packaging/CI/etc, and of course basically never affects big libraries. Yes, the advantage of using the word upstream is that you can use the same word in all the documentations everywhere, but in the real workflow it can easily become confusing).

pllim commented

When the canon repo moves around, I go into my local clone and do a git remote set-url upstream <new_url>, then I don't have to remember whether the canon copy now lives at spacetelescope or astropy.
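As a small sketch of that approach (the function name and the URLs in the usage line are placeholders, not real repository locations):

```shell
# Point an existing remote at a new URL and print the URL now in effect,
# so the change can be checked at a glance.
repoint_remote() {
    remote="$1"; new_url="$2"
    git remote set-url "$remote" "$new_url" &&
    git remote get-url "$remote"
}

# Usage (placeholder URL):
#   repoint_remote upstream https://github.com/new-org/some-repo.git
```

The remote keeps its name, so existing branch tracking configuration and habits such as `git pull upstream main` carry over unchanged.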

I don't think there is a right or wrong way, just everyone has their own preference.

saimn commented

I've always used upstream as well, it's much simpler as I don't need to think about what the upstream name is; I can just pull from upstream in any fork I have. And if upstream moves then I update its URL, and it's still upstream. Having multiple upstream repos is unusual. Also upstream is the most widespread recommendation, so recommending something else can be confusing for people that have been contributing to other projects. Or when different projects rename their master branch.

Yeah, but at the end of the day you need to know it when you open the PR 🤷‍♀️

Also, I clearly have no issues when it's personal preference, but I think it's worth knowing the use cases where the other approach may work better (I really just had to do way too many rebases for others that were due to this origin+upstream+one more fork confusion)

I make sure I always clone from the "upstream" repository and keep its name as "origin". (Usually, I do that clone before I even have my own fork anyway). I then add my own fork of the project with some canonical name (I use "github" or "bitbucket"). But, I agree, different workflows work, so it's good to be consistent within our documentation. People who use different personal conventions hopefully know enough to know how to modify the instructions for their situation.

pllim commented

To close the loop on how long the master -> main redirect would work, this is the official reply from GitHub:


GitHub has made changes and introduced features to support projects and maintainers that want to rename their default branch. Web requests for the old branch name will now be redirected to the new branch name for as long as such links are used within your projects. You can read more about the feature release here:

https://github.blog/changelog/2020-07-17-links-to-deleted-branches-now-redirect-to-the-default-branch/

pllim commented

I think this is done, right? Did we miss anything?

pllim commented

reproject now has main.