[meta] Rename to awesome-data, merge @datopian/awesome-data and more
rufuspollock opened this issue · 11 comments
We want to merge @datopian/awesome-data into this repo and rename this to awesome-data
Note: @ is just a prefix to designate an org on GitHub.
Acceptance
Desired result:
- @datasets/awesome-data exists (as a relocated version of @datasets/registry, so we keep ⭐ and the issue tracker)
- It has 191 ⭐
- It has all issues (open and closed) from @datasets/registry with their labels
- It has all (open) issues in @datopian/awesome-data (without their labels)
- Contents of repo (files) are from @datopian/awesome-data
- @datasets/core-datasets exists with the repo contents of current @datopian/awesome-data and ⭐ 37
- @datopian/awesome-data no longer exists (it won't as it will have moved!)
- @datopian/awesome-data redirects to @datasets/core-datasets
- datahub.io/collections still working and now pulling from @datasets/awesome-data
- Edit links work @svetozarstojkovic
Tasks
Preliminaries
- @rufuspollock Grant whomever is doing this admin rights in both locations @datasets and @datopian
- Check how we reconfigure datahub to use new repo locations (see below)
- Can we move issues in bulk? (10m research)
- Github has function to move 1 by 1 but no bulk ...
- Worst case we do one by one
- Clone the repos locally so we have a back-up (plus needed for later)
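A minimal sketch of the back-up step, assuming mirror clones are wanted and that a local folder called `backup` is acceptable (the folder name and target paths are hypothetical, not from this issue). It only builds the `git clone --mirror` commands; running them is left to the operator.

```python
from typing import List

# The two repositories named in this issue.
REPOS = ["datasets/registry", "datopian/awesome-data"]


def backup_commands(repos: List[str], backup_dir: str = "backup") -> List[List[str]]:
    """Build one `git clone --mirror` command per repository.

    `backup_dir` is a hypothetical local folder name (an assumption).
    """
    commands = []
    for full_name in repos:
        owner, name = full_name.split("/")
        url = f"https://github.com/{full_name}.git"
        # --mirror copies every ref (all branches and tags), not only the default branch.
        target = f"{backup_dir}/{owner}-{name}.git"
        commands.append(["git", "clone", "--mirror", url, target])
    return commands


if __name__ == "__main__":
    for cmd in backup_commands(REPOS):
        # To execute instead of print: subprocess.run(cmd, check=True)
        print(" ".join(cmd))
```

Note that a git clone backs up repository contents only; issues and ⭐ live on GitHub and are not part of the clone.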
The move
- Move @datasets/registry to @datasets/awesome-data
- Move @datopian/awesome-data to @datopian/core-datasets (in prep for the move to @datasets, otherwise we get name collisions) and then move it to @datasets/core-datasets (i.e. move org)
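The order of the move steps above can be sketched as a list of GitHub REST API requests. This is only an illustration: the same renames and transfer can be done from each repo's settings page, and nothing here says which route is used. The sketch relies on the "update a repository" endpoint (`PATCH /repos/{owner}/{repo}` with `name`) and the "transfer a repository" endpoint (`POST /repos/{owner}/{repo}/transfer` with `new_owner`); the helper names are hypothetical, and no request is actually sent.

```python
from typing import Dict, List, Tuple

API = "https://api.github.com"

# (HTTP method, URL, JSON body)
Request = Tuple[str, str, Dict[str, str]]


def rename_request(owner: str, repo: str, new_name: str) -> Request:
    """Describe a rename within the same org."""
    return ("PATCH", f"{API}/repos/{owner}/{repo}", {"name": new_name})


def transfer_request(owner: str, repo: str, new_owner: str) -> Request:
    """Describe a transfer of the repo to another org."""
    return ("POST", f"{API}/repos/{owner}/{repo}/transfer", {"new_owner": new_owner})


def move_plan() -> List[Request]:
    """Requests in the order given in the task list."""
    return [
        # 1. registry becomes awesome-data inside @datasets.
        rename_request("datasets", "registry", "awesome-data"),
        # 2. Rename first, so the transfer does not collide with
        #    the new @datasets/awesome-data.
        rename_request("datopian", "awesome-data", "core-datasets"),
        # 3. Move org: @datopian -> @datasets.
        transfer_request("datopian", "core-datasets", "datasets"),
    ]


if __name__ == "__main__":
    for method, url, body in move_plan():
        print(method, url, body)
```

Sending these would need an authenticated HTTP client and admin rights in both orgs, which is why the admin grant is listed under Preliminaries.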
Moving issues:
- Take screen shots of issue list (3 pages) with current issue labels
- Move all the open issues from @datasets/core-datasets to @datasets/awesome-data
- Don't worry about labels for now as not systematically used we can add back "later"
Updating repo contents - we want to swap over repo contents now (as easier than swapping issues!)
- Force push contents of old @datasets/registry to @datasets/core-datasets
- Force push contents of old @datopian/awesome-data to @datasets/awesome-data
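A rough sketch of the content swap, assuming local clones of the two old repos exist at the hypothetical paths below (the paths are made up for illustration). It builds one `git push --force` command per swap and pushes all branches; it does not run anything.

```python
from typing import Dict, List

# Local clone of the OLD repo -> repo that should receive its contents.
# The local paths are hypothetical; the targets follow the task list.
SWAPS: Dict[str, str] = {
    "backup/datasets-registry.git": "datasets/core-datasets",
    "backup/datopian-awesome-data.git": "datasets/awesome-data",
}


def force_push_commands(swaps: Dict[str, str]) -> List[List[str]]:
    """Build one force-push command per (local clone, target repo) pair."""
    commands = []
    for local_clone, target in swaps.items():
        url = f"https://github.com/{target}.git"
        commands.append([
            "git", "-C", local_clone,      # run git inside the local clone
            "push", "--force", url,
            "refs/heads/*:refs/heads/*",   # every branch, overwriting the remote ones
        ])
    return commands


if __name__ == "__main__":
    for cmd in force_push_commands(SWAPS):
        # To execute instead of print: subprocess.run(cmd, check=True)
        print(" ".join(cmd))
```

A force push overwrites the history on the target, so it is worth double-checking each pair before running the commands.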
Afterwards
- Reconfigure datahub to run off new location @svetozarstojkovic
- where is this configured?
- Change it
- Deploy
- Close this issue @svetozarstojkovic
Context
We have two issue trackers / repos serving similar purposes
- datasets/registry (132 / 78)
- awesome-data (50 / 5)
awesome-data/issues and datasets/registry/issues are pretty much identical in function and purpose.
Originally datasets/registry was "core datasets". Over time, we've added ideas for interesting topics or datasets even if not core. Core datasets will continue to be a major focus.
Repo contents is a little different:
- datasets/registry: files and scripts for publishing core datasets to the DataHub.io (and tracking that)
- awesome-data: topics / collections items which are then published on datahub.io/collections
Plan would be to move datasets/registry to datasets/awesome-data and merge in @datopian/awesome-data (why this way round? more ⭐ on registry plus more contents)
@rufuspollock
In the acceptance criteria (item 2):
@datasets/core-datasets exists with the repo contents of current @datasets/registry and ⭐️ 39
There is no @datasets/registry currently existent and it does not have ⭐️ 39. So did you mean @datopian/awesome-data?
And thus the acceptance criteria item 2 becomes:
- @datasets/core-datasets exists with the repo contents of current @datopian/awesome-data and ⭐️ 37
Analysis
repo | exists | stars |
---|---|---|
datasets/awesome-data | No | 191 |
datasets/registry | Yes | 191 |
datasets/core-datasets | No | 39 |
datopian/awesome-data | Yes | 37 |
Research: Move issues in bulk
Bulk move of issues is not possible in GitHub, found 2 tools (nr.1 best):
- https://github.com/ahadik/git_mover
Merging repositories. If you want to combine issues from multiple repositories into a single one, this tool does its best to handle name clashes where they matter. It'll even keep assignees on issues if that user is found on the source and destination repo.
For more detailed info see: http://www.alexhadik.com/blog/2016/5/26/migrating-github-repositories-with-gitmover
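For reference, a hand-rolled alternative would be to re-create each open issue through the GitHub REST API (list with `GET /repos/{owner}/{repo}/issues`, create with `POST /repos/{owner}/{repo}/issues`). The sketch below only covers the payload-building part and is not how git_mover itself is implemented; the function name and the "Moved from" note are assumptions. Re-created issues get a new author and lose their comments, unlike GitHub's one-by-one transfer.

```python
from typing import Optional


def to_new_issue(issue: dict, source_repo: str, keep_labels: bool = False) -> Optional[dict]:
    """Turn one item from the issue list API into a create-issue payload.

    Returns None for pull requests, which the issue list API also returns
    (they carry a "pull_request" key).
    """
    if "pull_request" in issue:
        return None
    body = issue.get("body") or ""
    # Hypothetical back-reference so the origin of the issue stays traceable.
    origin = f"Moved from {source_repo}#{issue['number']}"
    payload = {
        "title": issue["title"],
        "body": f"{origin}\n\n{body}".rstrip(),
    }
    if keep_labels:
        # The create endpoint accepts label names.
        payload["labels"] = [label["name"] for label in issue.get("labels", [])]
    return payload


if __name__ == "__main__":
    sample = {"number": 7, "title": "Add X", "body": "details", "labels": [{"name": "idea"}]}
    print(to_new_issue(sample, "datasets/core-datasets"))
    print(to_new_issue(sample, "datasets/core-datasets", keep_labels=True))
```

The `keep_labels` switch defaults to off to match the "don't worry about labels for now" task, and can be turned on if labels should be carried over.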
@rufuspollock For the move from @datopian/awesome-data to @datopian/core-datasets to @datasets/core-datasets, I selected the following for Team Access rights, is that ok?:
@rufuspollock
In the acceptance criteria (item 2):
@datasets/core-datasets exists with the repo contents of current @datasets/registry and ⭐️ 39
There is no @datasets/registry currently existent and it does not have ⭐️ 39. So did you mean @datopian/awesome-data?
And thus the acceptance criteria item 2 becomes:
- @datasets/core-datasets exists with the repo contents of current @datopian/awesome-data and ⭐️ 37
Analysis
repo | exists | stars |
---|---|---|
datasets/awesome-data | No | 191 |
datasets/registry | Yes | 191 |
datasets/core-datasets | No | 39 |
datopian/awesome-data | Yes | 37 |
It must be so, so I've gone ahead and implemented the change. Will already update this in acceptance criteria as well
Situation
In acceptance criteria it says:
It has all (open) issues in @datopian/awesome-data (without their labels)
In tasks it says:
Move all the open issues from @datasets/core-datasets to @datasets/awesome-data
Don't worry about labels for now as not systematically used we can add back "later"
Question
Would you like to have labels there, or do you specifically not want labels there?
Hypothesis
Yes, move labels. I think it was assumed that labels should not be moved only because it was a lot of work.
@svetozarstojkovic Do you know where to configure where datahub.io/collections pulls data from?
It should be reconfigured so that it now pulls from datasets/awesome-data instead of datopian/awesome-data (it redirects at the moment so it still works but should probably change it).
@glgoose this item is ticked
Contents of repo (files) are from @datopian/awesome-data
As is
@datasets/core-datasets exists with the repo contents of current @datopian/awesome-data
But this is not the case afaict. The idea was to force push over the current repo with the relevant contents. Have you done that?
@glgoose I've now fixed this by force pushing backup clones of these 2 old repos (specifically made in case something went wrong like this!)
@svetozarstojkovic I've also fixed the code on the datahub.io frontend - can you get this redeployed asap, as pages are 404'ing atm: https://datahub.io/collections/bibliographic-data
@rufuspollock
Done, please check it out.
FIXED. All done.