datasets/awesome-data

[meta] Rename to awesome-data, merge @datopian/awesome-data and more

rufuspollock opened this issue · 11 comments

We want to merge @datopian/awesome-data into this repo and rename this to awesome-data

Note @ is just a prefix to designate an org on GitHub.

Acceptance

Desired result:

Tasks

Preliminaries

  • @rufuspollock Grant whoever is doing this admin rights in both locations: @datasets and @datopian
  • Check how we reconfigure datahub to use new repo locations (see below)
  • Can we move issues in bulk? (10m research)
    • GitHub has a function to move issues one by one but no bulk ...
    • Worst case we do one by one
  • Clone the repos locally so we have a back-up (plus needed for later)
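
A minimal sketch of the backup step, assuming the two repos named in this issue are cloned over HTTPS. The `backup` directory name and the use of `git clone --mirror` (which keeps all branches and tags) are my assumptions, not something the tasks specify. The script only builds and prints the commands; `run_all` is there if you want to execute them.

```python
import subprocess
from pathlib import PurePosixPath

# Repos named in the task list. The backup directory name is hypothetical.
REPOS = ["datasets/registry", "datopian/awesome-data"]


def backup_commands(repos, dest_dir="backup"):
    """Build one `git clone --mirror` command per repo (nothing is executed)."""
    commands = []
    for repo in repos:
        # e.g. datasets/registry -> backup/datasets__registry.git
        target = PurePosixPath(dest_dir) / (repo.replace("/", "__") + ".git")
        commands.append(
            ["git", "clone", "--mirror", f"https://github.com/{repo}.git", str(target)]
        )
    return commands


def run_all(commands):
    """Execute the commands in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


for argv in backup_commands(REPOS):
    print(" ".join(argv))
```

Calling `run_all(backup_commands(REPOS))` would perform the clones; do this before renaming anything so the backups reflect the pre-move state.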

The move

  • Move @datasets/registry to @datasets/awesome-data
  • Move @datopian/awesome-data to @datopian/core-datasets (in prep for the move to @datasets, otherwise we get name collisions) and then move it to @datasets/core-datasets (i.e. move org)
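
The same moves can also be scripted instead of clicked through in the web UI. Below is a rough sketch, assuming the GitHub REST API is used: `PATCH /repos/{owner}/{repo}` with a new `name` renames a repo, and `POST /repos/{owner}/{repo}/transfer` with `new_owner` moves it to another org. The helper names are hypothetical, and the script only prints the planned requests. Sending them would need an admin token for both orgs.

```python
API = "https://api.github.com"


def rename_request(owner, repo, new_name):
    """Describe a rename: PATCH /repos/{owner}/{repo} with a new `name`."""
    return ("PATCH", f"{API}/repos/{owner}/{repo}", {"name": new_name})


def transfer_request(owner, repo, new_owner):
    """Describe an org move: POST /repos/{owner}/{repo}/transfer."""
    return ("POST", f"{API}/repos/{owner}/{repo}/transfer", {"new_owner": new_owner})


def plan_move():
    """Requests in the order given in the task list above."""
    return [
        # 1. @datasets/registry -> @datasets/awesome-data
        rename_request("datasets", "registry", "awesome-data"),
        # 2. Rename first so the name does not collide inside @datasets ...
        rename_request("datopian", "awesome-data", "core-datasets"),
        # 3. ... then move the renamed repo to the @datasets org.
        transfer_request("datopian", "core-datasets", "datasets"),
    ]


for method, url, body in plan_move():
    print(method, url, body)
```

The order matters: the rename in step 2 has to happen before the transfer in step 3, otherwise @datasets would end up with two repos called awesome-data.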

Moving issues:

  • Take screenshots of issue list (3 pages) with current issue labels
  • Move all the open issues from @datasets/core-datasets to @datasets/awesome-data
    • Don't worry about labels for now: they are not systematically used, and we can add them back "later"
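
If moving issues one by one in the UI gets tedious, one possible alternative is GitHub's GraphQL `transferIssue` mutation, called once per issue. This is only a sketch under assumptions: the node IDs below are placeholders (real ones would have to be looked up first), and each payload would be POSTed to `https://api.github.com/graphql` with a token. The code only builds the payloads.

```python
# GraphQL mutation that moves a single issue to another repository.
TRANSFER_ISSUE = """
mutation($issueId: ID!, $repositoryId: ID!) {
  transferIssue(input: {issueId: $issueId, repositoryId: $repositoryId}) {
    issue { url }
  }
}
"""


def transfer_payloads(issue_node_ids, target_repo_node_id):
    """Build one GraphQL request body per issue to transfer."""
    return [
        {
            "query": TRANSFER_ISSUE,
            "variables": {"issueId": issue_id, "repositoryId": target_repo_node_id},
        }
        for issue_id in issue_node_ids
    ]


# Placeholder node IDs, purely for illustration.
payloads = transfer_payloads(["ISSUE_NODE_ID_1", "ISSUE_NODE_ID_2"], "TARGET_REPO_NODE_ID")
print(len(payloads), "transfer requests prepared")
```

This is still one request per issue, so it automates the "one by one" worst case rather than being a true bulk move.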

Updating repo contents - we want to swap over repo contents now (as that is easier than swapping issues!)

  • Force push contents of old @datasets/registry to @datasets/core-datasets
  • Force push contents of old @datopian/awesome-data to @datasets/awesome-data
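
A sketch of the two force pushes, assuming local backup clones of the old repos exist. The backup paths are hypothetical, and `--force --all` (overwrite every branch on the target) is my choice of flags rather than something the tasks specify. The script prints the commands instead of running them, since a force push is destructive.

```python
# (local backup clone, target repo) pairs from the task list.
# The local paths are hypothetical placeholders.
PAIRS = [
    ("backup/datasets__registry.git", "datasets/core-datasets"),
    ("backup/datopian__awesome-data.git", "datasets/awesome-data"),
]


def force_push_commands(pairs):
    """Build one `git push --force --all` command per (clone, target) pair."""
    return [
        ["git", "-C", local_clone, "push", "--force", "--all",
         f"https://github.com/{target_repo}.git"]
        for local_clone, target_repo in pairs
    ]


for argv in force_push_commands(PAIRS):
    print(" ".join(argv))
```

Double-check each source/target pair before running anything: after the renames, the old @datasets/registry contents belong in @datasets/core-datasets and the old @datopian/awesome-data contents in @datasets/awesome-data.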

Afterwards

Context

We have two issue trackers / repos serving similar purposes

  • datasets/registry (132 / 78)
  • awesome-data (50 / 5)

awesome-data/issues and datasets/registry/issues are pretty much identical in function and purpose.

Originally datasets/registry was "core datasets". Over time, we've added ideas for interesting topics or datasets even if not core. Core datasets will continue to be a major focus.

Repo contents are a little different:

  • datasets/registry: files and scripts for publishing core datasets to DataHub.io (and tracking that)
  • awesome-data: topics / collections items which are then published on datahub.io/collections

Plan would be to move datasets/registry to datasets/awesome-data and merge in @datopian/awesome-data (why this way round? more ⭐ on registry plus more contents)

@rufuspollock
In the acceptance criteria (item 2):

@datasets/core-datasets exists with the repo contents of current @datasets/registry and ⭐️ 39

There is no @datasets/registry currently existent and it does not have ⭐️ 39. So did you mean @datopian/awesome-data?

And thus the acceptance criteria item 2 becomes:

  • @datasets/core-datasets exists with the repo contents of current @datopian/awesome-data and ⭐️ 37

Analysis

repo                     exists   stars
datasets/awesome-data    No       191
datasets/registry        Yes      191
datasets/core-datasets   No       39
datopian/awesome-data    Yes      37
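
A table like this can be rebuilt at any point during the move with a small script. The sketch below assumes the public REST endpoint `GET /repos/{owner}/{repo}`, whose response includes `full_name` and `stargazers_count`. Since a renamed repo keeps answering under its old name via a redirect, the returned `full_name` is compared with the requested one. `summarize` and `fetch` are hypothetical helper names, and unauthenticated calls are rate limited.

```python
import json
import urllib.error
import urllib.request


def summarize(requested, status, payload):
    """Turn an API response into a (repo, exists, stars) row."""
    if status != 200 or payload is None:
        return (requested, "No", None)
    actual = payload.get("full_name", "")
    stars = payload.get("stargazers_count")
    if actual.lower() != requested.lower():
        # The old name still resolves, but only as a redirect to the new repo.
        return (requested, f"No (redirects to {actual})", stars)
    return (requested, "Yes", stars)


def fetch(requested):
    """Query the GitHub API for one repo; returns (status, parsed JSON or None)."""
    url = f"https://api.github.com/repos/{requested}"
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status, json.load(resp)
    except urllib.error.HTTPError as err:
        return err.code, None


# Offline example with a hand-written payload; use fetch() for live data.
print(summarize("datasets/registry", 200,
                {"full_name": "datasets/registry", "stargazers_count": 191}))
```

For a live check, something like `summarize(name, *fetch(name))` for each of the four repo names would produce the rows.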

Research: Move issues in bulk

Bulk-moving issues is not possible in GitHub itself. I found 2 tools (the first looks best):

  1. https://github.com/ahadik/git_mover

    Merging repositories. If you want to combine issues from multiple repositories into a single one, this tool does its best to handle name clashes where they matter. It'll even keep assignees on issues if that user is found on the source and destination repo.

    For more detailed info see: http://www.alexhadik.com/blog/2016/5/26/migrating-github-repositories-with-gitmover

  2. https://github.com/ahmadnassri/github-bulk-transfer

@rufuspollock For the move from @datopian/awesome-data to @datopian/core-datasets to @datasets/core-datasets, I selected the following Team Access rights (screenshot). Is that OK?

@rufuspollock Regarding my question above about acceptance criteria item 2: it must be so, so I've gone ahead and implemented the change. I will update the acceptance criteria as well.

Situation

In acceptance criteria it says:

It has all (open) issues in @datopian/awesome-data (without their labels)

In tasks it says:

Move all the open issues from @datasets/core-datasets to @datasets/awesome-data
Don't worry about labels for now as not systematically used we can add back "later"

Question

Would you like to have labels there, or do you specifically not want labels there?

Hypothesis

Yes, move labels. I think the tasks said to skip labels only because moving them was assumed to be a lot of work.

@svetozarstojkovic Do you know where to configure where datahub.io/collections pulls data from?

It should be reconfigured so that it now pulls from datasets/awesome-data instead of datopian/awesome-data (it redirects at the moment, so it still works, but we should probably change it).

Former /datopian/awesome-data/ issue list

This repo is now named /datasets/core-datasets/ and the issues have been moved to /datasets/awesome-data/

[Screenshots: datasets/core-datasets issue list, pages 1-3]

@glgoose this item is ticked

Contents of repo (files) are from @datopian/awesome-data

As is

@datasets/core-datasets exists with the repo contents of current @datopian/awesome-data

But this is not the case afaict. The idea was to force push over the current repo with the relevant contents. Have you done that?

@glgoose I've now fixed this by force pushing the backup clones of the 2 old repos (made specifically in case something like this went wrong!)

@svetozarstojkovic I've also fixed the code on the datahub.io frontend. Can you get this redeployed asap, as pages are 404'ing atm: https://datahub.io/collections/bibliographic-data

@rufuspollock
Done, please check it out.

FIXED. All done.