CLD2Owners/cld2

Clean up old data sets as was done for Chrome

andrewhayden opened this issue · 0 comments

We should consider cleaning up the old data sets that are kicking around in the main tree, like we did for Chrome a while back in commit b2c2d34. Old data sets should, in my opinion, be stored in one of two ways:

  1. In subdirectories whose names indicate versioning (e.g., for temporal versioning YYYYMMDD)
  2. In git history (i.e., only the most recent data set is maintained and kept compatible)

I don't see a strong reason to keep the old data sets around. Folks who want to use the old data sets should be welcome to do so by checking out an older revision. Visibility is an issue, though, so for now we could create branches for the old data sets that exist today, simply pruning the irrelevant data sets in each. After those are done, we can develop code fixes on master and cherry-pick them into the branches as needed.

I'd suggest branches:
legacy_release_0122 (delete any data files not suffixed with 0122)
legacy_release_0527 (delete any data files not suffixed with 0527)
legacy_release_0720 (delete any data files not suffixed with 0720)
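
To make the pruning rule concrete, here is a rough sketch of how the per-branch delete list could be computed. It is not a tested migration script. It assumes a "data file" is any file whose name (minus extension) ends in one of the three release suffixes above, so unsuffixed files are left alone; the function names and the example file names are hypothetical, not taken from the tree.

```python
from pathlib import PurePosixPath

# Release suffixes named in the branch list above.
LEGACY_SUFFIXES = ("0122", "0527", "0720")


def branch_name(suffix):
    """Branch name for a legacy release, following the naming proposed above."""
    return f"legacy_release_{suffix}"


def files_to_prune(filenames, keep_suffix, all_suffixes=LEGACY_SUFFIXES):
    """Return the files that carry a release suffix other than keep_suffix.

    Assumption: files without any known release suffix are not data files
    for the purposes of this sketch and are never selected for deletion.
    """
    if keep_suffix not in all_suffixes:
        raise ValueError(f"unknown release suffix: {keep_suffix}")
    other_suffixes = tuple(s for s in all_suffixes if s != keep_suffix)
    # str.endswith accepts a tuple and matches if any candidate matches.
    return [
        name
        for name in filenames
        if PurePosixPath(name).stem.endswith(other_suffixes)
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    example_tree = [
        "internal/data_quad0122.cc",
        "internal/data_quad0527.cc",
        "internal/data_quad0720.cc",
        "internal/some_code.cc",
    ]
    for suffix in LEGACY_SUFFIXES:
        print(branch_name(suffix), files_to_prune(example_tree, suffix))
```

The output for each branch would be the list to feed to `git rm` after creating that branch. Whether unsuffixed data files should also be pruned is left open here.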

Chromium doesn't need or care about the old data sets, which is why the version suffixes on all the "chrome" files were already deleted. It's not worth putting them back in for the old releases.

The current release is from 2014-10-15, so we would also have a release branch called release_20141015 and we would maintain (at least) this branch with cherry-picks from master.

WDYT?