andk/pause

When a release is successfully indexed, remove all entries relating to previous releases of the same dist

Opened this issue · 4 comments

neilb commented

If you (a) rename some packages in your dist, or (b) drop some packages from your dist, then when you do a new release the old package names stay in the index until you delete the older releases from your author directory.
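The behaviour described above can be sketched with a toy model. This is only an illustration: it assumes the index is a simple mapping from package name to the release file that provides it, and the `index_release` helper, the `AUTHOR` directory, and the `Foo::Bar` names are all hypothetical. It is not PAUSE's actual schema or indexer.

```python
# Toy model of a package index: package name -> release file that provides it.
# Illustrative assumption only; not PAUSE's real data model.

def index_release(index, release_file, packages):
    """Point every package in the new release at that release.

    Entries for packages the release does not contain are left untouched,
    which is how stale names survive in this model.
    """
    for package in packages:
        index[package] = release_file
    return index


index = {}
index_release(index, "AUTHOR/Foo-Bar-1.00.tar.gz", ["Foo::Bar", "Foo::Bar::Util"])

# Release 2.00 renames Foo::Bar::Util to Foo::Bar::Tools.
index_release(index, "AUTHOR/Foo-Bar-2.00.tar.gz", ["Foo::Bar", "Foo::Bar::Tools"])

# Anything still pointing at the 1.00 tarball is a leftover entry.
stale = {pkg: rel for pkg, rel in index.items() if rel.endswith("Foo-Bar-1.00.tar.gz")}
print(stale)
```

In this sketch `Foo::Bar::Util` keeps pointing at the 1.00 tarball after 2.00 is indexed, so a request for that package name would still resolve to the old release.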

When a release is successfully indexed, I think PAUSE should delete any references to older releases. There are at least two benefits:

  1. This prevents someone from accidentally installing an older release, possibly overwriting a more recent release
  2. It removes cruft from the index

It's the second benefit which is motivating me to submit this. Over the years, references to old releases have been the cause of various issues when I've been tidying things up. A number of times when I've spotted this sort of issue, I've contacted the author and asked them to delete the old releases, offering to do it on their behalf. I don't remember anyone resisting this.

I've raised this at least once, possibly twice, at past PTSes, in discussion with Andreas. He wasn't keen on it, but I can't see what the counterargument for keeping them in the index is. If there is a valid one, then at least it will get documented on this issue. I had a look through closed issues to check there wasn't an existing one for this, but if there is, I couldn't find it.

rjbs commented

Andreas and I discussed this briefly, as I recall, in the context specifically of older perl uploads, and he mentioned the value of the history. It was a small topic, so I didn't think about it much.

I agree: I would rather that replacing a dist replaces the previous version entirely.

@andk Do you have a position on this, other than the one I remember only poorly?

neilb commented

By coincidence, I was thinking about this again on the walk to work this morning. I was thinking that maybe there's an argument about historical releases, but given that many authors fairly promptly delete old releases, there's very patchy coverage of old releases. The index doesn't reference BackPAN, so it really only guarantees (as much as anything is guaranteed) entries for the most recent stable release.

In summary, (a) I can't see a positive case for keeping patchy coverage of old releases in the index, (b) I can't remember a scenario where having an old release in the index was beneficial, but (c) it has caused confusion and clashes multiple times in the past, which I've had to resolve.

My concern would be that this would result in some behavior changes that are not clearly beneficial. For example, if a new release no longer includes a package name, that package will become unindexed, rather than installable via the older release. As you say, installing older releases is fraught, but it's not clear to me that it's correct to remove that option automatically.

Additionally, this suggested behavior is quite dependent on viewing distributions as a cohesive unit; in the current data model, it's impossible to determine for certain what "previous releases of the same distribution" are, because PAUSE does not have a concept of a persistent "distribution". Consider a release consisting of packages that have previously been indexed in two or more other distributions, as one of many examples of where this model does not work in practice. Or more concretely: say a package from the perl distribution is made dual-life, and thus gets its own indexed release on CPAN. This should certainly not cause perl to be unindexed.
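The dual-life case can be made concrete with a toy sketch. It assumes, purely for illustration, that the index is a mapping from package name to release file and that "previous release of the same dist" is inferred from package overlap. The `purge_by_overlap` helper and the `Dual::Life` package are hypothetical, and nothing here reflects PAUSE's real code.

```python
# Toy model only: package name -> release file. Names are hypothetical.

def purge_by_overlap(index, release_file, packages):
    """Naive purge: treat any release that previously provided one of the
    new release's packages as a 'previous release of the same dist', then
    drop every entry still pointing at it."""
    previous = {index[p] for p in packages if p in index} - {release_file}
    for package in packages:
        index[package] = release_file
    removed = sorted(p for p, rel in index.items() if rel in previous)
    for package in removed:
        del index[package]
    return removed


perl = "AUTHOR/perl-5.40.0.tar.gz"
index = {"strict": perl, "warnings": perl, "Dual::Life": perl}

# Dual::Life gets its own release; the naive rule sees perl as its "old release".
removed = purge_by_overlap(index, "AUTHOR/Dual-Life-1.00.tar.gz", ["Dual::Life"])
print(removed)
print(index)
```

Under this overlap rule, indexing the standalone `Dual::Life` release removes `strict` and `warnings` from the toy index, which is the "perl gets unindexed" outcome. Matching on a name parsed from the filename would avoid this particular case, but it relies on a naming convention rather than on anything the data model records.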

ap commented

> PAUSE does not have a concept of a persistent "distribution".

And I want here to submit for consideration that PAUSE does not because neither does Perl.