haskell/cabal

Adding support for revision pinning in dependency version

Kleidukos opened this issue · 22 comments

In order to make another step towards more reproducible build plans, @hdgarrood, @tfausak and I talked about the possibility of specifying version strings like:

== 2.0.0.0-1

where the -1 specifies a revision.

This raises some questions, such as "What should the behaviour be when the package repo doesn't have such a revision?", and I think it also calls into question the way we declare dependencies nowadays (but that's for another time).

cc @emilypi

Here's a link to the thread on Twitter: https://twitter.com/TechnoEmpress/status/1462903466788069379

Although Stack can currently pin revisions in stack.yaml files, I'm aware of at least a couple problems with this:

  • There's no standard syntax for adding revisions to version numbers. Is it 1.2.3.4-5 or 1.2.3.4-r5 or 1.2.3.4 (r5) or something else? I feel like 1.2.3.4-5 (that is, version-revision) feels the most natural.
  • The actual revision number isn't necessarily stable. Meaning some-pkg-1.2.3.4-5 isn't guaranteed to always be exactly the same. That's why Stack includes a hash as well by default.

That being said, I would appreciate the ability to do this.

Yeah, maybe being able to include a hash would be best.

I don't think this is the right thing to do. Specifying revision numbers in the dependencies is contrary to the spirit of revisions, which is to be able to fix up metadata issues without a new package upload.

Also, I don't think what is proposed here makes much sense: pinning dependencies to their exact versions makes a library mostly unusable, because it can only be used if all packages in the plan agree on the same exact versions (it might work OK if only one library does it, but if two libraries do the same, the chances that their requirements match drop quickly).

That said, I do care a lot about reproducible builds, and I think cabal-install today can already guarantee a decent level of reproducibility (using index-state, for example).

Is there a use-case where index-state and/or freeze files don't provide adequate reproducibility?
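For readers unfamiliar with the two mechanisms named above, here is a minimal sketch of what they look like in a project. The package names, versions, and timestamp are placeholders, not taken from this thread.

```cabal
-- cabal.project
packages: .

-- Ignore everything published to the index after this time,
-- including new versions and new revisions.
index-state: 2021-11-23T00:00:00Z

-- cabal.project.freeze (a separate file, normally written by `cabal freeze`)
-- some-pkg and other-pkg are placeholder names.
constraints: any.some-pkg ==1.2.3.4,
             any.other-pkg ==0.5.1
```

Note that the freeze constraints name versions only. Which revision of some-pkg-1.2.3.4 gets used still depends on the index state, which applies to the whole index at once.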

Specifying revision numbers in the dependencies is contrary to the spirit of revisions, which is to be able to fix up metadata issues without a new package upload.

I disagree.

I've had a production CI fail, because of a revision update years ago. It caused an incident. Someone made the bounds tighter, because on one of many platforms (that we didn't use), the configuration wouldn't build. We can argue whether it was a good or a bad update, but in general, bad revision updates are absolutely possible.

The index state is not a sufficient alternative, because it's across the entirety of the index and very unwieldy.

Managing to pick your index state in a way that it touches exactly a set of specific revisions is going to be impossible in the general case.

What we need is to be able to tell the cabal solver "here, take an older revision", because freeze files do not disable the solver.

@hasufell A --prefer-older that would take not the latest but penultimate version/revision?

@hasufell A --prefer-older that would take not the latest but penultimate version/revision?

I'm interested in a cabal freeze file workflow.

I might generally be interested in revision updates, but might also want to freeze bad apples or freeze all, because it's easier to experiment with versions when you are NOT freezing the index state.

@Kleidukos OK, I think I misunderstood where you want to pin the revisions. I thought it was in the cabal file! Now I believe you meant pinning them in a project freeze file.

@hasufell

The index state is not a sufficient alternative, because it's across the entirety of the index and very unwieldy.
What we need is to be able to tell the cabal solver "here, take an older revision", because freeze files do not disable the solver.
[...]
I'm interested in a cabal freeze file workflow.

I am very interested in this workflow too, although it's not quite clear to me yet how it would work and what changes it would require. Is there a better place to discuss this?

it's easier to experiment with versions when you are NOT freezing the index state.

Help me understand what you mean by this. You would like to keep the cabal files fixed while also allowing new versions? Is this what you mean?

Coming back to the original issue, I don't think it's feasible.

gbaz commented

I think this absolutely makes no sense for cabal files. However, the case hasufell presents for project files is potentially feasible, though it would take some care to implement well.

I'm interested in a cabal freeze file workflow.

Freeze all versions and enable allow-newer: true. Now you are invincible to revisions tightening version bounds.
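A minimal sketch of that suggestion, assuming placeholder package names and versions. The comment above writes `allow-newer: true`; the sketch uses the `all` spelling instead.

```cabal
-- cabal.project
packages: .

-- Disregard the upper bounds declared by every package in the plan,
-- so a revision that tightens an upper bound cannot invalidate it.
allow-newer: all

-- cabal.project.freeze (a separate file)
-- some-pkg and its-dep are placeholder names.
constraints: any.some-pkg ==1.2.3.4,
             any.its-dep ==2.0.0.0
```

`allow-newer` only relaxes upper bounds. A revision that raises a lower bound would need `allow-older` as well.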

I'm interested in a cabal freeze file workflow.

Freeze all versions and enable allow-newer: true. Now you are invincible to revisions tightening version bounds.

Sounds like that could actually lead to a build failure if the revision update introduces a new dependency that's not frozen yet?

Sounds like that could actually lead to a build failure if the revision update introduces a new dependency that's not frozen yet?

I do understand that you feel you need to protect yourself from other people's mistakes; in this case, from possible mistakes made by either the package maintainers or the Hackage trustees (the two entities who can make revisions). Nevertheless, I think you need to draw a line somewhere; any change in anything can potentially break your build. Revisions should never turn a working plan into a non-working plan (into a different working plan, yes).

I see index-state as a protection against unplanned changes which I will still have to adopt (and adapt to) at some point.
Freezing all versions (and perhaps editing those versions manually if needed) should be close to the freeze-file workflow you mentioned. If you see an issue with this please be specific.

I don't want to sound dismissive, though. I wish we could easily do things like lock a build plan and read it without passing through the solver, or unlock only parts of it and ask the solver to fill the gaps. I wish we had a better way to fetch information about packages than downloading a 100MB tarball, etc.; but there are a few steps to get there.

I see index-state as a protection against unplanned changes

As I explained, index state is not the same. It is very inflexible and mainly covers use cases where you don't fine-tune or manually bump your pinned deps.

If you've played with parts of the cardano ecosystem, you'll notice you generally have only two choices:

  1. use nix
  2. manage your project/freeze files properly

Index state is not a substitute.

Is this a feature common users will want to explore? Maybe not, but that doesn't matter.

If you've played with parts of the cardano ecosystem ...

I have been working in that ecosystem for more than a year. We are now doing packages, PVP and revisions. Most projects in IOG don't use freeze files but index-state, and control their dependencies simply by using version bounds.

I welcome you to reach out in private if you want to continue this discussion.

Index state is not and has never been a method to select revisions.

A bad revision would effectively stop you from bumping the index state.

Frankly, I don't care what revisions are supposed to achieve. They're something that can change, and as a user of Cabal I feel that I should have a way to select them. You might as well argue that we shouldn't need to specify patch version numbers since it should always be safe to take the latest. That's true in theory, but sometimes practice doesn't match theory.

I agree with @hasufell about index-state. It happens to freeze revisions, but it isn't really a method to select revisions. It's all-or-nothing for the entire index at once.

Could we use the same syntax as Stack? For example with smtlib-backends-0.3 for revision 1, either of these:

==0.3@sha256:69977f97a8db2c11e97bde92fff7e86e793c1fb23827b284bf89938ee463fbf0
==0.3@rev:1

If we introduce this feature (we'd probably need somebody that badly needs it to argue for it and then implement), the Stack notation looks like a good choice.

If we add revisions then == can really mean version equality.

Switching to a source-repository-package dependency can give us back an exact version equality (pin by hash) but this is more work. As far as revisions go, there's nothing that forces a package maintainer to commit the hackage revision change to source control, is there? A forced push or deleting the repository can destroy the commit hash but hackage versions and revisions are always there.

gbaz commented

I don't think that having revisions and sha hashes in the same syntax makes much sense, precisely because the latter only makes sense with source-repository dependencies.

It does look like revisions can be specified with a hash, from smtlib-backends-0.3/revisions:

[image: revisions listing for smtlib-backends-0.3]

AFAIK hackage cannot look up cabal files by their hash (yet!), only https://casa.stackage.org/ can do that.

I still strongly believe specifying revisions in build dependencies does not make any sense. Packages are source distributions, reproducibility is a property of the build process, which is controlled by the project configuration.

In the context of a cabal project, index-state might not offer the flexibility that some comments here demand. Nevertheless, the effects of any broken revision can be fixed with a combination of allow-newer/allow-older and constraints (but there are cases where these are not quite enough or are too complex to figure out).

If we are talking about project configuration, specifying a specific revision directly could make sense, and I can come to a compromise. I am thinking that, behind the XY problem, we face a similar issue: bad revisions can break build plans just as much as they are needed to fix build plans. So the issue I see is fixing build plans.

The corner cases I mention above (sorry, I don't have a link right now) were leading us to consider "conditional constraints" (i.e. constraints that apply only if a specific package-id is chosen). This is effectively the same as changing the version ranges in the build-depends for that package name and version, which is also roughly the same as what revisions can do.

tl;dr:

  1. specific cabal file revision/hash in package description build-depends, no way
  2. specific cabal file revision/hash somewhere in cabal.project, maybe
  3. something like constraints: smtlib-backends-0.3:override:base>=4.14&&<4.19 can be useful to fix broken build plans, so 👍. Note that allow-newer: smtlib-backends-0.3:base would have the same effect in cases like this.
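To make the three points concrete, here is a sketch of how options 2 and 3 might look in a cabal.project. Only the `allow-newer` line is syntax cabal accepts today. The commented-out lines are the notations floated in this thread; cabal does not parse them, and placing them under `constraints:` is an assumption made purely for illustration.

```cabal
-- cabal.project

-- Accepted today: ignore the upper bound that smtlib-backends-0.3
-- places on base.
allow-newer: smtlib-backends-0.3:base

-- NOT accepted by cabal; proposals from this thread, for comparison only.
-- Option 2, pinning a revision with the Stack-style notation:
--   constraints: smtlib-backends ==0.3@rev:1
-- Option 3, overriding one dependency range of one package version:
--   constraints: smtlib-backends-0.3:override:base>=4.14&&<4.19
```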