Migrating a library Godeps.json with transitive deps

Question

Migrating a library Godeps.json with transitive deps

sttts opened this issue 8 years ago · 25 comments

In the process of adding dep support to k8s.io/client-go and friends I hit the following problem:

k8s.io/client-go depends on many packages that do not support golang/dep (= do not ship a Gopkg.toml). Some of them are not direct dependencies of k8s.io/client-go itself, i.e. client-go's Gopkg.toml cannot use constraints. At the same time, client-go is not the root package, but users depend on it using golang/dep. Hence, client-go's overrides have no effect either.

In addition, client-go's dependencies are complex enough that we want to avoid putting the burden on the users to declare dependencies themselves.

How can we provide those dependencies to the users while the packages we depend on do not support golang/dep?
Even if all dependencies support golang/dep some day, I claim that there will be situations when we cannot trust them or just have to override them for technical reasons. Imagine client-go depends on foo v1 and foo depends on bar v2. Now bar v2.1.2 is released, but – while source code compatible – brakes a feature of foo. In client-go we would have no way to restrict to <v2.1.2.

While (1) suggests that we might want to follow overrides in dependencies until the direct dependencies ship Gopkg.toml, (2) suggests that the problem won't go away in the future with full golang/dep support throughout all packages.

Answer 1 · 2017-09-06T11:15:05.000Z

This seems to be a blocker for any further work towards supporting golang/dep in k8s.io/client-go. Ideas how to work around it, are very welcome.

/cc @carolynvs @sdboyer

Answer 2 · 2017-09-06T15:17:15.000Z

I wonder whether we need something like primary and secondary overrides:

the user defines primary overrides for his/her root project,
imports may define secondary overrides.

Primary always trump over secondary ones. If secondary overrides conflict, the resolution is failed and the root project has to deconflict by defining primary overrides.

In other words, we have only two layers, not a complete hierarchy.

Answer 3 · 2017-09-07T12:17:11.000Z

I think I just hit this vendoring k8s.io/client-go and apimachinery. I got the correct level of client-go (4.0) and the wrong level of apimachinery (master) that it depends on. The write up in the description here explains the problem well overall.

@lavalamp this is going to keep dep from being a reasonable contender for client-go for a while.

Answer 4 · 2017-09-07T12:57:24.000Z

@deads2k if we add a native Gopkg.toml we can actually restrict the direct imports of client-go correctly (using a revision constraint). The issue here is one step further: there are (transitive) imports from our dependencies which we don't control:

E.g. client-go -> apimachinery -> docker/docker -> docker/runc. We control the first two and can create a Gopkg.toml for each. Then everything up to docker/docker is specified. But we cannot specify docker/runc, but have to trust Docker adding Gopkg.toml for their projects. If they do that one day, great.

But if we find an issue in docker/runc we cannot override that version either, but depend again on Docker doing this. If the issue is not directly relevant for docker/docker, but for our usecase of it, we are screwed as Docker will refuse (for good reason) to follow our special requirement.

Answer 5 · 2017-09-07T14:37:02.000Z

oooook, buncha things here!

In addition, client-go's dependencies are complex enough that we want to avoid putting the burden on the users to declare dependencies themselves.

this is the second time this has come up as a significant point of frustration, with little feasible recourse - shurcooL/github_flavored_markdown#12 /cc @shurcooL @rtfb. i also believe that complexity-gatekeeping in this way is an important value that projects can and should be able to provide, at least in some way. so yeah, i'm bumping this up my priority list of things that really need fixing.

I wonder whether we need something like primary and secondary overrides:

i think the property you're really hunting for here is the transitive power that we give to overrides, but not constraints. this is something we could do - we could even extend the power to constraints. or have an extra value within a [[constraint]] that allows it to say, "this should apply transitively." but at least one problem is always going to be that...

hmm. actually. i've not thought this all the way through and am just spitballing, but if we make the transitivity flag explicit, then it could effectively mitigate the harms here without many of the unintended side effects described in https://gist.github.com/sdboyer/b0813bf2b9dba58a335a85092085472f. those side effects occur if we make constraints always transitive, because we can't distinguish between old, decrepit constraints and ones that are actually meaningful, which could end up creating an ecosystem with lots of "ghost" conflicts - conflicts between old constraints declared by projects that they never cleaned up. maybe those constraints are on a project that doesn't even end up in the depgraph. (i realize that sounds extra-absurd, but it could be damaging to solving efficiency if we try to "defer" mutual constraint agreement checks until a package actually shows up. not sure.)

but if a project can specify a flag that a constraint is transitive...yeah, this needs more whiteboarding, etc., but unless i'm missing something, this could work well.

this, though:

Primary always trump over secondary ones. If secondary overrides conflict, the resolution is failed and the root project has to deconflict by defining primary overrides.

at first glance, this basically just seems like an arms race to me: "they made a constraint, so we fired back with a secondary override, and now y'all need a primary override to settle the dispute." i don't think that direction is sustainable.

in any case, right now:

How can we provide those dependencies to the users while the packages we depend on do not support golang/dep?

the simplest thing you can do, for now, is to hoist up these packages as your own direct dependencies, so that you can constrain them as normal. ordinarily that means a required field in Gopkg.toml, but required is a power only afforded to the root project, so if your goal is to make things better for your dependers, then you'll need to make a dummy import somewhere:

package whatever

import _ "github.com/docker/runc"

the limitation there is that your constraint will only be activated if your dependers import the package in which that declaration exists. that may or may not be obnoxious in the k8s.io/client-go case.

Answer 6 · 2017-09-07T14:59:40.000Z

but required is a power only afforded to the root project

I was excited about the described workaround until I read this sentence.

For the other idea with the imports: we have many packages in client-go (and apimachinery, and apiserver, ...) and we would have to add those artifical dummy imports everywhere. While technically feasible, this also means you get the whole dependency chain for the smallest program (go build times will shoot to the sky I fear). On the other hand, I am happy that dep ignores constraints for packages in not-imported packages.

Answer 7 · 2017-09-07T15:01:25.000Z

i think the property you're really hunting for here is the transitive power that we give to overrides, but not constraints. this is something we could do - we could even extend the power to constraints. or have an extra value within a [[constraint]] that allows it to say, "this should apply transitively."

Yes, this is what we look for.

I wonder what you have in mind with this flagging of transitive constraints? This sounds like my primary-vs-secondary idea that you want to change the logic for them in case of conflicts?

Answer 8 · 2017-09-07T15:05:31.000Z

One thought, not to forget: transitive constraints inherited from a dependency on packages that your root package does not import, must still be ignored. Example: docker/runc above if my root package does not depend on docker/docker.

In other words, this behaviour must be different to overrides (which are unconditional by definition). So I agree, what we need something which is more like a transitive constraint than an override that is inherited from a dependency.

Answer 9 · 2017-09-07T17:38:17.000Z

@deads2k I filed kubernetes/apimachinery#23

Answer 10 · 2017-09-18T13:18:05.000Z

Maybe I am missing something, but wouldn't [[constraint]] + [[override]] work even if your project doesn't use that dependency directly?

[[override]] pulls right dependencies when you are building project
[[constraint]] make any parent project to choose transitive dependencies within your constraint

EDIT: should have tried before replying :) [[constraint]] have no effect if your project doesn't use it. why is that?

Answer 11 · 2017-09-18T18:31:21.000Z

[[constraint]] has no effect if your project doesn't use it. why is that?

Two options:

Use [required]=["github.com/foo"] but you can't specify the version or anything.
Use a dummy import, e.g. imports _ "github.com/foo" and then your constraint will work.

Answer 12 · 2017-09-18T18:39:39.000Z

What is downside of respecting [constraint] of transient dependencies without using import _ hack? Would you accept patch?

BTW, import trick doesn't work reliably, seems that you'd need to add it to every subpackage, which can possibly imported by parent project,

Answer 13 · 2017-09-18T18:41:48.000Z

@redbaron Care to open a new issue? These questions are about how to use dep, and aren't related to the original post.

Answer 14 · 2017-09-18T19:07:09.000Z

@redbaron the downsides of simply making [[constraint]] transitive are discussed and described extensively in this issue, including links off to further documents.

this is now the third issue where you've dropped in with questions or overly simple suggestions that are already addressed/eliminated upthread. all that does is make work for maintainers to reiterate information that was already readily available. you've gone so far as finding a relevant issue - please, do us the courtesy of reading and absorbing the information at hand before commenting.

Answer 15 · 2017-09-20T18:13:37.000Z

@carolynvs:

Use [required]=["github.com/foo"] but you can't specify the version or anything.

This elevates the dependency to a direct dependency, and therefore [[constraint]]'s should be respected. Or am I missing something? A bit ugly, but it should work.

Answer 16 · 2017-09-22T04:22:32.000Z

ok, getting around to a more thoughtful response now.

but required is a power only afforded to the root project

I was excited about the described workaround until I read this sentence.

yknow, it doesn't actually HAVE to be root-only. requireds are almost functionally indistinguishable from imports. almost, because people can use required to pull in main packages...which is probably a pretty good reason to keep them root-only.

but i think that's actually a red herring here, anyway.

I wonder what you have in mind with this flagging of transitive constraints? This sounds like my primary-vs-secondary idea that you want to change the logic for them in case of conflicts?

in terms of the user-facing bits? i was just imagining this, in Gopkg.toml:

[[constraint]]
name = "github.com/shrimply/pibbles"
version = "^1.0.0"
transitive = true

One thought, not to forget: transitive constraints inherited from a dependency on packages that your root package does not import, must still be ignored. Example: docker/runc above if my root package does not depend on docker/docker.

In other words, this behaviour must be different to overrides (which are unconditional by definition). So I agree, what we need something which is more like a transitive constraint than an override that is inherited from a dependency.

this isn't impossible, but it's potentially quite costly. i'm gonna go a ways into the details of the solving algorithm to explain, as i've been meaning to think these things through for a while, and here's as good a place as any to write them down.

so: there are several critical issues, all of which ultimately stem from the order in which we explore the graph. if we used something more conventional, e.g. depth-first search, then a convenient side effect would be that the visit stack itself would provide the scoping we need to tell if a transitive constraint is in effect. (tbh though, i'm not actually sure the algorithmic complexity costs would be any better - just, conceptually simpler)

but, an unguided search like BFS or DFS is highly inefficient (having a much higher likelihood of costly backtracking) when compared to a visit order that takes advantage of what we know about the domain. unfortunately, using this more complex order means that the visit trail no longer tells us anything about reachability. and, because we've not had any need of transitive reachability calculations thus far within the algorithm, we don't keep that information on hand in a terribly efficient form.

so, here's the first problem: the naive approach would add a check at each step in the algorithm that does a naive e.g. DFS on the backlinks we keep (the map in that previous link) in order to determine if a [[constraint]] marked transitive = true should actually be applied. that translates to coefficient on the n of an O(2^n) algorithm that is proportional to n.

the second problem is actually a generalization that entirely subsumes the first, and it's much nastier: say A declares a transitive constraint on D, and the only connecting path is A -> B -> C -> D. only once we have visited all four of those can we establish that D is reachable from A -
but it is quite conceivable that visit order would be A, D, C, B, meaning that the moment where we can first know that A's transitive constraint on D needs to come into force is when we connect B and C. we now have a failure mode that requires a whole graph search to detect; currently, we only have to check “local” information about the project we’re evaluating - the imports and constraints it introduces.

this means that as long as there are any known transitive constraints where reachability has not yet been established, then every time we select any new node, we must perform a search to determine whether that new link establishes reachability. maybe we have to do DFS for each of these, but it seems like an online connected components algorithm might be better, as we could keep it up to date at each solving step, then query it as needed. the best i've found in my brief literature searches for that are polylogarithmic for connectivity queries, updates, and deletions.

those algorithms are fully dynamic (support arbitrary add/remove of edges) though, which we may be more than is necessary for our purposes. we need edge addition, of course, to represent that we've selected a new project at a particular version. but we don't actually need fully dynamic edge removal - we just need to be able to "undo" an edge addition. that suggests we may want a persistent datastructure, as they're brilliant at undos - just keep a stack of pointers to previous versions, pop them off when you backtrack, and let GC take care of reaping unused segments. classic FP.

so, if we only need edge addition, then that's "incremental connectivity". union-find is sufficient for that, and its optimum is nearly linear (in time and space, both average and worst), and it seems there are persistent union-find algorithms which claim equivalent complexity costs.

(note that this algorithm would also probably ideally be what we'd want to use for #439)

however, even assuming that we have a working algorithm in hand, there's another cost. say that we're assessing dependency X to see if it's satisfiable. if it's not, the set of projects "involved" in the failure will look like one of the following:

{X, Y, ...} - X, Y, and maybe a bunch of others, disagree on something - e.g., Y imports a package from X, but X doesn't have that package, or X and Y have disjoint constraints on Z
{Y, Z, ...} - the addition of X reveals a disagreement between Y, Z, and maybe a bunch of others. so, the case this issue is concerned with

very few of our current failures are of the latter form. in fact, the solver doesn't even really support the latter properly; the assumption is that if checks failed while visiting X, then X is part of the problem, rather than just being the messenger. i'm sure that's fixable, but it'll be a bit of complex rabbit hole on its own.

the prospect of adding the above capabilities does excite me, even if only from an "OOOOH ALGORITHMS 🎉 🦄 🌈 " perspective. but, given all of the complexity involved in upholding a strict definition of transitivity, it's really worth considering whether it might be worth it to just treat them globally in the same way we do overrides today. i think the chances of problematic conflicts here are very low, as it would require:

Some project, A, to declare a transitive constraint on C, which is brought in by B
B to stop relying on C
A to not notice (or not be updated) that B no longer pulls in on C, and leave its transitive constraint around
D to import A, and also C
A's constraint on C to just not be workable for D

it's those second and third steps that seem most unlikely to me. and we can even warn A's author about having an ineffectual constraint in the same vein as #302 - just, after a whole solution is computed. also, doing it this way admits strictly fewer possible combinations than the more nuanced, truly transitive approach, and it's always much easier to ease up later than it is to ratchet down, as we can know for certain that we won't be invalidating anyone's current solutions with the rule change.

last note - if we do get around to the more complex implementation, i might want to experiment doing it with multiple solver passes. e.g., let the first pass run with only less expensive checks that we do today, but if it fails, inspect the solution for any phantom conflicts arising from treating these constraints as global instead of truly transitive. if so, automatically re-solve, but with the heavier-duty checks.

Answer 17 · 2017-10-02T00:04:34.000Z

@sttts if i do up an experimental branch where we add support for the simplified transitive-as-global approach, are you able to invest some time in exploring its utility for your use cases?

Answer 18 · 2017-10-02T07:17:52.000Z

I don't understand why transitive is a property Go developers should have to care about.

In my opinion dependencies are a set, not a graph. Figuring out the allowable versions in that set may require a graph, but from the final product; which revisions of the source code are placed in vendor/, this is a set. Why should a package being directly imported by code in my project be different to a package who's types are returned from code that I directly imported from my project?

As I said in #1231, having [[constraint]] flip between working and non working states simply because the package is directly referenced via an import statement in my project, is very hard to debug.

Answer 19 · 2017-10-18T09:01:58.000Z

if i do up an experimental branch where we add support for the simplified transitive-as-global approach, are you able to invest some time in exploring its utility for your use cases?

@sdboyer finally finding time again to work on this. Yes, such a branch would be great and I am happy to prototype the client-go use-case for that feature to validate the idea.

Answer 20 · 2017-11-15T03:22:55.000Z

just wanted to update this to say that i'm still chewing on this, a lot. i'd written a whole big response, then kinda chucked it. even a simplistic, all-global implementation for experimentation purposes requires some nontrivial refactoring.

so, very much not forgotten. quite the opposite. just, a lot to mull over.

Answer 21 · 2018-04-17T10:19:31.000Z

Hi guys,

I struggled for several times to make this work and now I would like to share the solution that works for me:
Use this guide, but beside of this you have to put in your Gopkg.toml an [[override]] stanza with the k8s.io/api wanted branch/version.

Over here you can see my client-go related dep config:

...
[[constraint]]
  name = "k8s.io/apimachinery"
  branch = "release-1.8"

[[constraint]]
  name = "k8s.io/client-go"
  version = "5.0.1"

[[override]]
  name = "k8s.io/api"
  branch = "release-1.8"
...

Answer 22 · 2018-04-17T10:30:30.000Z

@andreic92 unfortately this is not enough. It misses all transitive dependencies. Moreover, you mix branch heads and the tagged client-go. Use branches only or tags only (e.g. via the kubernetes-1.8.x tags, which exist for all repos).

Answer 23 · 2018-04-17T10:49:15.000Z

@sttts Thanks for the observation regarding the mix between branch heads and tagged entries, I fixed it.

For a demo project, this is my full Gopkg.toml:

[[constraint]]
  name = "k8s.io/apimachinery"
  version = "kubernetes-1.8.1"

[[constraint]]
  name = "k8s.io/client-go"
  version = "5.0.1"

[[override]]
  name = "k8s.io/api"
  version = "kubernetes-1.8.1"

Also see tree -L 2 vendor output:

.
├── github.com
│   ├── PuerkitoBio
│   ├── davecgh
│   ├── emicklei
│   ├── ghodss
│   ├── go-openapi
│   ├── gogo
│   ├── golang
│   ├── google
│   ├── googleapis
│   ├── gregjones
│   ├── howeyc
│   ├── imdario
│   ├── json-iterator
│   ├── juju
│   ├── mailru
│   ├── modern-go
│   ├── petar
│   ├── peterbourgon
│   └── spf13
├── golang.org
│   └── x
├── gopkg.in
│   ├── inf.v0
│   └── yaml.v2
└── k8s.io
    ├── api
    ├── apimachinery
    ├── client-go
    └── kube-openapi

As can be seen, the transitive dependencies seems to be in there, but the problem that is visible in Gopkg.lock is that the revision attribute for the transitive dependencies are not the same with the ones required by client-go@v5.0.1.

I guess that everything is working for me because there are no breaking changes for the fetched dependencies, maybe.

Maybe, adding [[override]] stanzas for the mismatched dependencies would work?

Answer 24 · 2018-04-17T11:48:11.000Z

I guess that everything is working for me because there are no breaking changes for the fetched dependencies, maybe.

Yes, it's working reasonably well if you have a client-go that is not too old. But we had problems before that transitive dependencies suddenly broke our code, and people newly vendoring client-go were running into issues. It's an unfortunate situtation, but we cannot do much about it.

Answer 25 · 2018-04-17T18:54:02.000Z

I have an open PR to add transitive constraints to dep, the maintainers are all committed to getting that feature in soon, along with preferred versions. So hopefully we can close this gap in dep but yes for now there's no way for the k8's client to protect people from this. 😢