glide -> gps transition
sdboyer opened this issue · 16 comments
As discussed in #384, there's a number of things to consider about transitioning glide onto gps. I figured I'd open this issue as a place to start just enumerating them, and we can decide where/how to split them all out later.
I intend to edit and update this list as I go, though, so probably better to think of this as a meta-TODO list, where we can check things off as we talk through them and agree on something, implement something, or whatever other solution we come to.
(This list is unlikely to be exhaustive until everything's actually checked, as it's a running checklist that I'll add to as we discover more things)
- Cannot specify 'vcs' param; this is deduced, or doesn't work. At most, it becomes a deducer question
- subpkgs (in manifest) no longer serve a purpose and are going away. They may remain in the lock, though.
gps doesn't currently support that (sdboyer/gps#40), but adding support is pretty trivial (we have all the information). - gps makes hard assumptions about things being available upstream, else fail. (So, if something's available in vendor, but not accessible upstream, it's a failure, whereas arguably it could/should be satisfied by what's sitting in vendor now).
-
glide.yaml
key is changed: s/Import
/Dependencies
/. It might be even more semantically accurate if it wereDependencyConstraints
orConstraints
- It seems like glide's current
ignore
andexclude
inglide.yaml
could both just be folded into gps' ignore, but I'm not fully sure. - Selective arch/os support simply isn't present in gps, but the foundation's in place for a much more complete system than glide's current one. This'll be solved by sdboyer/gps#44
- Also solved by sdboyer/gps#44 is annotating in the solution/lock when a given dependency is only necessary for running tests (which glide currently does).
- There's currently no support for explicitly grabbing a pkg not named in the import graph (sdboyer/gps#42)
- This affects
glide get
, b/c it's kinda a no-op if it's not in the import graph - apart from updating the manifest. As described in gps issue, though, I think we could maybe safely defer that till later? - In the same vein, if a dependency is listed in
glide.yaml
, but is not present in the import graph, it will NOT show up in the result.
- This affects
- User-facing output will be significantly different, most notably in that gps doesn't report that it's doing any repo fetching or management as it does it. We could build a communication channel for it if it's crucial, but it just doesn't exist now.
@sdboyer going to start asking some questions as I dig through these. Here's a start:
- Is the detection in Masterminds/vcs the level of detection supported or is there more? If there's more can you give me a pointer. I ask because the
vcs
property is used regularly by some big Glide users. I ended up needing to use it this morning. - If the name is changing from s/import/dependencies what's the handling for backwards compatibility? There are thousands of
glide.yaml
files in existence. - Did you work it out so varying locations can point to the same package and this work cross projects? So project A could pull package C from its source and project B could pull package C from a fork and the resolution just works.
Is the detection in Masterminds/vcs the level of detection supported or is there more? If there's more can you give me a pointer. I ask because the vcs property is used regularly by some big Glide users. I ended up needing to use it this morning.
Yes, there's considerably more (Masterminds/vcs detection isn't used at all). It's wrapped up in a larger system, contained within the SourceManager
. There are several public-facing entry points, but they unify at SourceMgr.deducePathAndProcess
, and hten SourceMgr.deduceFromPath()
. The latter's internals will look more familiar to you; the former is basically just negotiating the futures returned from the latter.
There's a lot of logic here, all there for considered reasons, but not all necessarily germane to the question of a vcs
property. So, to focus this discussion, the key question that I haven't had an answer for is, "what use case is there for an explicit vcs
property that can't be satisfied by having the vcs extension literally written in the import path?"
If I have a clear view on what that use case is, then I can figure out where it's appropriate to incorporate that control.
If the name is changing from s/import/dependencies what's the handling for backwards compatibility? There are thousands of glide.yaml files in existence.
Yep, there's a checkbox for this on #384 right now, and it's my currently in-progress code on my local. Basically, my approach is to have structs for the new version of the manifest and lock files, and structs for the old version. ConfigFromYaml()
and LockfileFromYaml()
then attempt the new versions first, and if they don't work, try the legacy version, and (if possible) automatically convert them into the new version. A third return value is added, indicating whether the legacy autoconversion took place, so that the user can be notified. (messages are not printed directly from the function, as we don't want those funcs printing when autoconversions are done on yaml/lock files in deps - only the root project).
Did you work it out so varying locations can point to the same package and this work cross projects? So project A could pull package C from its source and project B could pull package C from a fork and the resolution just works.
There are three answers to this: "No", "Yes," and "I'd like to but it's hard. All are true, depending on the specific constraints.
The "Yes" is because it's possible for the root project to indicate an override, which unilaterally chooses the location from which to source C. It could be what A says, or what B says, or something else entirely. This is a workable solution that gives the user the control they need to proceed with their own work, but it's not a scalable or healthy solution, because:
- That override has to be restated by every root project that encounters the conflict
- Once that override is written into a manifest, it's likely to stay around for much longer than it may be needed - (say, once B goes back to being fine with using the canonical source of C)
The "No" refers to the general case: given that "one import path, one package" is (at least for now) a requirement of the system, if A and B indicate that C should be sourced from a different location/URL, that is, at base, an irreconcilable conflict. There may be ways we can squirrel around that, but that brings us to...
"I'd like to, but it's hard"; while there are a bunch of strategies we could use to make this work in some common cases, they involve doing some acrobatics, and I want to tease out the specific constraints of those cases more before investing the necessary time.
For example, in the use case you gave, I think the ideal solution would be that if we treated A's request for the canonical C as a 'default', which could later be superceded by B's request for sourcing C from a fork, then that might work well. However, with the way the solver is currently designed, depending on ordering, things could look different:
-
If, while solving, we encounter B first, then we'd know to use the fork of C - we pull a list of versions from it, perform other compatibility checks (e.g., are required pkgs present?) against it, etc. Then, when we encounter A, we can compare the source URL requirements, say, "oh, A wants the canonical, it's OK to supercede with a fork," and continue on.
-
If, however, we encounter A first, then solving proceeds with canonical C. When we encounter B, we now know that we want to supercede with the fork of C. The problem is that the solver has (potentially) made a bunch of checks based on canonical C already. Correct operation of the solver is built around the idea that we've made a series of checks/choices, all of which are acceptable at the current step. So, if we're going to swap in a fork, we need to make sure all those checks we've already made would still be valid with the fork. And, in order to support backtracking - which is basically how most sane modern solvers fundamentally work - that transition from canonical -> fork has to be reversible.
We're more or less talking about defining "equivalency" between sources, which has a lot of dimensions if we're going to keep it safe. Frankly, the easiest way to address this is simply to restart the entire solver, with a marker to always use the fork for C. That's not a crazy approach, necessarily, but it has its own costs.
So, yeah - we have at least a short-term solution, but more use case exploration and refinement is needed.
"what use case is there for an explicit vcs property that can't be satisfied by having the vcs extension literally written in the import path?"
How would you handle a location on the filesystem? People use those right now for a handful of things. I didn't consider the file://
based locations until others brought it up from their use.
How would you handle a location on the filesystem? People use those right now for a handful of things. I didn't consider the
file://
based locations until others brought it up from their use.
As in the mirroring idea implemented in #547? Or as in "there's code, but no vcs to manipulate, at file://path/to/source
- use that for import path x/y/z"?
My answers are quite different, depending on which one you're curious about.
@sdboyer I have a hard time thinking I could go to the Glide power users, such as @akutz, and tell them that in glide.yaml
files and mirror settings they could not use file://
paths. Something they can do, and some already do, today.
What would you suggest telling these folks that they would accept?
I have a hard time thinking I could go to the Glide power users, such as @akutz, and tell them that in glide.yaml files and mirror settings they could not use file:// paths. Something they can do, and some already do, today.
What would you suggest telling these folks that they would accept?
...??
I don't understand why you wrote this. The question I asked was, "are you referring to the mirrors feature you just shipped, or to this other thing?" You responded with, "mirrors are an important feature, how do I tell people they don't get them?" Could you please answer the question I ask, particularly when it's a straight up either/or? Or, if it's not clear what I'm talking about (which I know happens often, and I'm sorry), ask for a clarification?
In any case, I take it that you want to discuss mirrors, not the other thing.
I didn't say that mirrors are something gps can't/won't support. Not at all. I was only referring to that vcs
property. But, I'm guessing you jumped ahead to mirroring because, in glide's current approach, that vcs
prop is necessary to discover the vcs type of the local "mirror." Right? If there's a different or additional reason(s), please tell me.
@sdboyer I have some questions/comments...
- I've tried using the code on the PR to update Glide itself and Helm. In both cases it failed with a panic. The panic on helm was
panic: canary - *should* be impossible to have a pkg-only selection here
which came right after← backtrack: no more versions of gopkg.in/yaml.v2 to try
. - Does this have anything to do with lacking a default version when one is not specified? Right now Glide tracks the default branch for a repo. When no version is specified that is used. This is a cached piece of information.
- Can you share why cache key names use a different strategy than what Glide uses today? I'm curious of the reason(s).
- When I try to run
glide install
with a legacy lock file I see it failing to make the conversion it says it's trying...
[WARN] glide.yaml was in a legacy format. An attempt will be made to automatically update it.
[WARN] glide.lock was in a legacy format. An attempt will be made to automatically update it.
panic: interface conversion: gps.Revision is not gps.PairedVersion: missing method Underlying
I keep running into trouble when I go to try it. Can you give me some pointers to help get up to speed more quickly?
(I've pushed a commit that should fix this panic: panic: interface conversion: gps.Revision is not gps.PairedVersion: missing method Underlying
- glide is now updating for me)
It's indeed possible that the panic with Helm there may be tied to issues with gopkg.in - I'll have to investigate more later. I've just opened up sdboyer/gps#97, which is about pre-filtering acceptable versions based on what the import path looks like - effectively, respecting gopkg.in's semantics. As much as that's possible, anyway.
Can you share why cache key names use a different strategy than what Glide uses today? I'm curious of the reason(s).
I actually wasn't looking at glide when I wrote it. It's based on simple string replacement - https://github.com/sdboyer/gps/blob/master/source_manager.go#L17, which is run on the results of url.String()
.
as the TODO notes, i'm not the happiest with it. looking at output, though, it seems to look like packagist's (while glide's looks a bit more like npm's).
Does this have anything to do with lacking a default version when one is not specified? Right now Glide tracks the default branch for a repo. When no version is specified that is used. This is a cached piece of information.
So, there's a few pieces, here. First is the background context - whenever we talk about version selection, defaults, etc. in gps, we're ultimately talking about the relative order of versions inside a queue. The order of versions in that queue is the order in which we "try" a version. "Trying" means checking it against version constraints (among other things), so it's not a problem if the correct version to pick isn't first in the queue - only that it's the first in the queue that satisfies constraints.
The way these queues get populated is a little complicated, but in the general case, gps uses this sort ordering on the set of versions returned from interrogating the vcs. That means, absent any constraint information, semver will always be tried first, then plain tags, then branches.
Now, the tool is in full control of the constraints that are used, so glide could effectively alter that behavior to prefer branches: whenever returning a manifest (whether for the root or a dep), if glide sees there's no constraint declared, it could put in a gps.AnyBranch()
(not yet implemented, but pretty easy - sdboyer/gps#72). I'd also have to do sdboyer/gps#65 for that to work, but it's not terrible.
That said, I'd like to suggest that it might be better for glide to prefer selecting a semver tag if one is available, and if not, then taking the default branch. That seems to me to be the world we want to nudge towards. (For that, I'd need to do sdboyer/gps#65 and sdboyer/gps#98)
sdboyer/gps#65 and sdboyer/gps#98 are now complete. I've rolled v0.11.0 now, after pushing back a couple of the things that were in it to v0.12.0. The only thing that's of immediate relevance here that I bumped is the the pre-filtering of available versions from gopkg.in ticket, sdboyer/gps#97.
There's a couple API changes in v0.11.0 that I'll need to resync into #384 before it'll work. Nothing huge, though.
Is there anything that I could pick up to help with this effort? I don't have a huge amount of time to dedicate to it, but if there was something I've got a chance of writing a reasonable PR for I'd be up for it.
@silasdavis sorry, i've been very swamped this week - but i'm thrilled that you're interested in helping! i'll try to find a bite-size chunk for you over the weekend.
@sdboyer bump (though no rush - I'm not going anywhere)
@silasdavis thanks for the bump. sorry, my work in this general domain has been really focused towards the new official tooling, gps, and semver - between that and the holidays, it's been hard to find time for glide directly.
The most useful thing to start with would be building glide from the gps-integration
branch, trying some things you're used to doing with glide, and providing feedback on what does/doesn't work, and where you're surprised by the tool's new behavior. Ideally, just keep a running stream-of-consciousness diary of everything you do as you do it, and your thoughts when you notice/run into things.
@sdboyer perhaps you could use me on gps?
@silasdavis i most surely could :)
i'm not sure that i can promise any particularly low-hanging fruit issues, but here's a starting list:
- I've generally put a lot of effort into testing gps, and while there are some powerful, expressive test harnesses, the tests are still far from perfect. sdboyer/gps#135, sdboyer/gps#109, and sdboyer/gps#63 are all testing-oriented - and hey, looking at tests is a great way to grok a project 😄
- If you're comfortable writing custom depth-first search algorithms (nothing terribly fancy), sdboyer/gps#127 could use an eye
- The implementation of sdboyer/gps#113 is likely to be pretty straightforward, though because I don't have a settled API in mind already, you'd need to wear a bit of a design hat on and spend some grok time
- sdboyer/gps#125 will likely have a very simple implementation, and understanding the "why" would take you a long way towards grokking gps
- sdboyer/gps#99 is likely to be a fairly intensive refactor, but it's something we definitely need to do and there's a lot of code out there to draw from
- If you like graph algorithms, we need cycle detection in sdboyer/gps#66. Trick is, it should ideally be an incremental cycle detection algorithm, rather than just brute-forcing with something like Tarjan SCC every time