🦁 Yarn 2.0 Working Group

Question

arcanis opened this issue 6 years ago · 45 comments

Alright @yarnpkg/core (and community!). Let's discuss what we want for the 2.0 release 🙂

Ideas, in no particular order:

Drop support for Node 4 (#5736)
Revamp the command line implementation (switch from Commander to Yargs, remove the complicated custom logic as much as possible)
Harmonize the command line (an example in mind is the --pattern option, which sometimes is an option like for yarn list, sometimes a -P,--pattern option like for yarn upgrade, sometimes a --scope option that only takes the scope like for yarn upgrade-interactive, and sometimes an argument that doesn't glob like for yarn outdated)
Change the lockfile syntax to a YAML-like file format (YAML-like because we would just slightly tweak our parser to emit something YAML-compatible, but we wouldn't support any advanced YAML feature)
Remove the code that deals with multiple registries (yarn / npm, cf src/registries). The yarn registry being a mere mirror of the npm registry, this doesn't make sense.

Anything else you think would be important to ship during this version? Especially in terms of breaking changes requiring a semver-major bump?

Answer 1 · 2018-05-03T09:05:23.000Z

About the YAML lockfile - I think it's great. Maybe we can do some revamping of the format while we're at it? E.g. drop shasum in favor of integrity?
Actually, maybe we can implement this: https://github.com/yarnpkg/rfcs/blob/master/accepted/0000-registry-url-in-lock-file.md
What does everyone think?

Answer 2 · 2018-05-03T09:07:27.000Z

As you can imagine, I like the CLI stuff, that is ripe for a large cleanup for sure. The lockfile change I'm less excited about because I don't really see the clear value that it would bring. Can you explain that a bit more?

What other large features should we consider for a Yarn 2.0 release that we haven't thought of yet?

Answer 3 · 2018-05-03T10:49:32.000Z

Can you explain that a bit more?

It's a bit clunky in terms of interop to have our own file format. Sure we have the @yarnpkg/lockfile package to parse/write it, but I feel like it would be better if we didn't have to reinvent the wheel when there are fine solutions outside.

Iirc the initial reasoning behind the custom lockfile format was that JSON isn't convenient to read for humans, and that YAML parsers are too big. I kinda agree with both but:

If size is a concern, then JSON.parse is native and would remove the parser from the equation (750 lines ... we would still keep the conflict resolver so let's say 700). It depends on how much of a concern it is versus readability.
Not using a YAML parser doesn't mean we can't have a YAML-compatible syntax. We're almost compatible, except for a few quotes and colons here and there. It makes me sad to be so close but so far 🙂
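To make the "few quotes and colons" gap concrete, here is a minimal TypeScript sketch. It assumes the current format writes properties as `key value` (for example `version "1.3.0"`) while a YAML-compatible format would need `key: value`. `toYamlLikeLine` is a hypothetical helper, not part of @yarnpkg/lockfile, and it deliberately ignores the harder cases (entry headers that would need extra quoting, for instance).

```typescript
// Hypothetical helper: rewrites one `key value` property line as `key: value`.
// Blank lines, comments, and lines that already end with ":" are returned unchanged.
function toYamlLikeLine(line: string): string {
  const trimmed = line.trim();
  if (trimmed === "" || trimmed.startsWith("#") || trimmed.endsWith(":")) {
    return line;
  }
  // indentation, key (no spaces), one space, then the rest of the line as the value
  const match = /^(\s*)(\S+) (.+)$/.exec(line);
  if (match === null) {
    return line;
  }
  const [, indent, key, value] = match;
  return `${indent}${key}: ${value}`;
}

console.log(toYamlLikeLine('  version "1.3.0"'));
```

Only the property lines change here; whether entry headers and nested blocks need more work is exactly the part this sketch leaves open.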

The only reason for me not to change how things currently work is that the lockfile file format is a core component, so we would have to make sure that the 2.0 would be able to convert the old format into the new format, whether it's JSON or YAML. Then we could remove this logic starting from 3.0 or 4.0. It's a bit of work, so maybe it's not worth it, but I still think it's worth being considered 🙂

Answer 4 · 2018-05-03T10:58:00.000Z

Actually, maybe we can implement this: https://github.com/yarnpkg/rfcs/blob/master/accepted/0000-registry-url-in-lock-file.md

Yup, that sounds like a good idea (I have a few comments on the proposal itself, but I'll post it to yarnpkg/rfcs#84).

Answer 5 · 2018-05-03T11:16:55.000Z

I'd also like to remove the part of the code that deals with multiple registries (not talking about registry urls - we have branches for the "yarn registry" and for the "npm registry", which doesn't make sense since the yarn registry is effectively the npm registry).

Looking at the code, I think it was originally used to have a different configuration between npm and yarn, but I haven't seen it used anywhere, even after a quick search on GitHub, and it complicates the codebase a lot (especially since it isn't actually implemented everywhere and isn't well tested, so I'm not even sure it works ...).

Answer 6 · 2018-05-03T12:26:52.000Z

I feel like there isn't a good enough justification to change the lockfile format: it would churn all existing lockfiles because we believe aligning it with YAML is a good thing. In hindsight we should have probably gone with YAML but now the cost of changing doesn't seem worth it if the upside is only hypothetical.

Answer 7 · 2018-05-03T12:37:29.000Z

The lockfile change I'm less excited about because I don't really see the clear value that it would bring.

I believe the complaint from users is that @yarnpkg/lockfile is nice to have, but only if you are implementing a JS utility. I remember someone saying that they wanted to write something in Python that does some processing of the lockfile.

This might be a good opportunity to deprecate the link: dependency type. It's a bit flaky to actually use (if the package you link to has a node_modules dir then node will resolve in there, instead of in the parent package that is including the link) and I think the intent behind it has been supplanted by workspaces. Alternatively, we instruct people to use yarn link instead of link: dependencies.

As we did with yarn v1, I think we should include all the high-priority tagged issues and try to resolve them (there are 35 currently).

Answer 8 · 2018-05-03T14:18:56.000Z

Also, we could use a GitHub "Project" to track v2 tasks, like we did for v1. I thought that worked well.

Answer 9 · 2018-05-03T14:21:09.000Z

Oh I've got one to add to the list: automated build & release, and defining a release cadence. Right now it feels sort of random of when we release a new version, and it leads to a lot of "when will this PR be released" questions.

It'd be nice to do something like "on the first of every month we will release a bug fix increment"

Answer 10 · 2018-05-03T14:34:19.000Z

We have the capability to have automated releases, I was just worried about releases being unstable. Maybe it's not so bad though. Most parts of the release are already automated, the only piece we're missing is automatic tagging of releases, and automatic changelogs. :)

Answer 11 · 2018-05-03T14:40:53.000Z

There's also this that I really want to address: #4147 and #4379.

I'll share my proposal on this soon on that ticket.

The lockfile change I'm less excited about because I don't really see the clear value that it would bring.

The custom format of the lockfile is a constant source of confusion for people. It is not well-documented, is not supported by syntax highlighters (they treat it as YAML), and we need to maintain our own parser package which we don't really do a good job of.

If the lockfile was a limited subset of YAML it may resolve most of the issues I listed above. @imsnif's suggestion is also about the lock file but more about its contents rather than the format.

What other large features should we consider for a Yarn 2.0 release that we haven't thought of yet?

I think 2.0 is about breaking changes, not necessarily large changes.

Answer 12 · 2018-05-04T15:18:58.000Z

Has yarn stabilized to the point where it could expose an API (#2740)? It's not a breaking change per se, more the opposite.

Answer 13 · 2018-05-04T18:07:15.000Z

Definitely +1 for cleaning up command line arguments as they're really confusing. As a user I understand @cpojer's concern about having to basically recommit your entire lockfile with the upgrade, but I think dropping the custom parser and exposing the lockfile as YAML would be better for the future's sake.

Few other thoughts I had in mind:

Come up with a documentation-driven model for development in Yarn -- it's hard to sync between code changes, cli options, and docs which live on a separate repo. Maybe start with every PR needs an accompanying docs change unless it's something like a chore?
A new command yarn run-parallel that allows you to run multiple commands together like yarn run-parallel lint test. Also possibly implement the --pattern flag to run commands, e.g. yarn run-parallel --pattern build?
Yarn check has not been kept up to date with the new features of yarn and there seems to be confusion/lack of documentation about --verify-tree, --integrity, etc. #2287

Answer 14 · 2018-05-05T04:05:14.000Z

For documentation, one potential idea is that we could keep CLI command documentation in the code itself, and extract that documentation when building the site. Having code and docs in the same place may help with keeping it up to date, since it could be done in the same diff?

Answer 15 · 2018-05-05T05:27:29.000Z

The lockfile format should not be a breaking change in v2.0. A new format can be introduced, but the current one should also be supported. Otherwise it would cause churn in large teams, since not all engineers will upgrade Yarn on their machines at the same time.

A clear migration path should be provided.

Or maybe an opt-in new format could be implemented in the current v1.x releases, with a deprecation message for the current one.

Answer 16 · 2018-05-05T20:38:34.000Z

Agreeing with the sentiment that the v1 lockfiles should still be usable with v2, but there's another option that I'd also be very content with.

A separate package could be created to migrate lockfiles from v1 to v2 (and possibly later from v2 to v3).

This would allow Yarn v2 to remove all logic for the old v1 lockfiles while still making it very easy for developers to update to v2.

Just throwing a ball.

Answer 17 · 2018-05-05T20:52:15.000Z

Don't worry: whatever happens, v1 lockfiles would still be readable in v2 🙂 We would simply convert them on the fly to the new format the next time you run yarn install (except when lockfile modifications are locked by one of the command line options).
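As a rough sketch of that rule, assuming the "locking" options are the existing --frozen-lockfile and --pure-lockfile flags (that reading is an assumption, as are the `LockfileFormat` names and the `shouldConvertLockfile` helper, which are purely illustrative):

```typescript
// Illustrative names only: "v1" is today's format, "v2" stands for the proposed one.
type LockfileFormat = "v1" | "v2";

interface InstallFlags {
  frozenLockfile: boolean; // --frozen-lockfile
  pureLockfile: boolean; // --pure-lockfile
}

// Hypothetical decision helper: should this install rewrite the lockfile in the new format?
function shouldConvertLockfile(found: LockfileFormat, flags: InstallFlags): boolean {
  if (found === "v2") {
    return false; // already in the new format, nothing to convert
  }
  // A v1 lockfile is always readable; it is only rewritten when writes are allowed.
  return !flags.frozenLockfile && !flags.pureLockfile;
}

console.log(shouldConvertLockfile("v1", { frozenLockfile: false, pureLockfile: false }));
```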

Answer 18 · 2018-05-05T21:08:07.000Z

What if person A installs v2 and updates the lockfile to the new format, then person B checks out the repo and runs “yarn install” with v1 on the v2 lockfile?

Answer 19 · 2018-05-05T21:18:36.000Z

I don't think we can do much about this. It's the same thing as if someone starts using workspaces while their colleagues are still using an old Yarn release, or npm, or a different package manager, etc. That's a reason why we recommend checking in the Yarn version in the repository, and using the yarn-path configuration setting to force this version to be used.

Answer 20 · 2018-05-06T05:33:00.000Z

I don't think this'd really be a user-facing thing, but it'd be nice if the repo was reformatted a bit. Like, make @yarnpkg/lockfile its own subproject in the repo that yarn requires instead of another bundle that it spits out. It might also pave the way for exposing more of the API.

Answer 21 · 2018-05-06T05:48:38.000Z

Checking in Yarn in private projects might work. But what about people who contribute to multiple open source projects which use v1 and v2 both? Do they need to then switch manually their Yarn version or create some “nvm” like solution?

Obviously this might not be an issue at all, just expressing my thoughts.

Answer 22 · 2018-05-06T08:23:31.000Z

It would be solvable if we release a minor which could read, but wouldn’t write the new format. Everyone on 1.x who hasn’t upgraded to 2 yet would then be able to read both formats.

Answer 23 · 2018-05-06T11:57:14.000Z

But what about people who contribute to multiple open source projects which use v1 and v2 both? Do they need to then switch manually their Yarn version or create some “nvm” like solution?

They would just check in the version of Yarn they use for each project inside all of those repositories, and put a different yarnrc file into each project, that would point to the location of their copy of Yarn (still using yarn-path). Whatever global Yarn they have would then delegate its calls to the local one.
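For illustration, the delegation starts with the global Yarn finding a yarn-path entry in the project's yarnrc file. The sketch below only covers that lookup step. `readYarnPath` is a hypothetical helper, the file path in the example is made up, and the parsing is a simplification of whatever Yarn really does.

```typescript
// Hypothetical helper: returns the value of the `yarn-path` setting found in the
// text of a yarnrc file, or null when the setting is absent.
function readYarnPath(yarnrc: string): string | null {
  for (const rawLine of yarnrc.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("#")) {
      continue; // skip comments
    }
    // Accepts both `yarn-path "some/path.js"` and `yarn-path some/path.js`.
    const match = /^yarn-path\s+"?([^"]+)"?$/.exec(line);
    if (match !== null) {
      return match[1] ?? null;
    }
  }
  return null;
}

// Example yarnrc content; the release path is invented for the example.
const exampleYarnrc = '# pinned by the project\nyarn-path ".yarn/releases/yarn-1.6.0.js"\n';
console.log(readYarnPath(exampleYarnrc));
```

Each repository carries its own yarnrc and its own copy of Yarn, so a contributor working on a v1 project and a v2 project would not need to switch anything by hand.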

It would be solvable if we release a minor which could read, but wouldn’t write the new format.

That would slightly extend the window of forward compatibility, but people using Yarn 1.6 still wouldn't be able to read those files, for example. We unfortunately don't have stats regarding which versions of Yarn are in use, so I don't know how frequently people update 🤔

Answer 24 · 2018-05-06T17:34:24.000Z

Regarding the scenario lockfile V1 -> (switch to Yarn 2) -> V2 -> (switch to Yarn 1) -> V1:

We could backport a migration V2 -> V1 into Yarn V1.
Yarn V1 should check the version of the lockfile, so that we avoid unexpected behavior.
And what would be the minimal change to become a subset of YAML? Can V2 be forwards compatible?
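A version check like the one suggested above could be sketched as follows. It relies on the `# yarn lockfile v1` comment that current lockfiles start with, and it assumes (this is only an assumption) that a future format would keep the same header with a higher number. Both function names are hypothetical.

```typescript
// Extracts N from a `# yarn lockfile vN` header comment; returns null when no header is found.
function detectLockfileVersion(content: string): number | null {
  const match = /^# yarn lockfile v(\d+)\s*$/m.exec(content);
  return match === null ? null : Number(match[1]);
}

// Fails with a readable message instead of a confusing parse error
// when the lockfile is newer than what this Yarn understands.
function assertReadable(content: string, maxSupported: number): void {
  const version = detectLockfileVersion(content);
  if (version !== null && version > maxSupported) {
    throw new Error(
      `This lockfile uses format v${version}, but this Yarn only reads up to v${maxSupported}. Please upgrade Yarn.`
    );
  }
}

const sampleHeader = "# THIS IS AN AUTOGENERATED FILE. DO NOT EDIT THIS FILE DIRECTLY.\n# yarn lockfile v1\n\n";
assertReadable(sampleHeader, 1); // passes silently
console.log(detectLockfileVersion(sampleHeader));
```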

Answer 25 · 2018-05-06T17:35:56.000Z

Regarding API - I always want it.
We can have many implementations and plugins for the linking phase; for example, the feature that creates hardlinks in node_modules for duplicated modules could totally be a plugin.

Answer 26 · 2018-05-06T20:47:06.000Z

No need to worry about v1 not being able to read the new lockfile format. This is exactly why the change is being done in a major release. It is also not safe to assume Yarn will produce the same results across major versions. This is also part of our commitment to semver.

Answer 27 · 2018-05-06T21:41:07.000Z

No need to worry about v1 not being able to read the new lockfile format. This is exactly why the change is being done in a major release. It is also not safe to assume Yarn will produce the same results across major versions. This is also part of our commitment to semver.

Adoption V1->V2 will take time and people will get confused no matter what we do.
Fail-fast approach (like "Yo dawg, update your Yarn, and now I crash") is easier but I think we could be more graceful.
Personally I think semver is a lie :)

Answer 28 · 2018-05-07T13:06:31.000Z

I would love a direct integration with NSP (Node Security Project), ideally. Security is a huge concern, and thus far npm and Yarn have implemented lockfiles and package verification to handle man-in-the-middle attacks. But handling vulnerable/compromised packages still requires manual setup for every project.

Answer 29 · 2018-05-07T13:06:58.000Z

We unfortunately don't have stats regarding which versions of Yarn are in use, so I don't know how frequently people update 🤔

@arcanis Perhaps we could get User-Agent stats from CloudFlare, if they can provide raw access logs. If I remember correctly, we include the Yarn version number in the user-agent. That won't tell us all Yarn versions in use (as many companies would be using it with a local mirror and thus never actually hit the public registry) but it'd at least give us some rough numbers.
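If such logs were available, the counting itself would be simple. The sketch below assumes the user-agent starts with something like `yarn/1.6.0` followed by other tokens. That shape is written from memory and should be checked against real logs. `tallyYarnVersions` is a hypothetical helper.

```typescript
// Counts how often each Yarn version appears in a list of User-Agent strings.
// Assumed shape (unverified): "yarn/<version> npm/? node/<version> <platform> <arch>".
function tallyYarnVersions(userAgents: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const ua of userAgents) {
    const match = /^yarn\/(\d+\.\d+\.\d+\S*)/.exec(ua);
    if (match === null) {
      continue; // not a Yarn client, or an unexpected format
    }
    const version = match[1] ?? "";
    counts.set(version, (counts.get(version) ?? 0) + 1);
  }
  return counts;
}

// Made-up sample data for the example.
const sampleAgents = [
  "yarn/1.6.0 npm/? node/v8.11.1 linux x64",
  "yarn/1.6.0 npm/? node/v10.0.0 darwin x64",
  "yarn/1.5.1 npm/? node/v8.9.4 win32 x64",
  "npm/5.6.0 node/v8.9.4 linux x64",
];
console.log(tallyYarnVersions(sampleAgents));
```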

Answer 30 · 2018-05-08T00:43:13.000Z

Maybe we can also address #3630 - if we are going to change the lockfile format, we could consider also adding the hoist location so that we can stop resolving devDependencies during --production installs.

Answer 31 · 2018-05-08T06:28:11.000Z

As this will be a new major version, was it ever discussed to maybe adopt parts of pnpm's behavior? I'm just curious. Thank you for your work, everyone.

Answer 32 · 2018-05-09T15:01:56.000Z

How about using symlinks inside node_modules instead of copying all the dependencies to node_modules? This can save a lot of disk space and prolong the life of SSD.

Answer 33 · 2018-05-09T15:11:17.000Z

Not quite yet, no. Symlinks (and hardlinks) have some portability issues and do not behave better in every case, so right now it's not a priority. That being said I plan to review our linking process soonish (probably post-2.0), and it might be revisited then.

Answer 34 · 2018-05-10T06:18:30.000Z

Any movement on minor changes like this 🙃 #5625
Also, any effort to improve speeds around offline cache would be really nice.
Second a clear migration path via a codemod or script/guide that is easy to follow if necessary.

Answer 35 · 2018-05-10T09:15:48.000Z

Hi, this will probably not be a popular request here but as a Yarn user, I'd appreciate closer "UX compatibility" with npm. For example, some command line switches are different between Yarn and npm, running scripts can produce different results (and 1.6.0 has been seriously broken in Git Bash), etc.

The goal here is that as long as I don't want to use Yarn-specific features like workspaces (which are awesome BTW), there is no mental overhead for me switching between the two.

Not sure how actionable this is but I thought that this is a good issue to mention it.

Answer 36 · 2018-05-10T09:29:00.000Z

Also, I'm sometimes wondering what you guys think about the npm client and its recent advancements (last year in review and their plans for the future).

When Yarn came out, it was sorely needed and just the speed improvements alone were game-changing. Plus, npm's locking model was either absent or seriously broken for a long time. But at this point, the two projects really seem and feel almost the same, apart from a couple of differences like workspaces or npx.

I don't have any insight into whether the teams like / dislike each other, whether Yarn is of some strategic importance to Facebook or whether server-side registries play any major role but it almost feels like the two projects could start thinking how to unite, perhaps?

Answer 37 · 2018-05-10T10:16:42.000Z

Note that I think this discussion should be in its own thread rather than on the WG for the v2 😉

Since we're multiple maintainers from different backgrounds, I can only speak from my own personal perspective, which is that I'm happy for them that their tool is being actively maintained again. Still, I'm of the opinion that each project has its own strengths, goals, and policies, and there would be little point in merging them.

An interesting data point is that Yarn now accounts for about 40% of all the requests to the npm registry, and is constantly growing (~20% since the v1 iirc?). I think it bears witness that there's a need for alternative projects - whether it's Yarn or other interesting projects like pnpm.

Overall, I personally don't care much about competition, which is why you'll rarely see me compare Yarn and npm on Twitter - let's all try to make the best projects we can, and everything will fall into place!

I don't have any insight into whether the teams like / dislike each other

As far as I can tell we don't dislike each other, the npm folks have even collaborated with us on a few PRs, and some maintainers here also contributed back to npm 🙂

Answer 38 · 2018-05-10T13:03:06.000Z

How about using symlinks inside node_modules instead of copying all the dependencies to node_modules?

Copy-on-Write is probably better than symlinks or hardlinks, but requires a file system that supports it (such as btrfs). Copy-on-write reuses the same bytes on the disk when the files are copied (like a hardlink), but a copy is created if you write to the file (so directly editing a file in one project's node_modules won't affect the global cache).

Symlinks might not be easily doable since they tend to require userland support (and not all tools support them), but hardlinks or CoW should work as they're transparent to the app.
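In Node, a copy-on-write copy can be requested with the COPYFILE_FICLONE flag of fs.copyFileSync, which falls back to a regular copy when the file system cannot create a reflink. A minimal sketch follows; `cloneFromCache` and the directory layout are made up for the example and are not how Yarn's cache is organized.

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

// Copies a file out of a cache directory, asking for a copy-on-write clone.
// With COPYFILE_FICLONE, Node falls back to a plain copy if reflinks are unsupported.
function cloneFromCache(cachedFile: string, destination: string): void {
  fs.mkdirSync(path.dirname(destination), { recursive: true });
  fs.copyFileSync(cachedFile, destination, fs.constants.COPYFILE_FICLONE);
}

// Usage with throwaway files in the OS temp directory.
const workDir = fs.mkdtempSync(path.join(os.tmpdir(), "cow-demo-"));
const cached = path.join(workDir, "cache", "index.js");
fs.mkdirSync(path.dirname(cached), { recursive: true });
fs.writeFileSync(cached, "module.exports = 42;\n");

const installed = path.join(workDir, "project", "node_modules", "demo", "index.js");
cloneFromCache(cached, installed);

// Editing the installed copy does not touch the cached file,
// whether the copy was a reflink or a regular copy.
fs.appendFileSync(installed, "// local edit\n");
```

Unlike a hardlink, the edit at the end stays local to the project, which is the property described above.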

Answer 39 · 2018-05-23T20:59:14.000Z

We've been discussing over at #5654 that projects having both a yarn.lock file and a package-lock.json file are probably doing something wrong. In this context, it would be interesting to throw an error when it happens, at least on CI.

I remember @BYK is a fierce advocate of frozen-lockfile being pushed a bit more than it currently is. Maybe those two rules would be a good basis for an official CI mode?

Would be enabled by default if the CI environment variable is set (maybe others)
Could be enabled manually with --ci (should we make it disableable?)
Would print an info message linking to the documentation and explaining the CI mode
Would set --frozen-lockfile
Would upgrade warnings into errors (all of them? some of them?)
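The list above could translate into something like the following sketch. Everything here is hypothetical: the --ci flag is only proposed in this thread, `resolveCiMode` and `CiModeOptions` are invented names, and the sketch picks "all warnings become errors" even though that question is still open.

```typescript
interface CiModeOptions {
  enabled: boolean;
  frozenLockfile: boolean;
  warningsAsErrors: boolean;
}

// Hypothetical resolution of the proposed CI mode from the environment and the CLI arguments.
function resolveCiMode(env: Record<string, string | undefined>, argv: string[]): CiModeOptions {
  const ciVariable = env["CI"];
  const fromEnv = ciVariable !== undefined && ciVariable !== "";
  const enabled = fromEnv || argv.includes("--ci");
  return {
    enabled,
    // CI mode implies --frozen-lockfile; the flag can still be passed on its own.
    frozenLockfile: enabled || argv.includes("--frozen-lockfile"),
    // Open question in the proposal; this sketch upgrades every warning.
    warningsAsErrors: enabled,
  };
}

console.log(resolveCiMode({ CI: "true" }, ["install"]));
```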

What do you think? This would give Yarn a standardized way to enforce correctness where it matters the most.

Answer 40 · 2018-05-24T11:24:11.000Z

I like your proposal @arcanis. I think the major concern around having a CI mode was having two different operating modes without a clear signal. If people are okay with that and if we can make our logging system a bit more structured with proper levels to silence this on internal systems like ours to reduce noise, I'm game for this change.

That said, I still think changing the package.json file and running yarn install without a clear intention to update the lock file is a bit unsafe and potentially confusing.

Answer 41 · 2018-05-25T18:03:30.000Z

Would like to lobby for #3330 for 2.0, too

Answer 42 · 2018-05-26T11:06:19.000Z

@nevir Did you know something we didn't? 😆

Btw @yarnpkg/core, I opened a GitHub project to track what we want to do: https://github.com/yarnpkg/yarn/projects/4
I think a few things discussed here are still missing in the project, I'll add them later.

Feel free to assign yourself any task you'd like to work on (this also applies to non-core contributors! This is a great opportunity to do impactful things for Yarn, so please ping me and I'll do my best to help you get started!). I think I'll release a 1.7.1 next Tuesday, then we'll be able to freeze for a month or two while we implement all this.

Answer 43 · 2018-05-28T13:05:21.000Z

Not sure if it is too late, or whether this idea is better suited for 2.0 or a separate RFC... I figure I'll throw it out here first:

At a high level: hoisting has caused a lot of confusion and tool compatibility issues, for all package managers actually. We could provide a new hoisting scheme (let's call it transparent hoisting for now), which achieves redundancy reduction transparently (via an OS feature such as hardlinks) instead of moving/consolidating modules around like the current hoisting scheme. We would get even better optimization (for example, right now only 1 version of the module can be hoisted, but with hardlinks, all versions can be "hoisted") with a much more intuitive module graph and better 3rd-party tool compatibility.

Sure, hardlinks might not be as portable as we would like today, but we don't always have to go to the lowest common denominator... we could offer this feature in parallel with the current hoisting scheme, then gradually improve its coverage/compatibility without holding back the majority of the community.

We already have the hardlink capability, but it still works under the hoist scheme. The current hoisting logic could be greatly simplified, as most of the complexity would not be needed. This feature might sound radical, but I think we already have most pieces available, we just need a new way to assemble them...

Answer 44 · 2018-06-13T20:05:27.000Z

I think this should be refactored. We could use an async generator function + for await (const step of this.installSteps()) { } (for example), or an async function + a generator function that yields these promises.
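A minimal sketch of the async generator variant is below. The step names and the `Step` type are placeholders rather than Yarn's actual install steps, and a free function stands in for the `this.installSteps()` method.

```typescript
// Placeholder shape for one install step.
type Step = { name: string; run: () => Promise<void> };

// Async generator yielding the steps one at a time; the bodies are stand-ins.
async function* installSteps(log: string[]): AsyncGenerator<Step> {
  yield { name: "resolve", run: async () => { log.push("resolved"); } };
  yield { name: "fetch", run: async () => { log.push("fetched"); } };
  yield { name: "link", run: async () => { log.push("linked"); } };
}

// The driver loop: each step finishes before the next one is requested.
async function runInstall(log: string[]): Promise<void> {
  for await (const step of installSteps(log)) {
    await step.run();
  }
}
```

The appeal of this shape is that the ordering lives in one place (the generator), while the loop can add timing, progress reporting, or error handling around every step uniformly.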

What do you guys think? @rally25rs @arcanis

Answer 45 · 2019-02-07T14:51:35.000Z

v2 plans are detailed here: #6953, closing this issue 🙂