yarnpkg/yarn

Remove hostnames from the lockfiles

arcanis opened this issue · 19 comments

We want to remove hostnames from the lockfiles:

wrappy@1:
  version "1.0.2"
  resolved "https://registry.yarnpkg.com/wrappy/-/wrappy-1.0.2.tgz#b5243d8f3ec1aa35f1364605bc0d1036e30ab69f"

Would become:

wrappy@1:
  version "1.0.2"
  resolved "/wrappy/-/wrappy-1.0.2.tgz#b5243d8f3ec1aa35f1364605bc0d1036e30ab69f"

This will allow us to switch the default registry more easily if we need to (for example, if we want to deprecate our mirror, or if npm suddenly disappears from the face of the earth), and will also make it easier for users to switch from one registry to another.
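As a rough illustration of the idea (not Yarn's actual implementation), the transformation could be sketched as follows. The helper names `strip_registry_host` and `apply_registry` are hypothetical, and the sketch assumes the registry serves tarballs directly under its root:

```python
from urllib.parse import urlsplit


def strip_registry_host(resolved: str) -> str:
    """Drop the scheme and host from a `resolved` URL, keeping path and hash."""
    parts = urlsplit(resolved)
    relative = parts.path
    if parts.fragment:
        # The fragment carries the tarball checksum, so it must be preserved.
        relative += "#" + parts.fragment
    return relative


def apply_registry(relative: str, registry: str) -> str:
    """Rebuild a full URL from a host-less entry and the configured registry."""
    return registry.rstrip("/") + relative


full = "https://registry.yarnpkg.com/wrappy/-/wrappy-1.0.2.tgz#b5243d8f3ec1aa35f1364605bc0d1036e30ab69f"
relative = strip_registry_host(full)
print(relative)
print(apply_registry(relative, "https://registry.npmjs.org"))
```

With this shape, switching registries only changes the argument passed to `apply_registry`; the lockfile entry itself stays the same.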

@arcanis this idea is being tracked in the following RFC - yarnpkg/rfcs#64

What about custom registries/repositories? If a project has a mix of different registries (eg. some from regular public npm, some from a private internal server), how would it know which is which? Particularly since there could be overlap in package names.

@Daniel15 I think in this case it is recommended to keep the packages coming from a private internal server behind a specific scope, so that you can use scope-specific registry urls.
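A minimal sketch of how a scope-based lookup could disambiguate registries, assuming private packages always live under a scope. The function name `registry_for`, the `@mycompany` scope, and the internal URL are all hypothetical and only serve the illustration:

```python
def registry_for(package_name: str, default_registry: str, scope_registries: dict) -> str:
    """Pick the registry for a package: scoped packages may map to their own registry."""
    if package_name.startswith("@") and "/" in package_name:
        scope = package_name.split("/", 1)[0]
        # Fall back to the default registry when the scope has no override.
        return scope_registries.get(scope, default_registry)
    return default_registry


# Hypothetical configuration: one private scope, everything else public.
scopes = {"@mycompany": "https://npm.internal.example"}
default = "https://registry.yarnpkg.com"

print(registry_for("wrappy", default, scopes))
print(registry_for("@types/estree", default, scopes))
print(registry_for("@mycompany/ui", default, scopes))
```

Under this assumption, an unscoped name such as `wrappy` can only ever come from the default registry, which is what removes the ambiguity raised above.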

Adding a datapoint here for @arcanis.

At least for VSTS/TFS, this pattern seems to work.

My yarn.lock matches the pattern for upstream packages:

typescript@~2.8.3:
  version "2.8.3"
  resolved "<INTERNAL_REGISTRY>/typescript/-/typescript-2.8.3.tgz#5d817f9b6f31bb871835f4edf0089f21abe6c170"

Be sure to keep scoped packages in mind - there is an extra segment:

"@types/estree@0.0.38":
  version "0.0.38"
  resolved "<INTERNAL_REGISTRY>/@types/estree/-/estree-0.0.38.tgz#c1be40aa933723c608820a99a373a16d215a1ca2"

Looks like it'd work for Nexus Repository too - you have a repo section after the server root but the tail of the URL is still the same:

type-check@~0.3.2:
  version "0.3.2"
  resolved "https://<server>/repository/<repo-identifier>/type-check/-/type-check-0.3.2.tgz#5884cab512cf1d355e3fb784f30804b2b520db72"
  dependencies:
    prelude-ls "~1.1.2"

besh commented

Confirming that it looks like it would work for artifactory too.

https://<country>.artifactory.<company>.com/artifactory/api/npm/<repo>/<module>.tgz

Verdaccio follows the same tarball URL pattern as the main registry when using the default configuration, unless a user decides to use a prefix for a reverse proxy.

https://<domain:port>/<prefix?>/<module>/-/<module>.tgz

Examples from yarn.lock.

resolved "http://localhost:4873/md5.js/-/md5.js-1.3.4.tgz#e9bdbde94a20a5ac18b04340fc5764d5b09d901d"
# scoped
resolved "http://localhost:4873/@babel%2fparser/-/parser-7.0.0-beta.51.tgz#27cec2df409df60af58270ed8f6aa55409ea86f6"

Unfortunately we discovered that some registries follow entirely different rules (a surprising one being ... NPM Enterprise ...), so we won't be able to just trim the hostname 😞

Maybe we'll end up having a whitelist of domain names that can work this way ... but it's annoying.

What are their rules?

Currently, it's something similar to this (cf pnpm/pnpm#867):

https://npm-registry.compass.com/p/pnpm/_attachments/pnpm-1.9.0.tgz

No idea why they chose to break their convention.

What about a compromise to this?

Allowing the --registry CLI argument when running both the yarn and yarn add commands might be a helpful compromise to resolve this issue.

e.g.,

yarn --registry=http://registry.npm.taobao.org

This would allow the user to specify the registry to be used during the installation.

Another might be to trim the hostname if it matches the standard path structure, and leave it in if the URL differs. This would prevent the need for a whitelist, and instead only a simple regexp or simple parser, but allow for using the registry in the user's configuration as necessary for conforming registries.
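The second suggestion above could look roughly like this. It is only a sketch: the regular expression encodes the `/<name>/-/<name>-<version>.tgz` layout seen in the examples in this thread (including the `%2f` form for scoped packages), and the name `trim_if_standard` is hypothetical. URLs with any other path, including ones with a repository prefix, are left untouched:

```python
import re
from urllib.parse import urlsplit

# Standard layout: /<name>/-/<name>-<version>.tgz, optionally with a scope
# written as "@scope/" or "@scope%2f" in front of the package name.
STANDARD_PATH = re.compile(
    r"/(?:@[^/%]+(?:/|%2[fF]))?(?P<name>[^/@]+)/-/(?P=name)-[^/]+\.tgz"
)


def trim_if_standard(resolved: str) -> str:
    """Drop the host only when the whole path follows the standard layout."""
    parts = urlsplit(resolved)
    if not STANDARD_PATH.fullmatch(parts.path):
        return resolved  # non-conforming registry: keep the full URL
    relative = parts.path
    if parts.fragment:
        relative += "#" + parts.fragment
    return relative


print(trim_if_standard("http://localhost:4873/md5.js/-/md5.js-1.3.4.tgz#e9bdbde94a20a5ac18b04340fc5764d5b09d901d"))
print(trim_if_standard("https://npm-registry.compass.com/p/pnpm/_attachments/pnpm-1.9.0.tgz"))
```

The first URL conforms and loses its host; the second one uses the `_attachments` layout mentioned earlier and is returned unchanged.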

The interim solution will be to use sed to point to a different registry during our CI builds, but this is obviously not a desired solution.

sed -i 's#http://private-registry/repository/npm#https://registry.yarnpkg.com#' yarn.lock

(from @victornoel in #2566 )

(cc @arcanis @BYK @doxiaodong @victornoel )

Ginxo commented

I (Red Hat productization team) recently created a new npm library called lock-treatment-tool (https://www.npmjs.com/package/lock-treatment-tool) to process lock files before initialising or installing an npm project. It handles both package-lock.json and yarn.lock files, removing or replacing the registries and integrity hashes so that you can then use your own registries.
I hope this helps you.

I can't believe this issue has existed for 3 years. (#3330)
Is there any schedule or roadmap for 2.0?

@PMExtra This is the wrong repo for 2.0. The 2.0 repo is here: https://github.com/yarnpkg/berry

Ginxo commented

@PMExtra anyway you have the chance to use this lock-treatment-tool library https://www.npmjs.com/package/lock-treatment-tool
locktt --registry=https://npmregistryurl.com
You can even integrate it with the frontend-maven-plugin library https://github.com/kiegroup/appformer/blob/0d8c104f2bdb182ca2768eaa5d3a18e57f6f8d1a/appformer-js/pom.xml#L42 by just adding -DnpmRegistryURL=whateverregistryurl
I hope it helps

Thanks @Daniel15 @Ginxo .
I'm building CI/CD in China, but we also have some overseas servers, so I want to choose the npm registry mirror per GitLab Runner environment at build time. We still need to lock the package versions with yarn.lock.
So I want to override the host in yarn.lock, with as few extra dependencies as possible.

Ginxo commented

@PMExtra so lock-treatment-tool could be a good solution for you

👀 some engineers use local registry proxies, like Nexus to test locally, improve downloads, and to work offline (assuming packages are already downloaded). I tried passing --registry [url], as well as working with ".yarnrc" and ".npmrc". It seems we'll have to remove the "yarn.lock" file from source control until this is fixed.

This issue is fixed in Yarn 2+, and the fix will not be backported in 1.x. Please check the Migration guide and continue from there.