microsoft/OSSGadget

osshealth: NPM purl convention is incomplete, excludes most packages

0x73746F66 opened this issue · 6 comments

aside: you reference npm.org and as the owner of NPM we might assume you would know that is referencing The National Association of Pastoral Musicians (NPM) not your own NPM

Similarly the issue is regarding npmjs.com that Microsoft owns and would be aware that the naming convention for packages has not been simply <package name> for many years now, it is @<org or namespace>/<package name>

Using the / in a purl breaks the NPM support;

Using docker run --rm -it -e GITHUB_ACCESS_TOKEN=$GITHUB_TOKEN --entrypoint /usr/bin/oss-health ossgadget 'pkg:npm/@microsoft/fast-web-utilities

Produces

WARN  - Error processing pkg:npm/@microsoft/fast-web-utilities: Invalid purl: Does not contain a minimum of a 'type' and a 'name'

It's not just that this doesn't support @microsoft/<package name> it this naming convention is becoming defacto now and a majority off mature packages are using it now

We have our own use case too; pkg:npm/@magda/authentication-plugin-sdk

I saw no closed or open issues mentioning that NPM is essentially broken for most packages, and wondered if perhaps I am missing a feature to make this naming convention operate properly

Thanks @0x73746F66! The npm.org was just a typo, fixed in #419.

Regarding the package namespaces, if I remember correctly, the challenge we had early on was that the PackageURL parser didn't handle the @ and / symbols well - the @ symbol had everything after it parsed as a version. I tested it briefly now, and the library seems to do the right thing, so either I'm not remembering something correctly or a bug has been fixed. Either way, we should support pkg:type/@namespace/name@version as one would expect.

In the meantime, you can encode @ as %40 and / as %2f:

PS C:\dev\OSSGadget> .\oss-download pkg:npm/@microsoft/fast-web-utilities

   ____   _____ _____    _____           _            _
  / __ \ / ____/ ____|  / ____|         | |          | |
 | |  | | (___| (___   | |  __  __ _  __| | __ _  ___| |_
 | |  | |\___ \\___ \  | | |_ |/ _` |/ _` |/ _` |/ _ \ __|
 | |__| |____) |___) | | |__| | (_| | (_| | (_| |  __/ |_
  \____/|_____/_____/   \_____|\__,_|\__,_|\__, |\___|\__|
                                            __/ |
                                           |___/
OSS Gadget - oss-download 0.1.394+1730841246 - github.com/Microsoft/OSSGadget
WARN  - Error processing pkg:npm/@microsoft/fast-web-utilities: Invalid purl: Does not contain a minimum of a 'type' and a 'name'

PS C:\dev\OSSGadget> .\oss-download pkg:npm/%40microsoft%2ffast-web-utilities

   ____   _____ _____    _____           _            _
  / __ \ / ____/ ____|  / ____|         | |          | |
 | |  | | (___| (___   | |  __  __ _  __| | __ _  ___| |_
 | |  | |\___ \\___ \  | | |_ |/ _` |/ _` |/ _` |/ _ \ __|
 | |__| |____) |___) | | |__| | (_| | (_| | (_| |  __/ |_
  \____/|_____/_____/   \_____|\__,_|\__,_|\__, |\___|\__|
                                            __/ |
                                           |___/
OSS Gadget - oss-download 0.1.394+1730841246 - github.com/Microsoft/OSSGadget
INFO  - Downloaded pkg:npm/%40microsoft%2ffast-web-utilities to C:\dev\OSSGadget\npm-%40microsoft%2ffast-web-utilities@6.0.0.tgz

Will track as #420

@0x73746F66 a string like pkg:npm/@magda/authentication-plugin-sdk is not actually a valid Package URL.

Like other URL schemes, certain characters in Package URLs have special meaning - as @scovetta noted, the Package URL spec defines @ as a special character which indicates that the following characters to the right represent a version string. To avoid ambiguity in parsing, the Package URL spec requires that namespace and name string components of a package URL be percent-encoded.

For the examples you provided, the correct forms would be pkg:npm/%40microsoft/fast-web-utilities and pkg:npm/%40magda/authentication-plugin-sdk. The / is not encoded, because Package URL defines the namespace and name as separate URL components, and / is the defined separator between those components.

Does that help clarify the errors you were seeing? I hear you that this is not particularly intuitive, but it's part of how the Package URL spec ensures that Package URLs are unambiguous.

Yes it helps, me
Yes it clarifies, the bug

No this doesn't appear to be the way NPM supports the naming convention

No it doesn't appear that this project adhered to the NPM naming convention

No the other user's will not know how to use this tool with NPM unless they find this issue (which may be closed) because the encoding requirements of this pack is not obvious or documented or hinted by the error message

gfs commented

I made the change to allow raw @ for the namespace is oss-download (see #433) but have not carried that over to oss-health. @pmalmsten Do you have any concerns with adding the same handling to CLI calls to oss-health (not carried over into calls to any lib methods as before).

@gfs Nope, no concerns on our end for oss-health (or any of the CLI frontends). Our only interest in strict conformity with the package URL spec is for lib methods.

gfs commented

@pmalmsten I opened #434 which covers doing this escaping in OSS-Health. For DRY purposes, I've made the namespace escaping method a protected static helper method on the base OSSGadget class used for CLIs, but it is not implicitly called - each CLI needs to call it explicitly only where appropriate, for now that is just OSS-Download and OSS-Health.