pypi/conveyor

URL rewriting: use normalized names

Closed this issue · 6 comments

FRidh commented

Links such as https://files.pythonhosted.org/packages/source/f/flask-common/Flask-Common-0.2.0.tar.gz were initially unavailable with Warehouse but were later re-added. Could this rewriting be improved so that normalized names can be used everywhere, including in the filename? In this case, that would mean the following URL should return a file:

https://files.pythonhosted.org/packages/source/f/flask-common/flask-common-0.2.0.tar.gz

Here both the project name and the file name are normalized.

Thanks for your note, @FRidh, and sorry for the slow response!

Warehouse has been under-maintained for a long time, but the folks working on it have gotten some limited funding to concentrate on improving and deploying it, and have kicked off work towards our development roadmap. The most urgent task is to improve Warehouse to the point where we can redirect pypi.python.org to pypi.org, so the site is more sustainable and reliable. We've also been making progress on replying to some older issues.

I'm a little confused by your question so please forgive me as I ask some questions to understand your request better.

Can you talk a little bit more about why you're interested in normalized filenames? When I look up the downloadable files for Flask-Common right now at https://pypi.org/project/Flask-Common/0.2.0/#files , I see https://files.pythonhosted.org/packages/8c/f6/9898dec0d36dbd66494a2f3626998a9d9085a480158cc0239063a93a58cf/Flask-Common-0.2.0.tar.gz. The difference in the filename is the capitalisation -- you'd prefer the filenames didn't have any capital letters?

Filenames are not required to be the same as project names and we don't want to give users the impression that they always will be. Please let us know what you're working towards so we can sort things out.

Thanks and sorry again for the wait.

FRidh commented

> Filenames are not required to be the same as project names and we don't want to give users the impression that they always will be.

That's unfortunate, because requiring them to be the same allows more possibilities for automation. In our package set we use the JSON API to determine the latest available version of each package, and then use the old "URL rewriting API" to determine the location of the artifacts (tarballs).

In our expressions we prefer to use the normalized names. The JSON API handles those fine, but the URLs cannot.

Works:

https://pypi.io/packages/source/F/Flask-Common/Flask-Common-0.2.0.tar.gz

Does not work:

https://pypi.io/packages/source/f/flask-common/flask-common-0.2.0.tar.gz
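For illustration, the two URL forms discussed in this thread can be built as follows. This is only a minimal sketch: the helper names `normalize` and `legacy_source_url` are hypothetical, and the normalization rule is the one from PEP 503, which is an assumption about what "normalized" means here.

```python
import re


def normalize(name: str) -> str:
    """PEP 503-style normalization (assumed to be the rule meant in this thread):
    runs of '-', '_' and '.' collapse to a single '-', then lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


def legacy_source_url(project: str, filename: str,
                      host: str = "https://files.pythonhosted.org") -> str:
    """Build a /packages/source/<first letter>/<project>/<filename> URL
    by plain string concatenation (hypothetical helper)."""
    return f"{host}/packages/source/{project[0]}/{project}/{filename}"


# URL using the names exactly as uploaded.
uploaded = legacy_source_url("Flask-Common", "Flask-Common-0.2.0.tar.gz")

# URL using normalized names for both the project and the file.
norm = normalize("Flask-Common")
normalized = legacy_source_url(norm, f"{norm}-0.2.0.tar.gz")
```

Only the casing of the project and file name differs between `uploaded` and `normalized`; the request in this issue is for the second form to resolve as well.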

Hi, @FRidh and any other readers. Thanks for letting us know about your thoughts, and sorry for the slow response!

In the past month we've substantially improved our Warehouse API reference guide, which makes it easier for downstreams to programmatically download artifacts, including wheels.

I've spoken with other Warehouse maintainers and we advise that you use more robust methods of grabbing these artifacts, using our supported APIs and the artifact download links we provide in those APIs, rather than doing filename/project normalization and string concatenation as you've described. We cannot make any guarantees that the URLs to the distributions won’t change some day, so unless you are getting URLs from our API, your download tool will always potentially be brittle.
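As a rough sketch of what "getting URLs from the API" could look like, the snippet below reads the download link for a source distribution out of a JSON API release document instead of constructing it. The function names are hypothetical, the `sample` dictionary is a hand-written, heavily trimmed stand-in for a real response (it reuses the file URL quoted earlier in this thread), and `fetch_release` assumes the `https://pypi.org/pypi/<project>/<version>/json` endpoint.

```python
import json
from typing import Optional
from urllib.request import urlopen


def fetch_release(project: str, version: str) -> dict:
    """Fetch the JSON document for one release (performs a network request)."""
    with urlopen(f"https://pypi.org/pypi/{project}/{version}/json") as resp:
        return json.load(resp)


def sdist_url(release: dict) -> Optional[str]:
    """Return the URL of the first source distribution listed, if any."""
    for entry in release.get("urls", []):
        if entry.get("packagetype") == "sdist":
            return entry["url"]
    return None


# Hand-written sample, trimmed to the fields used above (not a real response).
sample = {
    "urls": [
        {
            "packagetype": "sdist",
            "filename": "Flask-Common-0.2.0.tar.gz",
            "url": "https://files.pythonhosted.org/packages/8c/f6/"
                   "9898dec0d36dbd66494a2f3626998a9d9085a480158cc0239063a93a58cf/"
                   "Flask-Common-0.2.0.tar.gz",
        }
    ]
}

print(sdist_url(sample))
```

With network access, `sdist_url(fetch_release("flask-common", "0.2.0"))` would follow the same path; the caller never needs to know the casing of the uploaded filename.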

I'm sorry to have to disappoint you. I hope we can address your needs in other ways. Please see other open API-related issues and tell us more about the problems you're trying to solve, so we can help make sure you get what you need via our supported APIs.

When you mention the old "URL rewriting API" I'm not sure what you mean -- the Simple Project API, or package querying via the XML-RPC API? I'll leave this issue open so you can talk more about what you've been doing so far, and so we can learn from that and update our API docs and other issues.

Thanks and sorry again for the wait and the disappointment.

The URL rewriting API I think is Conveyor (https://github.com/pypa/conveyor) which redirects URLs like https://files.pythonhosted.org/packages/source/f/flask-common/flask-common-0.2.0.tar.gz to the "real" URL location. I'm actually kind of surprised this doesn't work already.

Related issue: pypi/warehouse#1944.
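To make the request concrete, here is a purely hypothetical sketch of a case-insensitive filename lookup of the kind being asked for. It is not taken from Conveyor's code and makes no claim about how Conveyor resolves URLs; `build_index` and `resolve` are invented names, and only lowercasing is applied (handling `-`, `_` and `.` variants would need more care).

```python
from typing import Dict, Optional


def build_index(files: Dict[str, str]) -> Dict[str, str]:
    """Map lowercased filenames to their real download URLs (hypothetical)."""
    return {name.lower(): url for name, url in files.items()}


def resolve(index: Dict[str, str], requested: str) -> Optional[str]:
    """Look up a requested filename, ignoring case; None if unknown."""
    return index.get(requested.lower())


REAL_URL = (
    "https://files.pythonhosted.org/packages/8c/f6/"
    "9898dec0d36dbd66494a2f3626998a9d9085a480158cc0239063a93a58cf/"
    "Flask-Common-0.2.0.tar.gz"
)

index = build_index({"Flask-Common-0.2.0.tar.gz": REAL_URL})

# Both spellings find the same file in this sketch.
as_uploaded = resolve(index, "Flask-Common-0.2.0.tar.gz")
lowercased = resolve(index, "flask-common-0.2.0.tar.gz")
```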

FRidh commented

It's indeed Conveyor, thanks @dstufft.