icatproject/python-icat

Change from urlparse to urllib.parse

Closed this issue · 5 comments

Today, we've noticed an issue on our GitHub Actions runs for DataGateway:

Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/runner/work/datagateway/datagateway/datagateway-api/util/icat_db_generator.py", line 12, in <module>
    from datagateway_api.common.database import models
  File "/home/runner/work/datagateway/datagateway/datagateway-api/datagateway_api/common/database/models.py", line 24, in <module>
    from datagateway_api.common.date_handler import DateHandler
  File "/home/runner/work/datagateway/datagateway/datagateway-api/datagateway_api/common/date_handler.py", line 2, in <module>
    from icat import helper
  File "/home/runner/.cache/pypoetry/virtualenvs/datagateway-api-di5uZ-3W-py3.9/lib/python3.9/site-packages/icat/__init__.py", line 23, in <module>
    from icat.client import *
  File "/home/runner/.cache/pypoetry/virtualenvs/datagateway-api-di5uZ-3W-py3.9/lib/python3.9/site-packages/icat/client.py", line 13, in <module>
    import urlparse
ModuleNotFoundError: No module named 'urlparse'

As you can see in the traceback, urlparse is being imported in client.py. This was renamed to urllib.parse in Python 3.x, with the name urlparse being dropped from support (I know you know this, but just for the sake of completeness). Hence when trying to import this module using Python 3.7, 3.8 or 3.9, a ModuleNotFoundError is raised. For some reason, there's no error when using Python 3.6 on GitHub Actions, but I'm not sure why. Feel free to view this GitHub Actions run for DataGateway API.

Having investigated this, I've noticed that some branches do use the correct urllib.parse import - develop and any child branches of develop. However, most branches/tags use urlparse - master, any child branches of master and all tags back to 0.17.0. I'm not sure why this is, the blame view shows the last changes on that line was 2 years ago.

For reference, DataGateway API uses 0.20.0 of Python ICAT.

Could this be fixed please?

This should not happen. Something most have gone wrong while installing python-icat.

The reason why all release version still have urlparse in the source code is that they still support Python 2.7. The source is written as Python 2 code and hence it imports the Python 2 standard library modules. When installing with Python 3.* the imports and other Python 2 specific stuff are automatically converted to Python 3 style by running 2to3 during python setup.py build. The upcoming 1.0 release will drop support for Python 2 and this step is already put into effect on the develop branch, see d954c42.

As I see, you are using poetry to install the dependencies of DataGateway in the GitHub Actions. I don't know this tool, but apparently it does not properly run the setup script of the packages it installs.

I have investigated this further and found the issue to be related to the version of setuptools used on GitHub Actions. 58.0.0 of setuptools removed support of the use of 2to3 during builds.

While we wait for Python ICAT 1.0, I have made some workarounds for the CI and and other tooling to DataGateway API.

Are you able to specify a maximum version in setup.py of Python ICAT so other users don't face this issue?

First of all, in any case, I can't retroactively fix published releases. They are, well … released to the wild. I could only fix future releases. But the planned future release 1.0 will not have the issue in the first place, as it doesn't use 2to3 anymore.

And then, python-icat doesn't use setuptools (yet).1 It still uses the plain old distutils from the standard library. But when setuptools gets installed, it boldly replaces distutils with its own version. As we can see in this example, this replacement is incompatible with the standard version. (I have to admit that I don't particularly like PyPA's (the authors of setuptools) attitude.) It's difficult to put a version constraint on a package that you don't use.

Footnotes

  1. This is however destined to be changed in 1.0, see #98.

Yes, good points, I was thinking aloud after spending several hours bug hunting and trying to implement a clean-ish fix. If you're happy, then I'm happy for you to close the issue. Sorry to have submitted an issue when the fault was at my end, but I did learn some new things about Python ICAT :)

No, it was not your fault. If any, it's the setuptools author's fault. Many thanks for the analysis in anyway. I should have known it myself, because I already stumbled over the same issue before.

I added a warning about the issue to the README. Obviously this is somewhat inconsistent, as the README having the warning is on the develop branch which is not affected and the README in the affected releases cannot be changed anymore. I'm not sure if I'll retain that note when 1.0 will be released, as the README should match the release version then. But at least the warning is on the GitHub top level page now for a while. I guess that is all we can do about it.