pythia_datasets module is broken on Binder
What happened:
When running the following cell from core/xarray/enso-xarray.ipynb:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import xarray as xr
from pythia_datasets import DATASETS
there is an error at import:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[9], line 4
2 import matplotlib.pyplot as plt
3 import xarray as xr
----> 4 from pythia_datasets import DATASETS
File /srv/conda/envs/notebook/lib/python3.12/site-packages/pythia_datasets/__init__.py:4
1 #!/usr/bin/env python3
2 # flake8: noqa
3 """Top-level module for pythia-datasets ."""
----> 4 from pkg_resources import DistributionNotFound, get_distribution
6 from .datasets import DATASETS, locate
8 try:
File /srv/conda/envs/notebook/lib/python3.12/site-packages/pkg_resources/__init__.py:2191
2187 dist_groups = map(find_distributions, resolved_paths)
2188 return next(dist_groups, ())
-> 2191 register_finder(pkgutil.ImpImporter, find_on_path)
2193 if hasattr(importlib_machinery, 'FileFinder'):
2194 register_finder(importlib_machinery.FileFinder, find_on_path)
AttributeError: module 'pkgutil' has no attribute 'ImpImporter'
What you expected to happen:
Successful import.
Minimal Complete Verifiable Example:
See above or follow https://foundations.projectpythia.org/core/xarray/enso-xarray.html (direct binder link https://binder.projectpythia.org/v2/gh/ProjectPythia/pythia-foundations/main?urlpath=lab/tree/core/xarray/enso-xarray.ipynb )
Anything else we need to know?:
Environment:
- Python version: 3.12 (from sys.version_info)
- Operating System: binder
- Install method (conda, pip, source): binder
Hi @pdebuyl, thanks for this report. I confirmed the same import error on the Binder just now.
Strange that our nightly build hasn't caught this. We'll take a look.
I've confirmed that the same error gets thrown on the Binder in any notebook that does
from pythia_datasets import DATASETS
The infrastructure team will need to look at modernizing the pythia_datasets package, which hasn't seen a maintenance release since September 2021.
I'm not sure that there's anything wrong with the pythia_datasets package per se. It looks to be more related to Python 3.12 and older versions of setuptools (see, e.g., https://stackoverflow.com/questions/77364550/attributeerror-module-pkgutil-has-no-attribute-impimporter-did-you-mean).
Running the ENSO notebook on Binder just now, I found that the version of setuptools on this image is 65.4.1. The current version is 69.0.3. Manually updating this library in my interactive Binder instance resolved the pythia_datasets import error.
Nevertheless, this seems to expose a Binder-related infrastructure issue, in that existing environments may not be getting refreshed.
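For anyone wanting to check whether a given Binder session is affected, a minimal diagnostic along these lines could be run in a notebook cell. It reports the Python version, the installed setuptools version, and whether `pkgutil` still has the `ImpImporter` attribute named in the traceback (that attribute was removed in Python 3.12). The function name `diagnose_pkg_resources_env` is made up for this sketch.

```python
import pkgutil
import sys
from importlib.metadata import PackageNotFoundError, version


def diagnose_pkg_resources_env() -> dict:
    """Collect the facts relevant to the pkgutil.ImpImporter import error."""
    try:
        setuptools_version = version("setuptools")
    except PackageNotFoundError:
        # setuptools is not installed in this environment.
        setuptools_version = None
    return {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "setuptools": setuptools_version,
        # False means an old pkg_resources that references
        # pkgutil.ImpImporter will fail at import time.
        "has_ImpImporter": hasattr(pkgutil, "ImpImporter"),
    }


print(diagnose_pkg_resources_env())
```

If `has_ImpImporter` is false and the reported setuptools version is an old one such as the 65.4.1 seen on this image, upgrading setuptools in the session (for example with `pip install --upgrade setuptools`) is the workaround described above.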
Good point, thanks @ktyle.
I tried launching the Binder interactively from the links in Foundations after merging #443, expecting that a new image would be built with the latest packages since there were new commits on the main branch, but this didn't seem to happen.
Does the Binder only build a new image when the environment file changes?
I'll take a look-see!
@brian-rose @pdebuyl I think this should be resolved now.
Yes, I confirmed that the imports are working on the Binder now, yay!
So my takeaway from this is that the Binder only builds a new image if the environment.yml file changes, as in #444, but not for just any push to the main branch. I'm trying to figure out if this is a feature or a bug. It does mean that the environment the Binder is using gets stale relative to what we're testing in our nightly builds, which always use the latest package versions.
Based on the Binder docs, a new image should be built whenever a commit is pushed ... I'm not sure why that wasn't the case in #443.
Thanks all :-)
(PS: I was testing the notebook while following the Open Science 101 MOOC, so this will probably make other people happy.)