markovmodel/PyEMMA

numpy.bool removed from numpy --> AttributeError

mmfarrugia opened this issue · 4 comments

When using the pre-made conda environment for PyEMMA, the installed numpy version is new enough to have removed numpy.bool. However, the PyEMMA code still uses this deprecated and now removed attribute on line 37 of covartools.py. The AttributeError below occurs when running the "00" pentapeptide tutorial:

`AttributeError Traceback (most recent call last)
Cell In[6], line 39
37 fig, axes = plt.subplots(1, 3, figsize=(12, 3), sharey=True)
38 for ax, lag in zip(axes.flat, [5, 10, 20]):
---> 39 torsions_scores = score_cv(torsions_data, lag=lag, dim=dim)
40 scores = [torsions_scores.mean()]
41 errors = [torsions_scores.std()]

Cell In[6], line 29, in score_cv(data, dim, lag, number_of_splits, validation_fraction)
27 for n in range(number_of_splits):
28 ival = np.random.choice(len(data), size=nval, replace=False)
---> 29 vamp = pyemma.coordinates.vamp(
30 [d for i, d in enumerate(data) if i not in ival], lag=lag, dim=dim)
31 scores[n] = vamp.score([d for i, d in enumerate(data) if i in ival])
32 return scores

File ~/.conda/envs/pyemma-plus/lib/python3.11/site-packages/decorator.py:232, in decorate.<locals>.fun(*args, **kw)
230 if not kwsyntax:
231 args, kw = fix(args, kw, sig)
--> 232 return caller(func, *(extras + args), **kw)

File ~/.conda/envs/pyemma-plus/lib/python3.11/site-packages/pyemma/util/annotators.py:218, in deprecated.<locals>._deprecated(func, *args, **kw)
209 user_msg = 'Call to deprecated function "%s". Called from %s line %i. %s'
210 % (func.__name__, filename, lineno, msg)
212 warnings.warn_explicit(
213 user_msg,
214 category=PyEMMA_DeprecationWarning,
215 filename=filename,
216 lineno=lineno
217 )
--> 218 return func(*args, **kw)

File ~/.conda/envs/pyemma-plus/lib/python3.11/site-packages/pyemma/coordinates/api.py:1440, in vamp(data, lag, dim, scaling, right, ncov_max, stride, skip, chunksize)
1438 res = VAMP(lag, dim=dim, scaling=scaling, right=right, skip=skip, ncov_max=ncov_max)
1439 if data is not None:
-> 1440 res.estimate(data, stride=stride, chunksize=chunksize)
1441 else:
1442 res.chunksize = chunksize

File ~/.conda/envs/pyemma-plus/lib/python3.11/site-packages/pyemma/coordinates/data/_base/transformer.py:215, in StreamingEstimationTransformer.estimate(self, X, **kwargs)
214 def estimate(self, X, **kwargs):
--> 215 super(StreamingEstimationTransformer, self).estimate(X, **kwargs)
216 # we perform the mapping to memory exactly here, because a StreamingEstimator on its own
217 # has not output to be mapped. Only the combination of Estimation/Transforming has this feature.
218 if self.in_memory and not self._mapping_to_mem_active:

File ~/.conda/envs/pyemma-plus/lib/python3.11/site-packages/pyemma/coordinates/data/_base/streaming_estimator.py:44, in StreamingEstimator.estimate(self, X, chunksize, **kwargs)
42 # run estimation
43 try:
---> 44 super(StreamingEstimator, self).estimate(X, **kwargs)
45 except NotConvergedWarning as ncw:
46 self.logger.info(
47 "Presumably finished estimation. Message: %s" % ncw)

File ~/.conda/envs/pyemma-plus/lib/python3.11/site-packages/pyemma/_base/estimator.py:418, in Estimator.estimate(self, X, **params)
416 if params:
417 self.set_params(**params)
--> 418 self._model = self._estimate(X)
419 # ensure _estimate returned something
420 assert self._model is not None

File ~/.conda/envs/pyemma-plus/lib/python3.11/site-packages/pyemma/coordinates/transform/vamp.py:616, in VAMP._estimate(self, iterable, **kw)
612 if self.logger_is_active(self.loglevel_DEBUG):
613 self.logger.debug("Running VAMP with tau=%i; Estimating two covariance matrices"
614 " with dimension (%i, %i)" % (self.lag, indim, indim))
--> 616 covar.estimate(iterable, **kw)
617 self.model.update_model_params(mean_0=covar.mean,
618 mean_t=covar.mean_tau,
619 C00=covar.C00,
620 C0t=covar.C0t,
621 Ctt=covar.Ctt)
622 self.model._diagonalize()

File ~/.conda/envs/pyemma-plus/lib/python3.11/site-packages/pyemma/coordinates/data/_base/streaming_estimator.py:44, in StreamingEstimator.estimate(self, X, chunksize, **kwargs)
42 # run estimation
43 try:
---> 44 super(StreamingEstimator, self).estimate(X, **kwargs)
45 except NotConvergedWarning as ncw:
46 self.logger.info(
47 "Presumably finished estimation. Message: %s" % ncw)

File ~/.conda/envs/pyemma-plus/lib/python3.11/site-packages/pyemma/_base/estimator.py:418, in Estimator.estimate(self, X, **params)
416 if params:
417 self.set_params(**params)
--> 418 self._model = self._estimate(X)
419 # ensure _estimate returned something
420 assert self._model is not None

File ~/.conda/envs/pyemma-plus/lib/python3.11/site-packages/pyemma/coordinates/estimation/covariance.py:240, in LaggedCovariance._estimate(self, iterable, partial_fit)
237 Y = Y - self.remove_constant_mean[np.newaxis, :]
239 try:
--> 240 self._rc.add(X, Y, weights=weight)
241 except MemoryError:
242 raise MemoryError('Covariance matrix does not fit into memory. '
243 'Input is too high-dimensional ({} dimensions). '.format(X.shape[1]))

File ~/.conda/envs/pyemma-plus/lib/python3.11/site-packages/pyemma/_ext/variational/estimators/running_moments.py:294, in RunningCovar.add(self, X, Y, weights)
292 assert Y is not None
293 assert not self.symmetrize
--> 294 w, s, C = moments_block(X, Y, remove_mean=self.remove_mean,
295 sparse_mode=self.sparse_mode, modify_data=self.modify_data,
296 column_selection=self.column_selection, diag_only=self.diag_only)
297 # make copy in order to get independently mergeable moments
298 if self.column_selection is not None:

File ~/.conda/envs/pyemma-plus/lib/python3.11/site-packages/pyemma/_ext/variational/estimators/moments.py:906, in moments_block(X, Y, remove_mean, modify_data, sparse_mode, sparse_tol, column_selection, diag_only)
904 sparse_mode = 'dense'
905 # sparsify
--> 906 X0, mask_X, xconst = _sparsify(X, sparse_mode=sparse_mode, sparse_tol=sparse_tol)
907 Y0, mask_Y, yconst = _sparsify(Y, sparse_mode=sparse_mode, sparse_tol=sparse_tol)
908 is_sparse = mask_X is not None and mask_Y is not None

File ~/.conda/envs/pyemma-plus/lib/python3.11/site-packages/pyemma/_ext/variational/estimators/moments.py:149, in _sparsify(X, remove_mean, modify_data, sparse_mode, sparse_tol)
146 min_const_col_number = int(min_const_col_number)
148 if X.shape[1] > min_const_col_number:
--> 149 mask = covartools.variable_cols(X, tol=sparse_tol, min_constant=min_const_col_number) # bool vector
150 nconst = len(np.where(~mask)[0])
151 if nconst > min_const_col_number:

File ~/.conda/envs/pyemma-plus/lib/python3.11/site-packages/pyemma/_ext/variational/estimators/covar_c/covartools.py:37, in variable_cols(X, tol, min_constant)
31 from pyemma._ext.variational.estimators.covar_c._covartools import (variable_cols_double,
32 variable_cols_float,
33 variable_cols_int,
34 variable_cols_long,
35 variable_cols_char)
36 # prepare column array
---> 37 cols = numpy.zeros(X.shape[1], dtype=numpy.bool, order='C')
39 if X.dtype == numpy.float64:
40 completed = variable_cols_double(cols, X, tol, min_constant)

File ~/.conda/envs/pyemma-plus/lib/python3.11/site-packages/numpy/__init__.py:284, in __getattr__(attr)
281 from .testing import Tester
282 return Tester
--> 284 raise AttributeError("module {!r} has no attribute "
285 "{!r}".format(__name__, attr))

AttributeError: module 'numpy' has no attribute 'bool'
`

I was able to fix this manually for myself by simply changing this reference from numpy.bool to bool.
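For reference, here is a minimal sketch of that one-line change, assuming nothing else in the function needs to change. The `numpy.zeros(...)` call is the one from line 37 of covartools.py shown in the traceback; `make_column_mask` is a hypothetical wrapper name used only so the snippet runs on its own. `numpy.bool` was only ever an alias for the Python builtin `bool`, so the resulting dtype is the same.

```python
import numpy


def make_column_mask(n_cols: int) -> numpy.ndarray:
    """Hypothetical stand-in for the array setup in variable_cols()."""
    # Before (fails on numpy >= 1.24):
    #   cols = numpy.zeros(n_cols, dtype=numpy.bool, order='C')
    # After: pass the builtin bool, which numpy maps to its boolean dtype.
    cols = numpy.zeros(n_cols, dtype=bool, order='C')
    return cols
```

The same substitution applies wherever `numpy.bool` or `np.bool` is used purely as a dtype argument.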

I've included the conda list. Note that I added mdshare to the conda environment, as it wasn't already included.

Other Suggestions/Issues
I'd also suggest including mdshare in the pre-made conda environment, as it is not straightforward to get the two to play nicely together. I had to clone the pyemma environment in order to have the permissions to install anything else in it.

I also ran into an ipywidgets issue using the pre-made environment. I've run into that several times before due to the ipywidgets.version_info --> ipywidgets.version change in v8, so I won't include it here, but I'd recommend ipywidgets 7.6.5 for the environment if possible, because v8 has been extremely unstable in my experience.

OS:
linux x86_64
primarily bash, but some zsh, csh, running on a distributed network so I don't know all the details but I can ask for them if they turn out to be necessary. This seems pretty straightforward though.
Conda packages list:
pyemma-plus-list.txt

I also found this issue for np.float. I am now just going to open the code in VS Code and find-and-replace every occurrence, but anyone running a newer version of numpy is likely running into this too.

For the sake of a Google search hitting the keywords, the error is:
AttributeError: module 'numpy' has no attribute 'float'

I have this same problem -- took older code which was previously working and now it's no longer working. I made the swap from numpy.bool to bool based on this issue, and while it makes my specific code block run, I think it's now no longer running correctly/as it did previously because the code downstream of it now gives other errors.

Any response from the devs? It seems like this is a serious incompatibility with modern numpy unless I'm missing something.

@davidlmobley
No response from the devs yet, and it's been about 1.5 months. I know that for tasks other than featurization (like vamp, tica, etc.) they have moved to a new package called deeptime, and PyEMMA will no longer be maintained, so I doubt this will be fixed.

However, you can definitely go in and fix the downstream errors within PyEMMA yourself; that's what I did. Simply use the 'Find & Replace' tool (usually Ctrl + H) to search within the PyEMMA files for any "np.float" and "np.bool" and replace them with "float" or "bool". I do recall that there were also one or two cases where they specified float32 or float64, and for those individual cases I had to revise the variable type to be compatible. I haven't had any issues with my featurizations, etc. since I changed these, and I also noticed that none of these errors came up when using deeptime.
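One caveat with a plain text search: "np.float" also matches the start of "np.float32" and "np.float64", which are still valid and should be left alone. Below is a rough sketch of a scripted version of the same find and replace that only touches the bare aliases. The function names `replace_removed_aliases` and `patch_tree` are hypothetical, and the alias list is an assumption based on the builtin aliases numpy removed in 1.24. It edits files in place, so back up the package directory or work in a cloned environment first.

```python
import re
from pathlib import Path

# Assumed list of removed aliases; each was the Python builtin of the same name.
_REMOVED = ("bool", "int", "float", "complex", "object", "str")

# The trailing \b leaves np.float64, np.int32, np.bool_, np.floating, etc. untouched.
_PATTERN = re.compile(r"\b(?:np|numpy)\.(" + "|".join(_REMOVED) + r")\b")


def replace_removed_aliases(source: str) -> str:
    """Rewrite np.bool / numpy.float / ... to the plain builtin names."""
    return _PATTERN.sub(r"\1", source)


def patch_tree(root: str) -> int:
    """Apply the rewrite to every .py file under root; return how many files changed."""
    changed = 0
    for path in Path(root).rglob("*.py"):
        old = path.read_text(encoding="utf-8")
        new = replace_removed_aliases(old)
        if new != old:
            path.write_text(new, encoding="utf-8")
            changed += 1
    return changed
```

To use it, call `patch_tree` on the installed pyemma package directory (the `site-packages/pyemma` folder shown in the traceback paths). It rewrites comments and strings too, so it is worth reviewing the resulting diff.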

Happy to review and merge a PR if you'd like to share your working version. I just don't have the capacity right now to do active development.