pyiron/pyiron_atomistics

Trouble with mp-api in our dependencies

Closed this issue · 2 comments

We now include mp-api in our dependency stack to support a structure factory. I ran into trouble where downstream, in packages that depend on pyiron_atomistics, I got errors that a certain method in the maggma package (also managed by materialsproject) couldn't be found. The underlying problem was that the CI was installing a too-old version of maggma that doesn't meet the API expectations of mp-api, and this is possible because mp-api doesn't pin its maggma version at all.

It's not currently clear to me why an older version of maggma is getting installed. In one case, I was able to stop this from happening by removing dependence on pympipool; in another case I was able to stop this from happening by simply pinning the correct version of maggma myself (merged version without the owlready complication here). In the latter case, if I can simply pin the higher version and everything works fine, it's completely bizarre to me that conda wasn't just using that version to start with!

So, I'm left with two issues:

  1. Does anyone understand why the CI might fall back to an older version of maggma when the most recent one works perfectly fine?
  2. My current solution is to pin the version of a package on which I do not directly depend -- right now in my less-used downstream packages like ironflow and the new pyiron_workflow, but in principle such a patch can/should be done here instead. I don't like this as a long term fix -- we don't directly depend on maggma and it shouldn't be us ensuring the API is the right one.

@jan-janssen you maintain the mp-api conda feedstock, does that mean you have a sufficient relationship with the materialsproject folks that a criticism of their versioning practices would be well received compared to such criticism coming from a random stranger (i.e. me)? Because I am academically interested in knowing why the CI installed the old version, but the best technological solution is just for their package to demand the dependencies it needs.

@jan-janssen you maintain the mp-api conda feedstock, does that mean you have a sufficient relationship with the materialsproject folks that a criticism of their versioning practices would be well received compared to such criticism coming from a random stranger (i.e. me)?

In terms of my influence, I guess I am known in the materials science community for maintaining the conda-forge packages, but I do not think this ranks my feedback much higher than general users feedback. So I would suggest you open an issue first and if there is no reaction I can try to get more attention to that particular issue, by either emphasising that the issue also affects the conda-forge packages or by contacting the people I know inside the materials project to reach out to the maintainer of the specific package.

Still before we do so, we should clarify where the error is coming from and what we can do to prevent this error:

  • At the moment pyiron_atomistics depends on mp-api and structuretoolkit while structuretoolkit depends on pymatgen and mp-api depends on maggma which depends on emmet-core which depends on pymatgen. So we might update the pymatgen version in structuretoolkit to a version which is not yet supported by emmet-core, like it was the case in the recent migration until conda-forge/emmet-core-feedstock#126 was merged. We can solve this issue by moving the requirement of mp-api from pyiron_atomistics to structuretoolkit like I started in pyiron/structuretoolkit#91 this results in the error being raised already when we try to update the pymatgen version of structuretoolkit and not only when we we update pyiron_atomistics.
  • The error messages from mamba and conda could be more intuitive, but when the simple solve does not work and the solver comes up with a strange solution, than this is typically an indication that the combination is in principle forbidden, but somewhere in the previous versions some permissions were not set correctly so there is one combination that works. To prevent these inconsistencies in future and keep the dependencies of the conda packages automatically in sync with the pypi packages - which previously required manual modifications - I am currently rolling out a new feature called grayskull update. It uses grayskull https://github.com/marcelotrevisani/grayskull to automatically update the dependencies: conda-forge/emmet-core-feedstock#125 conda-forge/pymatgen-feedstock#139 conda-forge/maggma-feedstock#97 .
  • Finally, the whole conda-forge community is currently busy working on the migration for Python 3.12 so certain pull requests took longer than usual to merge, like the one where I corrected the dependencies for the mp-api package which was open for two days before it was merged conda-forge/conda-forge-repodata-patches-feedstock#564

In summary, I am not sure if the mp-api developers can actually solve the issue, or if it is solved by us just updating all packages to a new version. At least for the pyiron_gpl package this solved the issue pyiron/pyiron_gpl#143 . Now that aws-sam-translator-feedstock updated the pydantic requirements conda-forge/aws-sam-translator-feedstock#88 I am optimistic that the updates for pyiron_contrib should work as well.

While this might sound like a reasonable explanation for what happened, it took me approximately a week of work and over five years of contributing packages to conda-forge to solve this issue and understand what exactly went wrong. So managing dependencies is complex and the easiest way to do it, is to handle one update at a time rather than multiple at once.

So I would suggest you open an issue first

Sounds good, I'll politely suggest they introduce a lower-bound.

but when the simple solve does not work and the solver comes up with a strange solution, than this is typically an indication that the combination is in principle forbidden, but somewhere in the previous versions some permissions were not set correctly so there is one combination that works.

that sounds like a very plausible answer to (1)! Doesn't nail it down in details, but at least the gist of things makes sense.

We can solve this issue by moving the requirement of mp-api from pyiron_atomistics to structuretoolkit like I started in (structuretoolkit #91)[https://github.com/pyiron/structuretoolkit/pull/91] this results in the error being raised already when we try to update the pymatgen version of structuretoolkit and not only when we we update pyiron_atomistics.

or if it is solved by us just updating all packages to a new version.

Super, I really like pushing the mp-api requirement as far upstream as possible (namely structuretoolkit). In that case, even if we need to (hopefully just transiently!) pin maggma, it can be done in exactly one spot. In the long run, hopefully the wider ecosystem completing updates makes such a pin completely unnecessary.

While this might sound like a reasonable explanation for what happened, it took me approximately a week of work and over five years of contributing packages to conda-forge to solve this issue and understand what exactly went wrong. So managing dependencies is complex and the easiest way to do it, is to handle one update at a time rather than multiple at once.

💯 I hear you! I appreciate all the work you do maintaining the various packages -- e.g. I didn't even realize you were the maintainer for owlready2! It took me three days just to pin down where the breaking points were with aws-sam-translator + pydantic and mp-api + maggma, much less come up with any real solution; I would be the last to assert dependency management is a simple process! Using greyskull to reduce the points of human interference sounds good to me.