marvel-nccr/quantum-mobile

cif2cell command does not work

ltalirz opened this issue · 16 comments

Reported by a student of Stefaan Cottenier:

Under some specific circumstances (Windows host, otherwise unknown), it seems that the cif2cell binary does not find the ElementData class:
image

Also the output from pip install does not mention the requirements of cif2cell (e.g. pycifrw):
image

Possible explanation: was there an installable cif2cell folder in the directory where the pip install command was run?

Here are some statistics on how this problem showed up and on which remedies work, for a group of about 100 students with a variety of hardware and always the same version of VirtualBox and of Quantum Mobile:

  • A large majority does not have any issue after updating cif2cell and restarting the virtual machine.
  • A few of the people who still had issues turned out to have hibernated their virtual machine rather than rebooting it (a reboot is really necessary for the cif2cell update to take effect).
  • Of those who still had issues after a proper reboot, some got rid of all problems by systematically using 'workon aiida', both for the update and for the later execution of cif2cell (this 'workon aiida' is not needed for most others).
  • For those for whom 'workon aiida' does not help, there is still the silly option of adding a dummy '_cod_database_code none' to the cif file, which avoids the error.
  • Only 1 student was left after all of this for whom the error persisted. He experienced very unstable behaviour of the virtual machine anyway (repeatable). He solved it by using a different laptop.
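
The last-resort workaround from the list above (adding a dummy '_cod_database_code none' entry) could be scripted roughly as follows. This is only a sketch: the function name `add_dummy_cod_code` is hypothetical, and placing the line directly after the first `data_` header is an assumption, since the thread only says that adding the entry to the cif file avoids the error.

```python
def add_dummy_cod_code(cif_text, tag="_cod_database_code", value="none"):
    """Return cif_text with a dummy `tag value` line added, unless the tag is already there.

    Assumption: the entry is placed right after the first `data_` block header.
    """
    lines = cif_text.splitlines()

    # Leave the text untouched if the tag is already defined somewhere.
    for line in lines:
        fields = line.split()
        if fields and fields[0].lower() == tag:
            return cif_text

    # Insert the dummy entry directly after the first data block header.
    for i, line in enumerate(lines):
        if line.strip().lower().startswith("data_"):
            lines.insert(i + 1, f"{tag} {value}")
            return "\n".join(lines) + "\n"

    raise ValueError("no 'data_' block header found in the CIF text")


if __name__ == "__main__":
    sample = "data_example\n_cell_length_a 3.0\n"
    print(add_dummy_cod_code(sample), end="")
```

Reading a cif file, passing its contents through this function and writing the result to a new file leaves the original untouched, which makes it easy to compare cif2cell output with and without the dummy entry.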

The weird thing for me is that there is such a variety in behaviours, whereas the idea of a virtual machine is that it should behave identically everywhere.

I'll ask students who did not get rid of the problem to post some details of their installation here.

Hello everyone

I am a student of Professor Stefaan Cottenier. He asked me to provide some data on the operating system I use and the errors involved in this cif2cell problem.

I am using Windows 10 as my primary operating system, with Ubuntu 18.04 as a dual boot. I installed Quantum Mobile 20.06.1 as a VM on the Windows platform. I already updated the cif2cell in aiida:

cif2cellupgrade

When trying to run the cif2cell with a .cif file without the _database_code None, it was killed instantly:
image

With this _database_code keyword, it works fine. A look at the *.in files reveals no anomalies there, they look OK.

Hope this helps a bit in solving the problem.

Hi @MartijnVandewiele1997 , your second screenshot is for an MPI code - I guess this is not what you wanted to post regarding cif2cell?

No, indeed.

In the next screenshot, you see one input file named basicVtryout, which has the keyword, and one without the keyword, called bacicVryoutwithoutkeyword. cif2cell gives no error in either case; the only issue is that running pw.x on the first file works fine, while on the second file, without the database_code None, it fails to do the job (not shown in the screenshot here).

image

Ok, so if I understand correctly, cif2cell actually runs through in both cases; just in one case it produces an input file that Quantum ESPRESSO likes, while in the other case it produces an input file that Quantum ESPRESSO doesn't like.

I believe this issue has nothing to do with Quantum Mobile but is an issue of cif2cell (correct me if I'm wrong).
Could you please open an issue on https://github.com/torbjornbjorkman/cif2cell/issues , attaching your CIF (you may need to rename it to .txt extension) and copy-pasting your comment from here?

Ok, I opened a new issue:
torbjornbjorkman/cif2cell#20

Indeed, this turns out to be not the right illustration of a problem that nevertheless exists (there were other confirmed cases where the cif2cell update has issues).

Right, I was just referring to the problem mentioned by Martijn.

From my side, there are still a couple of points that I would like to figure out:

  1. Why is it necessary to reboot the VM in order to update cif2cell? (There should really be no need for this.)
    I will check this at some point by importing QM on a Windows host and repeating the commands by myself.
  2. (optional) Where does the "unstable behavior" of the VM come from - is it a problem of software incompatibility with the host operating system or is it just a symptom of actual hardware failure (to which the VM may be more sensitive than the host OS)? This may be difficult to figure out; perhaps @MartijnVandewiele1997 has more pointers on this.

Ok. So for the pw.x issue I am a bit lost... is the failure of pw.x expected because of the incomplete input file?
Is it expected that cif2cell produces an incomplete input file?

I will check this at some point by importing QM on a Windows host and repeating the commands by myself.

This has been observed by several people, this phenomenon is absolutely there.

No doubt! This will just be a way of figuring out whether I can reproduce it myself (and fix it if I can).

In any case, this is a great example of the nightmare that Quantum Mobile was created to avoid.

We are talking here about upgrading a single python package by running one command inside an otherwise controlled environment, and nevertheless people manage to end up with X different results.
It's almost comical :-D

Ok, so the cif2cell issue torbjornbjorkman/cif2cell#20 should then be closed?

I've finally had the time to import Quantum Mobile 20.06.1 on a Windows 10 host (10.0.18363) and try the pip install --upgrade cif2cell command on a fresh VM.

Here is the recording: https://asciinema.org/a/AkwC3XVY9IzCEMocA2LhLvSEV
As one can see, both the cif2cell upgrade and using cif2cell after the upgrade seem to work fine (no restart needed). Happy to test different inputs as well.

To confirm that it did indeed use the updated cif2cell (version 2.0.0a2):

$ pip show cif2cell
Name: cif2cell
Version: 2.0.0a2
Summary: Construct a unit cell from CIF data
Home-page: http://cif2cell.sourceforge.net/
Author: Torbjorn Bjorkman
Author-email: torbjornb@gmail.com
License: GNU General Public License version 3
Location: /home/max/.local/lib/python3.6/site-packages
Requires: PyCifRW, six
Required-by: 

I did notice one issue, though, that can be confusing: while pip and cif2cell are correctly installed using python 3, the default python version on Ubuntu 18.04 is still version 2.7.

$ pip --version
pip 20.1.1 from /usr/local/lib/python3.6/dist-packages/pip (python 3.6)
$ python --version
Python 2.7.17
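
To see at a glance which interpreter and which commands a given shell is actually using (for example with and without 'workon aiida'), a small diagnostic along these lines might help. It is a sketch under stated assumptions: the function name `describe_python_environment` is hypothetical, the script has to be run with a Python 3 interpreter, and it only reports what is found on the PATH; it does not explain the behaviour reported in this thread.

```python
import os
import shutil
import sys


def describe_python_environment(commands=("python", "pip", "cif2cell")):
    """Collect facts about the running interpreter and where commands resolve on PATH."""
    report = {
        # Interpreter that is executing this script.
        "interpreter": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # Set by virtualenv-style activation; None in the system environment.
        "virtualenv": os.environ.get("VIRTUAL_ENV"),
    }
    # First match on PATH for each command, or None if it is not found.
    for name in commands:
        report["which_" + name] = shutil.which(name)
    return report


if __name__ == "__main__":
    for key, value in describe_python_environment().items():
        print(f"{key}: {value}")
```

Comparing the output from two shells (or from two students' machines) shows whether `pip` and `cif2cell` resolve to the same locations in both.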

That goes back to the point made by @chrisjsewell concerning working in the system python environment - I do think, though, that moving everything to the aiida python environment is not the solution (e.g. there are conflicts when installing cif2cell into this environment).
I believe we are moving to the Ubuntu 20.04 base image for the next release, which resolves the problem of the outdated default python executable.

I’m still surprised that the same command solves the problem for some people and not for others. This goes against the philosophy of what a virtual machine should do. Somehow it can still ‘feel’ the bare metal computer it resides in.

@stefaan-cottenier I fully agree - running the exact same commands on a fresh VM installation must yield the same result, independent of the host OS.
Without seeing exactly how you tested, it is impossible for me to know what led to the behavior you observed.
My best guess would be that some command the students ran prior to the installation of cif2cell (and which some might have run in the aiida environment; others in the system python environment) may have led to a weird state of the python installation - from personal experience, python environments are very easy to mess up.

I may be wrong and perhaps there really is some deeper issue rooted in virtual box, but without a reproducible case there is nothing more we can do on the QM side.
I'm closing this issue for now; happy to reopen if there is something to test/improve.