pyiron/ironflow

Converting ironflow scripts into python/pyiron code and vice versa

Opened this issue · 2 comments

JNmpi commented

A super nice feature would be to convert the graphical workflow into python/pyiron code and vice versa. This topic has also come up a couple of times in our discussions. While I definitely support this idea and the many opportunities it opens, I also see a couple of fundamental challenges. Here are a few thoughts to start the discussion. To be more specific, let us consider the workflow that constructs a few simple Lammps jobs:

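from pyiron import Project
import numpy as np

pr = Project('test')

for i in np.arange(1, 5):  # indices 1, 2, 3, 4
    job = pr.create.job.Lammps(f'lammps_{i}')
    # cubic=True requests the cubic unit cell
    job.structure = pr.create.structure.bulk('Al', cubic=True).repeat(5)
    job.run()  # run every job, not only the last one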

Ideally, the pyiron object "job" would contain all the information given by the above workflow. This is, however, only partly true. While it "knows" the indices used [1,2,3,4] and the atomic structure, it knows nothing about how these parameters were generated. The relevant information, i.e. the for loop and the pr.create.structure command, is known to the Jupyter notebook but not to the object that pyiron stores. While the job object has all the information needed to reconstruct the project and the data, it has no access to the metadata. This is unfortunate, since the metadata reveals much more directly what the creator of the workflow had in mind. Even more importantly, the metadata provides the full information in a much more compressed form: rather than only getting a vector of integers, one gets the construction scheme, which needs only two input parameters (in the above example min=1 and max=5). The effect is even more dramatic for the structure, where three parameters (string, boolean, integer) are sufficient.

How could we expose the metadata to the pyiron job object? For the structure it would be straightforward: the returned object would contain not only the positions, cell, etc. but also the generating recipe. This would be easy to implement and would strongly enhance pyiron.
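To make the idea of a structure that carries its generating recipe concrete, here is a minimal sketch. It assumes a toy `Structure` class and a toy `bulk` function (both hypothetical stand-ins, not the pyiron classes) and stores the recipe as a source-like string:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Structure:
    """Toy stand-in for an atomic structure (hypothetical, not pyiron's Atoms)."""
    element: str
    cubic: bool
    n_repeat: int
    recipe: str  # source-like expression that regenerates this object

    def repeat(self, n: int) -> "Structure":
        # Return a new object and extend the recipe by the call just made.
        return Structure(
            self.element,
            self.cubic,
            self.n_repeat * n,
            f"{self.recipe}.repeat({n!r})",
        )


def bulk(element: str, cubic: bool = False) -> Structure:
    """Toy factory (hypothetical) that records its own call as the recipe."""
    return Structure(element, cubic, 1, f"bulk({element!r}, cubic={cubic!r})")


structure = bulk("Al", cubic=True).repeat(5)
print(structure.recipe)
```

In this sketch the three input parameters (string, boolean, integer) are all that is needed to regenerate the object, which is the compression argued for above. A real implementation would probably store the recipe in a structured form rather than as a string.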

What about native python or numpy objects? A possible solution would be to make them accessible as pyiron objects (very similar to the approach in ironflow, where we redefine constructs like a for loop, integer arrays, etc.). However, I am not sure how this could be converted into easy-to-read, intuitive code. A very rough first concept is shown below:

pr = Project('test')

# pr.numpy.arange is part of the proposed concept, not an existing pyiron interface
int_vec = pr.numpy.arange(min=1, max=5, steps=1)
for i in int_vec:
    job = pr.create.job.Lammps(f'lammps_{i.value}', i)
    job.counter = i  # set after the job exists
    job.structure = pr.create.structure.bulk('Al', cubic=True).repeat(i.value)
    job.run()

In the above example, i is not just an integer but a pyiron object that also contains its generator function. I am sure that this is not the best solution, but a concrete example may help to start the discussion.
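A value that remembers its generator could look roughly like the following. This is only a sketch under assumptions: `TrackedInt` and `TrackedRange` are hypothetical names, and `TrackedRange` merely stands in for the proposed `pr.numpy.arange`.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class TrackedInt:
    """Hypothetical integer wrapper that remembers which generator produced it."""
    value: int
    source: str  # expression for the generating sequence
    index: int   # position of this value within that sequence


class TrackedRange:
    """Hypothetical stand-in for the proposed pr.numpy.arange."""

    def __init__(self, start: int, stop: int, step: int = 1) -> None:
        self.start, self.stop, self.step = start, stop, step

    @property
    def recipe(self) -> str:
        return f"arange({self.start}, {self.stop}, {self.step})"

    def __iter__(self) -> Iterator[TrackedInt]:
        # Every yielded item carries the recipe of the whole sequence.
        for index, value in enumerate(range(self.start, self.stop, self.step)):
            yield TrackedInt(value, self.recipe, index)


for i in TrackedRange(1, 5):
    print(f"lammps_{i.value}", "<-", f"{i.source}[{i.index}]")
```

Each job name can then be traced back to the two input parameters of the range instead of to a bare integer, at the cost of having to write `i.value` wherever a plain number is expected.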

Ironflow->python/pyiron is absolutely going to be doable. I particularly like the idea in the framework of macros (#46), where we want users to be able to construct a flow, group it together as a macro, generate node-code for that macro, save it to a module, and import/register/use this macro elsewhere.
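As a rough illustration of the flow-to-code direction, the sketch below turns a small node graph into a Python script. The `flow` dictionary format and the function names `bulk`, `repeat` and `lammps` in the generated code are assumptions made for this example and do not reflect the actual ironflow or ryven data model.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical flow description: node name -> (callable name, inputs).
# An input that names another node is a connection; anything else is a literal.
flow = {
    "job": ("lammps", ["supercell"]),
    "supercell": ("repeat", ["structure", "5"]),
    "structure": ("bulk", ["'Al'"]),
}


def flow_to_code(flow):
    """Emit one assignment per node, ordered so that inputs are defined first."""
    deps = {
        name: [arg for arg in args if arg in flow]
        for name, (_, args) in flow.items()
    }
    lines = []
    for name in TopologicalSorter(deps).static_order():
        func, args = flow[name]
        lines.append(f"{name} = {func}({', '.join(args)})")
    return "\n".join(lines)


print(flow_to_code(flow))
```

Because every node already knows its inputs, ordering the nodes topologically is enough to obtain a linear script here. The reverse direction has no such explicit graph to start from.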

The reverse direction (python/pyiron->Ironflow) is extremely difficult. The direction you suggest above sounds promising, but if I understand correctly it would involve pyironizing/adding metadata to basically every object in the python environment. I would naively expect this to work, but I am terrified of how much work it would be to set up and maintain such a framework.

ryven author Leon Thomm has some thoughts on exporting ryven flows to code here